[PATCH] sockaddr.3type: BUGS: Document that libc should be fixed using a union

Alejandro Colomar alx.manpages@gmail.com
Mon Feb 6 11:55:12 GMT 2023


Hi Xi,

On 2/6/23 07:02, Xi Ruoyao wrote:
> On Sun, 2023-02-05 at 16:31 +0100, Alejandro Colomar via Libc-alpha wrote:
> 
>> The only correct way to use  different  types  in  an  API  is
>> through  a  union.
> 
> I don't think this statement is true (in general).  Technically we can
> write something like this:
> 
> struct sockaddr { ... };
> struct sockaddr_in { ... };
> struct sockaddr_in6 { ... };
> 
> int bind(int fd, const struct sockaddr *addr, socklen_t addrlen)
> {
>      if (addrlen < sizeof(struct sockaddr) {
>          errno = EINVAL;
>          return -1;
>      }
> 
>      /* cannot use "addr->sa_family" directly: it will be an UB */
>      sa_family_t sa_family;
>      memcpy(&sa_family, addr, sizeof(sa_family));
> 
>      switch (sa_family) {
>          case AF_INET:
>              return _do_bind_in(fd, (struct sockaddr_in *)addr, addrlen);
>          case AF_INET6:
>              return _do_bind_in6(fd, (struct sockaddr_in6 *)addr, addrlen);
>          /* more cases follow here */
>          default:
>              errno = EINVAL;
>              return -1;
>          }
>      }
> }
> 
> In this way we can use sockaddr_{in,in6,...} for bind() safely, as long
> as we can distinguish the "real" type of addr using the leading byte
> sequence (and the caller uses it carefully).

True; I hadn't thought of memcpy()ing the first member of the struct.  That's 
valid; overcomplicated but valid.

> 
> But obviously sockaddr_storage can't be distinguished here, so casting a
> struct sockaddr_stroage * to struct sockaddr * and passing it to bind()
> will still be wrong (unless we make sockaddr_storage an union or add
> [[gnu::may_alias]]).

But as you say, it still leaves us with a question.  What should one declare for 
passing to the standard APIs?  It can only be a union.  So we can either tell 
users to each create their own union, or we can make sockaddr_storage be a 
union.  The latter slightly violates POSIX due to namespaces as Rich noted, but 
that's a minor violation, and POSIX can be changed to accomodate for that.

If we change sockaddr_storage to be a union, we have two benefits:

-  Old code which uses sockaddr_storage is made conforming (non-UB) without 
modifying the source.
-  Users can inspect the structures.

If we don't, and deprecate sockaddr_storage, we should tell users to declare 
their own unions _and_ treat all these structures as black boxes which can only 
be read by memcpy()ing their contents.

Which of the two do we want?  I think fixing sockaddr_storage is simpler, and 
allows existing practice of reading these structures.  The other one just makes 
(or rather acknowledges, since it has always been like that) a lot of existing 
code invoke UB, and acknowledges that you can't safely use these structures 
without a lot of workarounding.

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://sourceware.org/pipermail/libc-alpha/attachments/20230206/d90ff414/attachment-0001.sig>


More information about the Libc-alpha mailing list