[PATCH] sockaddr.3type: BUGS: Document that libc should be fixed using a union

Zack Weinberg zack@owlfolio.org
Mon Feb 6 17:21:03 GMT 2023


On Mon, Feb 6, 2023, at 9:11 AM, Alejandro Colomar via Libc-alpha wrote:
> On 2/6/23 14:38, Rich Felker wrote:
>> There is absolutely not any need to declare the union for application
>> code calling the socket APIs. You declare whatever type you will be
>> using. For binding or connecting a unix socket, sockaddr_un. For IPv6,
>> sockaddr_in6. Etc. Then you cast the pointer to struct sockaddr * and
>> pass it to the function.
>
> Except that you may be using generic code that may use any of AF_UNIX,
> AF_INET, and AF_INET6.  A web server may very well use all three.

I have personally tripped over systems where struct sockaddr_un was _bigger_
than struct sockaddr_storage.  Also, AFAIK modern kernels (not just Linux)
do not actually impose a 108-byte (or whatever) limit on the length of
sun_path; application code can treat the structure definition as being

   struct sockaddr_un {
       sa_family_t sun_family;
       char sun_path[];  // C99 flexible array member
   };

as long as the `addrlen` parameter to whatever system call you're using
accurately reflects the size of the address object you passed in.  Kind
of the same as how you can make your own bigger fd_set to call select()
with, if you want.  Point being, even if sockaddr_storage is bigger than
the _default_ sockaddr_un, that still might not be big enough.
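
A minimal sketch of what "accurately reflects the size" looks like for
the ordinary case, where the path still fits in the default structure.
The helper name fill_unix_addr is hypothetical, and the sketch does not
assume anything about whether a given kernel accepts longer paths:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill *sa from path and return the addrlen to
   pass to bind()/connect(), or 0 if path does not fit in sun_path. */
static socklen_t fill_unix_addr(struct sockaddr_un *sa, const char *path)
{
    size_t len = strlen(path);

    assert(sa != NULL);
    if (len >= sizeof sa->sun_path)
        return 0;               /* no room for the path plus its NUL */

    memset(sa, 0, sizeof *sa);
    sa->sun_family = AF_UNIX;
    memcpy(sa->sun_path, path, len + 1);

    /* Header bytes plus the path and its NUL, rather than sizeof *sa. */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + len + 1);
}
```

The returned value is what would go in the `addrlen` argument, e.g.
bind(fd, (struct sockaddr *)&sa, len).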

I'd also like to point out that none of these structures can change size
without breaking ABI compatibility.  In particular, namespace issues
aside, glibc _cannot_ make the definition of struct sockaddr be either

    struct sockaddr {
        sa_family_t sa_family;
    };

or

    struct sockaddr {
        union {
            // ...
            struct sockaddr_in6 in6;
        };
    };

because a variable declaration `struct sockaddr sa;` must allocate 16
bytes of space -- no more and no less.
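
For anyone who wants to check the sizes on their own system, here is a
small sketch.  It assumes a C11 compiler and an ABI where struct
sockaddr is 16 bytes, as described above; print_sockaddr_sizes is a
hypothetical name, and the other sizes are printed rather than assumed:

```c
#include <assert.h>
#include <stdio.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Assumption: the platform ABI fixes struct sockaddr at 16 bytes. */
static_assert(sizeof(struct sockaddr) == 16,
              "struct sockaddr is expected to be 16 bytes");

/* Print the sizes of the common address structures. */
static void print_sockaddr_sizes(void)
{
    printf("sockaddr         %zu\n", sizeof(struct sockaddr));
    printf("sockaddr_in      %zu\n", sizeof(struct sockaddr_in));
    printf("sockaddr_in6     %zu\n", sizeof(struct sockaddr_in6));
    printf("sockaddr_un      %zu\n", sizeof(struct sockaddr_un));
    printf("sockaddr_storage %zu\n", sizeof(struct sockaddr_storage));
}
```

If the static_assert fires, the 16-byte assumption does not hold on
that platform and the rest of the argument needs adjusting there.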

> However, there are some APIs that require you to allocate an object.  For
> example recvfrom(2).  How would you recommend using recvfrom(2)?

Well, most address families have fixed-size addresses: if you called
socket(AF_INET6, SOCK_DGRAM, 0) then you know recvfrom on that socket
needs enough space for struct sockaddr_in6.  If you receive a socket
descriptor as an argument and you don't know its address family, you
can use `getsockopt(sock, SOL_SOCKET, SO_DOMAIN)` to look it up.

This won't work for AF_UNIX, though.  recvfrom() _will_ tell you if
you didn't give it enough space (by updating `addrlen` to a bigger
number than you passed in), but it consumes the datagram, so you can't
just call it again with a bigger buffer; you _could_ call getpeername()
instead.  In principle
you could have a connectionless (SOCK_DGRAM) AF_UNIX socket and then
getpeername() wouldn't work, but does anyone actually _want_ the
peer address when serving from an AF_UNIX socket, as opposed to
the SCM_CREDENTIALS ancillary message or the SO_PEERCRED sockopt query?
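
To illustrate the `addrlen` check, here is a sketch that receives one
datagram on an AF_UNIX socket while offering only a 16-byte struct
sockaddr for the sender's address.  All function names and the /tmp
paths are made up for the example, and it assumes a system that reports
the untruncated length the way the Linux recvfrom(2) page describes:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Receive one datagram; set *truncated to 1 if the sender's address
   needed more than the addr_cap bytes offered, else 0. */
static ssize_t recv_check_addr(int sock, void *buf, size_t buflen,
                               struct sockaddr *addr, socklen_t addr_cap,
                               int *truncated)
{
    socklen_t addrlen = addr_cap;
    ssize_t n = recvfrom(sock, buf, buflen, 0, addr, &addrlen);

    assert(truncated != NULL);
    *truncated = (n >= 0 && addrlen > addr_cap);
    return n;
}

/* Fill *sa with path; returns 0, or -1 if path is too long. */
static int unix_addr(struct sockaddr_un *sa, const char *path)
{
    if (strlen(path) >= sizeof sa->sun_path)
        return -1;
    memset(sa, 0, sizeof *sa);
    sa->sun_family = AF_UNIX;
    strcpy(sa->sun_path, path);
    return 0;
}

/* Create an AF_UNIX datagram socket bound to path; returns fd or -1. */
static int bound_dgram(const char *path)
{
    struct sockaddr_un sa;
    int fd;

    if (unix_addr(&sa, path) < 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd >= 0 && bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0) {
        close(fd);
        fd = -1;
    }
    return fd;
}

/* Returns 1 if the sender's address was truncated, 0 if it fit,
   -1 if the demo could not be set up. */
static int demo_truncation(void)
{
    char dir[] = "/tmp/sockdemoXXXXXX", rx[64], tx[64], byte = 'x';
    struct sockaddr_un to;
    struct sockaddr small;          /* deliberately too small */
    int rfd, tfd, truncated = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(rx, sizeof rx, "%s/rx", dir);
    snprintf(tx, sizeof tx, "%s/tx", dir);
    rfd = bound_dgram(rx);
    tfd = bound_dgram(tx);

    if (rfd >= 0 && tfd >= 0 && unix_addr(&to, rx) == 0 &&
        sendto(tfd, &byte, 1, 0, (struct sockaddr *)&to, sizeof to) == 1 &&
        recv_check_addr(rfd, &byte, 1, &small, sizeof small, &truncated) < 0)
        truncated = -1;

    if (rfd >= 0) close(rfd);
    if (tfd >= 0) close(tfd);
    unlink(rx);
    unlink(tx);
    rmdir(dir);
    return truncated;
}
```

By the time the truncation is detected the datagram is already gone
from the queue, which is why retrying recvfrom() is not an option.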

zw

