The unreliability of AF_UNIX datagram sockets

Ken Brown kbrown@cornell.edu
Tue Apr 27 15:47:34 GMT 2021


This is a follow-up to

  https://cygwin.com/pipermail/cygwin/2021-April/248383.html

I'm attaching a test case slightly simpler than the one posted by the OP in that 
thread.  This is a client/server scenario, with non-blocking AF_UNIX datagram 
sockets.  The client writes COUNT messages while the server is playing with his 
toes.  Then the server reads the messages.

If COUNT is too big, the expectation is that the client's sendto call will 
eventually return EAGAIN.  This is what happens on Linux.  On Cygwin, however, 
there is never a sendto error; the program ends when recv fails with EAGAIN, 
indicating that some messages were dropped.
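For illustration, the expectation described above can be sketched as a check that no datagram accepted by the sender is lost. This is a minimal sketch, not part of the attached test case: the helper name lost_datagrams and the constant MAX_TRIES are hypothetical, and it uses socketpair instead of a bound socket path to keep it short.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_TRIES (64 * 1024)

/* Hypothetical helper.  Queues datagrams on one end of a non-blocking
   AF_UNIX socketpair until send fails with EAGAIN or MAX_TRIES is
   reached, then drains the other end.  Returns the number of datagrams
   that send accepted but recv never delivered, or -1 on an unexpected
   error.  */
int
lost_datagrams (void)
{
  int sv[2];
  int sent = 0, received = 0;

  if (socketpair (AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, sv) < 0)
    return -1;

  for (int i = 0; i < MAX_TRIES; i++)
    {
      if (send (sv[0], &i, sizeof i, 0) != sizeof i)
        {
          /* A full queue should surface as EAGAIN (backpressure).  */
          if (errno != EAGAIN && errno != EWOULDBLOCK)
            goto fail;
          break;
        }
      sent++;
    }

  for (;;)
    {
      int j = -1;
      ssize_t nr = recv (sv[1], &j, sizeof j, 0);

      if (nr < 0)
        {
          if (errno != EAGAIN && errno != EWOULDBLOCK)
            goto fail;
          break;                /* Queue drained.  */
        }
      if (nr != sizeof j)
        goto fail;              /* Datagrams must arrive whole.  */
      received++;
    }

  close (sv[0]);
  close (sv[1]);
  assert (received <= sent);
  return sent - received;

fail:
  close (sv[0]);
  close (sv[1]);
  return -1;
}
```

With reliable datagram semantics, send stops with EAGAIN once the queue is full and the helper returns 0. A positive return value means that some datagrams were accepted and then dropped.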

I think what's happening is that WSASendTo is silently dropping messages without 
returning an error.  I guess this is acceptable for Winsock because of the 
documented unreliability of AF_INET datagram sockets, on top of which Cygwin 
currently implements its AF_UNIX sockets.  But AF_UNIX datagram sockets are 
supposed to be reliable.

I can't think of anything that Cygwin can do about this (but I would love to be 
proven wrong).  My real reason for raising the issue is that, as we recently 
discussed in a different thread, maybe it's time for Cygwin to start using 
native Windows AF_UNIX sockets.  But then we would still have to come up with 
our own implementation of AF_UNIX datagram sockets, and it seems that we can't 
simply use the current implementation.  AFAICT, Mark's suggestion of using 
message queues is the best idea so far.
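To make the message-queue idea concrete, here is a minimal sketch, assuming that "message queues" means POSIX message queues (mq_open and friends). It is not Mark's actual proposal or any Cygwin code; the helper name mq_lost_messages, the queue name, and MQ_DEPTH are hypothetical. The point is only that a full non-blocking queue rejects mq_send with EAGAIN instead of dropping the message.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#define MQ_DEPTH 8              /* Hypothetical queue depth.  */

/* Hypothetical helper.  Fills a non-blocking POSIX message queue until
   mq_send fails with EAGAIN, then drains it.  Returns the number of
   messages accepted but never delivered, or -1 on an unexpected
   error.  */
int
mq_lost_messages (void)
{
  struct mq_attr attr = { 0 };
  char name[64];
  int sent = 0, received = 0;
  mqd_t q;

  attr.mq_maxmsg = MQ_DEPTH;
  attr.mq_msgsize = sizeof (int);
  snprintf (name, sizeof name, "/dgram_sketch_%ld", (long) getpid ());
  mq_unlink (name);             /* Remove a stale queue, if any.  */
  q = mq_open (name, O_CREAT | O_EXCL | O_RDWR | O_NONBLOCK, 0600, &attr);
  if (q == (mqd_t) -1)
    return -1;
  mq_unlink (name);             /* The open descriptor keeps it alive.  */

  for (int i = 0; i < 4 * MQ_DEPTH; i++)
    {
      if (mq_send (q, (const char *) &i, sizeof i, 0) < 0)
        {
          if (errno != EAGAIN)
            {
              mq_close (q);
              return -1;
            }
          break;                /* Queue full: the sender is told so.  */
        }
      sent++;
    }

  for (;;)
    {
      int j = -1;

      if (mq_receive (q, (char *) &j, sizeof j, NULL) < 0)
        {
          if (errno != EAGAIN)
            {
              mq_close (q);
              return -1;
            }
          break;                /* Queue drained.  */
        }
      received++;
    }

  mq_close (q);
  assert (received <= sent);
  return sent - received;
}
```

On older glibc versions this needs to be linked with -lrt.  Every message that mq_send accepts is later returned by mq_receive, so the helper is expected to return 0.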

I'm willing to start working on the switch to native AF_UNIX sockets.  (I'm 
frankly getting bored with working on the pipe implementation, and this doesn't 
really seem like it has much of a future.)  But I'd like to be confident that 
there's a good solution to the datagram problem before I invest too much time in 
this.

Ken

-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <errno.h>
#include <unistd.h>

#define SOCK_PATH "/tmp/mysocket"

int sfd;	/* Server socket: created in server (), read in main ().  */

/* Create and bind the non-blocking server socket.  Return 0 on success,
   -1 on failure.  */
int
server (void)
{
  struct sockaddr_un un;

  if (unlink (SOCK_PATH) < 0 && errno != ENOENT)
    {
      printf ("unlink: %d <%s>\n", errno, strerror (errno));
      return -1;
    }
  sfd = socket (AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0);
  if (sfd < 0)
    {
      printf ("SRV socket: %d <%s>\n", errno, strerror (errno));
      return -1;
    }
  memset (&un, 0, sizeof un);
  un.sun_family = AF_UNIX;
  strcpy (un.sun_path, SOCK_PATH);
  if (bind (sfd, (const struct sockaddr *) &un, sizeof un) < 0)
    {
      printf ("SRV bind: %d <%s>\n", errno, strerror (errno));
      return -1;
    }
  return 0;
}

int
main (void)
{
  int fd;
  struct sockaddr_un un;

  fd = socket (AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0);
  if (fd < 0)
    {
      printf ("socket: %d <%s>\n", errno, strerror (errno));
      return 1;
    }

  if (server ())
    return 2;

  memset (&un, 0, sizeof un);
  un.sun_family = AF_UNIX;
  strcpy (un.sun_path, SOCK_PATH);

#define COUNT (64 * 1024)
  /* Client: queue COUNT datagrams before the server reads anything.  */
  for (int i = 0; i < COUNT; i++)
    {
      if (sendto (fd, &i, sizeof i, 0, (struct sockaddr *) &un, sizeof un)
	  != sizeof i)
	{
	  printf ("sendto: %d <%s>, i = %d\n", errno, strerror (errno), i);
	  return 3;
	}
    }
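  /* Server: read the datagrams back.  If none were dropped, the i-th
     datagram carries the value i.  */
  for (int i = 0; i < COUNT; i++)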
    {
      int j = -1;
      ssize_t nr = recv (sfd, &j, sizeof j, 0);

      if (nr < 0)
	{
	  printf ("recv: %d <%s>, i = %d\n", errno, strerror (errno), i);
	  return 4;
	}
      if (nr != sizeof j)
	{
	  printf ("partial read, i = %d\n", i);
	  return 5;
	}
      if (i != j)
	printf ("i = %d, j = %d\n", i, j);
    }
}

