AF_UNIX/SOCK_DGRAM is dropping messages
Tue Apr 6 15:24:53 GMT 2021
On 4/6/2021 10:50 AM, firstname.lastname@example.org wrote:
>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
>>>>>>>> drop messages or at least they are not received in the same order
>>>>>>>> they are sent
>>>>> Thanks for the test case. I can confirm the problem. I'm not
>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>> this easily. I'd rather spend my time on the new implementation (on
>>>>> the topic/af_unix branch). It turns out that your test case fails
>>>>> there too, but in a completely different way, due to a bug in sendto
>>>>> for datagrams. I'll see if I can fix that bug and then try again.
>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>> is verified
>>>> I finally succeeded in building topic/af_unix (after finding out what
>>>> version of zlib was needed), but without adding -D__WITH_AF_UNIX to
>>>> CXXFLAGS, and thus I haven’t tested it yet
>>>> Is it sufficient to add the define to the "main" Makefile or do you
>>>> have to add it to all the Makefiles? I guess I can find out though
>>> I do it on the configure line, like this:
>>> ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
>>>> Is topic/af_unix fairly up to date with the master branch?
>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>> I'll do that again right now.
>>>> Either way, I'll be glad to help out testing topic/af_unix
>> I've now pushed a fix for that sendto bug, and your test case runs without
>> error on the topic/af_unix branch.
> It seems like the test case does work now with topic/af_unix in blocking mode, but when using non-blocking mode (with MSG_DONTWAIT) there are some issues, I think:
> 1. When the queue is empty with non-blocking recv(), errno is set to EPIPE but I think it should be EAGAIN (or maybe the pipe is getting broken for real for some reason?)
> 2. When using non-blocking recv() and no message is written at all, it seems like recv() blocks forever
> 3. Using non-blocking recv() where the "client" sends fewer than "count" messages, sometimes recv() blocks forever (as well)
> My naïve analysis of this is that for the first issue (if any) the wrong errno is set, and for the second issue it blocks if no sendto() is done after the first recv(), i.e. nothing kicks the "reader thread" in the butt to realise the queue is empty. It is not super clear though what POSIX says about creating blocking descriptors and then using non-blocking flags with recv(), but this works in Linux anyway
> Let me know if I should provide a more specific explanation, but I think minor modifications of the test case can provoke all behaviours. I think 2 and 3 have the same cause though (as described above)
Thanks, I'll take a look.
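
For reference, the behaviour the report expects can be sketched as below. This is only an illustration, not the reporter's test case: it assumes a socketpair() instead of bound sockets, and the helper names poll_datagram() and check_nonblocking_recv() are made up here. On Linux, recv() with MSG_DONTWAIT on an empty queue fails with EAGAIN (or EWOULDBLOCK) rather than EPIPE, and it never blocks.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of one non-blocking receive attempt (hypothetical names). */
enum dgram_poll { DGRAM_GOT_MESSAGE, DGRAM_QUEUE_EMPTY, DGRAM_ERROR };

/* Try to read one datagram without blocking. The descriptor itself stays
 * in blocking mode; only this call is made non-blocking via MSG_DONTWAIT. */
enum dgram_poll poll_datagram(int fd, char *buf, size_t cap, ssize_t *len)
{
    assert(buf != NULL && len != NULL);
    *len = recv(fd, buf, cap, MSG_DONTWAIT);
    if (*len >= 0)
        return DGRAM_GOT_MESSAGE;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return DGRAM_QUEUE_EMPTY;   /* the errno the report expects */
    return DGRAM_ERROR;             /* e.g. EPIPE, as seen in issue 1 */
}

/* Returns 0 if the socket behaves as described above, -1 otherwise. */
int check_nonblocking_recv(void)
{
    int sv[2];
    char buf[64];
    ssize_t n;
    int ok = 0;

    /* Assumption: a connected datagram pair stands in for bound sockets. */
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    /* Nothing has been sent yet: must report an empty queue, not block. */
    if (poll_datagram(sv[0], buf, sizeof buf, &n) != DGRAM_QUEUE_EMPTY)
        ok = -1;

    /* One message queued: must be delivered intact. */
    if (send(sv[1], "ping", 4, 0) != 4)
        ok = -1;
    if (poll_datagram(sv[0], buf, sizeof buf, &n) != DGRAM_GOT_MESSAGE ||
        n != 4 || memcmp(buf, "ping", 4) != 0)
        ok = -1;

    /* Queue drained again: back to the empty-queue result. */
    if (poll_datagram(sv[0], buf, sizeof buf, &n) != DGRAM_QUEUE_EMPTY)
        ok = -1;

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Calling check_nonblocking_recv() from a small main() and checking for a 0 return is enough to compare platforms; a hang or a -1 would point at issues 1 to 3 above.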
>> By the way, I think the implementation of sendto/recv for datagrams is very
>> inefficient when there are repeated calls to sendto as in your test case.
>> Nevertheless, your test case actually runs slightly faster on the topic/af_unix
>> branch than it does on master (when the latter succeeds, which it does about
>> half the time for me). So I'm not sure whether it's worth worrying about this.
> Of course we would like the best throughput possible 😉
>> Here's the issue, briefly. The communication is done via a Windows named pipe.
>> The receiver creates the pipe when it creates and binds its socket. It creates
>> only one pipe instance. The sender connects to the pipe, writes, and closes its
>> handle. But the pipe is not available for another sender to connect to until the
>> receiver reads the message, after which it disconnects the sender.
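
The serialisation described above can be pictured with a toy model. This is plain C with no Windows calls and is not the Cygwin implementation; struct toy_pipe and the toy_* functions are hypothetical names, and the model assumes only what the description says: one pipe instance, which stays unavailable to other senders until the receiver has read the pending message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MSG_MAX 64

/* A single pipe instance: either free, or holding one unread message. */
struct toy_pipe {
    bool busy;                /* true from sender connect until receiver read */
    char msg[TOY_MSG_MAX];
    size_t len;
};

void toy_pipe_init(struct toy_pipe *p)
{
    assert(p != NULL);
    p->busy = false;
    p->len = 0;
}

/* Sender side: connect, write, close. Returns false if the only instance
 * is still occupied by an earlier, unread message (or the message is too
 * large for this toy buffer). */
bool toy_send(struct toy_pipe *p, const char *data, size_t len)
{
    assert(p != NULL && data != NULL);
    if (p->busy || len > TOY_MSG_MAX)
        return false;
    memcpy(p->msg, data, len);
    p->len = len;
    p->busy = true;
    return true;
}

/* Receiver side: read the message, then disconnect so that the instance
 * becomes available again. Returns the number of bytes copied, or 0 if
 * nothing is pending (this toy cannot distinguish empty datagrams). */
size_t toy_recv(struct toy_pipe *p, char *out, size_t cap)
{
    assert(p != NULL && out != NULL);
    if (!p->busy)
        return 0;
    size_t n = p->len < cap ? p->len : cap;
    memcpy(out, p->msg, n);
    p->busy = false;
    p->len = 0;
    return n;
}
```

In this model a second toy_send() fails until toy_recv() has run, which is why repeated sendto() calls end up taking turns with the receiver instead of queueing.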
> Ok, in our application we will use long-lived descriptors and multiple writers that possibly send large business messages (chunked into some smaller pieces per sendto()/recv())
> Best regards,