Problems with native Unix domain sockets on Win 10/2019

Michael McMahon michael.x.mcmahon@oracle.com
Mon Sep 28 11:03:13 GMT 2020



On 26/09/2020 08:30, Michael McMahon via Cygwin wrote:
> 
> 
> On 25/09/2020 21:30, Ken Brown wrote:
>> On 9/25/2020 2:50 PM, Ken Brown via Cygwin wrote:
>>> On 9/25/2020 10:29 AM, Michael McMahon wrote:
>>>>
>>>>
>>>> On 25/09/2020 14:19, Ken Brown wrote:
>>>>> On 9/24/2020 8:01 AM, Michael McMahon wrote:
>>>>>>
>>>>>>
>>>>>> On 24/09/2020 12:26, Ken Brown wrote:
>>>>>>> On 9/23/2020 7:25 AM, Michael McMahon via Cygwin wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I searched for related issues but haven't found anything.
>>>>>>>>
>>>>>>>> I am having some trouble with Windows native Unix domain sockets
>>>>>>>> (a recent feature in Windows 10 and 2019 server) and Cygwin.
>>>>>>>> I think I possibly know the cause since I had to investigate a
>>>>>>>> similar looking issue on another platform built on Windows.
>>>>>>>>
>>>>>>>> The problem is that cygwin commands don't seem to recognise native
>>>>>>>> Unix domain sockets correctly. For example, the socket "foo.sock"
>>>>>>>> should have the same ownership and similar permissions to other
>>>>>>>> files in the example below:
>>>>>>>>
>>>>>>>> $ ls -lrt
>>>>>>>> total 2181303
>>>>>>>>
>>>>>>>> -rw-r--r--  1 mimcmah      None             1259   Sep 23 10:22 test.c
>>>>>>>> -rwxr-xr-x  1 mimcmah      None             3680   Sep 23 10:22 test.obj
>>>>>>>> -rwxr-xr-x  1 mimcmah      None             121344 Sep 23 10:22 test.exe
>>>>>>>> -rw-r-----  1 Unknown+User Unknown+Group         0 Sep 23 10:23 foo.sock
>>>>>>>> -rw-r--r--  1 mimcmah      None             144356 Sep 23 10:27 check.ot
>>>>>>>>
>>>>>>>> A bigger problem is that foo.sock can't be deleted with the cygwin
>>>>>>>> "rm" command.
>>>>>>>>
>>>>>>>> $ rm -f foo.sock
>>>>>>>> rm: cannot remove 'foo.sock': Permission denied
>>>>>>>>
>>>>>>>> $ chmod 777 foo.sock
>>>>>>>> chmod: changing permissions of 'foo.sock': Permission denied
>>>>>>>>
>>>>>>>> $ cmd /c del foo.sock
>>>>>>>>
>>>>>>>> But, native Windows commands are okay, as the third example shows.
>>>>>>>>
>>>>>>>> I think the problem may relate to the way native Unix domain sockets
>>>>>>>> are implemented in Windows and the resulting special handling
>>>>>>>> required. They are implemented as NTFS reparse points and when
>>>>>>>> opening them with CreateFile, you need to specify the
>>>>>>>> FILE_FLAG_OPEN_REPARSE_POINT flag. Otherwise, you get an
>>>>>>>> ERROR_CANT_ACCESS_FILE. There are other complications
>>>>>>>> unfortunately, which I'd be happy to discuss further.
>>>>>>>>
>>>>>>>> But, to reproduce it, you can compile the attached code snippet
>>>>>>>> which creates foo.sock in the current directory. Obviously, this
>>>>>>>> only works on recent versions of Windows 10 and 2019 server.
>>>>>>>
>>>>>>> Cygwin doesn't currently support native Windows AF_UNIX sockets, 
>>>>>>> as you've discovered.  See
>>>>>>>
>>>>>>> https://cygwin.com/pipermail/cygwin/2020-June/245088.html
>>>>>>>
>>>>>>> for the current state of AF_UNIX sockets on Cygwin, including the 
>>>>>>> possibility of using native Windows AF_UNIX sockets on systems 
>>>>>>> that support them.
>>>>>>>
>>>>>>> If all you want is for Cygwin to recognize such sockets and allow 
>>>>>>> you to apply rm, chmod, etc., I don't think it would be hard to 
>>>>>>> add that capability.  But I doubt if that's all you want.
>>>>>>>
>>>>>>> Further discussion of this will have to wait until Corinna is 
>>>>>>> available.
>>>>>>>
>>>>>>
>>>>>> Thanks for the info. It's mainly about recognition of sockets for
>>>>>> regular commands. Since these objects can exist on Windows
>>>>>> filesystems now, potentially created by any kind of Windows
>>>>>> application, it would be great if Cygwin could handle them,
>>>>>> irrespective of whether the Cygwin development environment does.
>>>>>> Though that sounds like a good idea too.
>>>>>
>>>>> I think this has a simple fix (attached), but I can't easily test 
>>>>> it because your test program doesn't compile for me.  First, I got
>>>>>
>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>> native_unix_socket.c:5:10: fatal error: WS2tcpip.h: No such file or directory
>>>>>      5 | #include <WS2tcpip.h>
>>>>>        |          ^~~~~~~~~~~~
>>>>> compilation terminated.
>>>>>
>>>>> I fixed this by making the include file name lower case.  (My 
>>>>> system is case sensitive, so it matters.)
>>>>>
>>>>> Next:
>>>>>
>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>> native_unix_socket.c:8:10: fatal error: afunix.h: No such file or directory
>>>>>      8 | #include <afunix.h>
>>>>>        |          ^~~~~~~~~~
>>>>> compilation terminated.
>>>>>
>>>>> There's no file afunix.h in the Cygwin distribution, but I located 
>>>>> it online and pasted in the contents.  The program now compiles but 
>>>>> fails to link:
>>>>>
>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x3b): undefined reference to `__imp_WSAStartup'
>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x3b): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_WSAStartup'
>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: /tmp/cc74urPr.o:native_unix_socket.c:(.text+0xf2): undefined reference to `__imp_WSAGetLastError'
>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0xf2): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_WSAGetLastError'
>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x13d): undefined reference to `__imp_WSAGetLastError'
>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x13d): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_WSAGetLastError'
>>>>> collect2: error: ld returned 1 exit status
>>>>>
>>>>> This is probably easy to fix too, but I don't feel like tracking it 
>>>>> down. Please send compilation instructions (that use Cygwin tools).
>>>>>
>>>>> Ken
>>>>
>>>> Hi
>>>>
>>>> Sorry, I had compiled it in a native Visual C environment.
>>>>
>>>> Assuming you have afunix.h in the current directory,
>>>>
>>>> gcc -o native_unix_socket -I. native_unix_socket.c -lws2_32
>>>>
>>>> should do it.
>>>
>>> Thanks, that works.  But now I can't reproduce your problem.  Here's 
>>> what I see, using Cygwin 3.1.7 without applying my patch:
>>>
>>> $ ./native_unix_socket.exe
>>> getsockname works
>>> fam = 1, len = 11
>>> offsetof clen = 9
>>> strlen = 8
>>> name = foo.sock
>>>
>>> $ ls -l foo.sock
>>> -rwxr-xr-x 1 kbrown None 0 2020-09-25 14:39 foo.sock*
>>>
>>> $ chmod 644 foo.sock
>>>
>>> $ ls -l foo.sock
>>> -rw-r--r-- 1 kbrown None 0 2020-09-25 14:39 foo.sock
>>>
>>> $ rm foo.sock
>>>
>>> $ ls -l foo.sock
>>> ls: cannot access 'foo.sock': No such file or directory
>>>
>>> I'm running 64-bit Cygwin on Windows 10 1909.
>>
>> I just ran the 'rm' command under gdb to see what's going on, and it 
>> seems that foo.sock is not being recognized as a reparse point.  So 
>> maybe your test program, when compiled and run under Cygwin, doesn't 
>> actually produce a native Windows AF_UNIX socket.  And when I try to 
>> run it in a Windows Command Prompt, I get
>>
>> bind failed 10050
>> getsockname failed 10022
>>
>> Can you make your version of the test executable available for me to 
>> try?  Or tell me some other way to create a native Windows AF_UNIX 
>> socket?
>>
>> Ken
> 
> That is all very strange. I have checked both the gcc compiled and MS
> compiled executables on my system (2019 server) and they are both
> definitely producing native AF_UNIX sockets.
> 
> I can email you the two exe files. They are both quite small. But, first
> I want to check the patch status of my test system.
> 

It turns out that this issue only happens on some of our test
systems. It does not reproduce on the personal copy of Windows 10 on
my laptop either.

I also noticed that some native Windows commands (e.g. 'attrib' or
'fsutil') don't work properly on the affected systems, though 'fsutil'
can still be used to verify that the reparse point is created correctly.
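
A minimal sketch of how the same reparse-point check might be done
programmatically, for anyone who wants to test a file without 'fsutil'.
It assumes the reparse tag can be read from the dwReserved0 field that
FindFirstFileA fills in, and that the socket file carries the tag
IO_REPARSE_TAG_AF_UNIX (0x80000023 in the Windows SDK headers). The
helper names is_af_unix_reparse_point and path_is_af_unix_socket are
hypothetical, and whether the lookup succeeds on the affected systems is
untested.

```c
#include <assert.h>

#if defined(_WIN32) || defined(__CYGWIN__)
#include <windows.h>
#endif

/* Values as published in the Windows SDK headers. They are defined here
   only when the toolchain's headers lack them, or when the file is
   compiled on a non-Windows system to exercise the pure check below. */
#ifndef FILE_ATTRIBUTE_REPARSE_POINT
#define FILE_ATTRIBUTE_REPARSE_POINT 0x00000400UL
#endif
#ifndef IO_REPARSE_TAG_AF_UNIX
#define IO_REPARSE_TAG_AF_UNIX 0x80000023UL
#endif

/* Hypothetical helper: returns 1 when the attribute/tag pair describes
   a native AF_UNIX socket file, 0 otherwise. */
static int is_af_unix_reparse_point(unsigned long attrs, unsigned long tag)
{
    return (attrs & FILE_ATTRIBUTE_REPARSE_POINT) != 0
        && tag == (unsigned long)IO_REPARSE_TAG_AF_UNIX;
}

#if defined(_WIN32) || defined(__CYGWIN__)
/* Hypothetical helper: returns 1 if 'path' is a native AF_UNIX socket,
   0 if it is something else, and -1 if the lookup itself fails. */
static int path_is_af_unix_socket(const char *path)
{
    WIN32_FIND_DATAA fd;
    HANDLE h;

    assert(path != NULL);
    h = FindFirstFileA(path, &fd);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    FindClose(h);
    /* dwReserved0 carries the reparse tag when the reparse-point
       attribute is set in dwFileAttributes. */
    return is_af_unix_reparse_point(fd.dwFileAttributes, fd.dwReserved0);
}
#endif
```

Calling path_is_af_unix_socket("foo.sock") after running the test
program would then be expected to return 1 if the socket was created as
a reparse point. The pure check is kept separate so that it can be
compiled and exercised on any system.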

Possibly, this was a Windows bug that has since been fixed. It never
made sense that you had to open socket files using the
FILE_FLAG_OPEN_REPARSE_POINT flag, because you would have to know in
advance that the file is a socket to be able to do this (or else be
prepared to open the file twice). But I don't yet fully understand why
some systems are affected and others are not. All seem to be patched
up to date.
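
For reference, a minimal sketch of the "open the file twice" workaround,
assuming the first plain open fails with ERROR_CANT_ACCESS_FILE on an
affected system as described earlier in the thread. The helper names
should_retry_as_reparse_point and open_possible_socket are hypothetical,
and the sharing mode and access rights are illustrative choices only.

```c
#include <assert.h>

#if defined(_WIN32) || defined(__CYGWIN__)
#include <windows.h>
#endif

/* Value from winerror.h, defined here only when the Windows headers are
   not available so that the decision helper compiles anywhere. */
#ifndef ERROR_CANT_ACCESS_FILE
#define ERROR_CANT_ACCESS_FILE 1920L
#endif

/* Hypothetical helper: decide whether a failed plain open is worth
   retrying with FILE_FLAG_OPEN_REPARSE_POINT. */
static int should_retry_as_reparse_point(unsigned long last_error)
{
    return last_error == (unsigned long)ERROR_CANT_ACCESS_FILE;
}

#if defined(_WIN32) || defined(__CYGWIN__)
/* Hypothetical helper: open 'path' normally first and, only if that
   fails with ERROR_CANT_ACCESS_FILE, open it again as a reparse point.
   Returns INVALID_HANDLE_VALUE if both attempts fail. */
static HANDLE open_possible_socket(const char *path, DWORD access)
{
    const DWORD share = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE;
    HANDLE h;

    assert(path != NULL);

    /* First attempt: no knowledge that the file might be a socket. */
    h = CreateFileA(path, access, share, NULL, OPEN_EXISTING,
                    FILE_ATTRIBUTE_NORMAL, NULL);

    /* Second attempt: open the reparse point itself. */
    if (h == INVALID_HANDLE_VALUE
        && should_retry_as_reparse_point(GetLastError()))
        h = CreateFileA(path, access, share, NULL, OPEN_EXISTING,
                        FILE_FLAG_OPEN_REPARSE_POINT, NULL);

    return h;
}
#endif
```

A caller would use open_possible_socket("foo.sock", DELETE) in place of
a direct CreateFileA call and close the handle with CloseHandle when
done. On systems where the plain open already succeeds, the second
attempt is never made.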

In any case, I think it's clear this isn't a Cygwin issue.
So, apologies for the noise, and thanks for the assistance!

Regards,
Michael.

