name pipe problem: 1 writer, multiple concurrent readers

Ken Brown kbrown@cornell.edu
Thu Jul 16 19:26:03 GMT 2020


On 7/3/2020 7:09 AM, Ken Brown via Cygwin wrote:
> On 7/2/2020 1:50 PM, Morten Kjærulff via Cygwin wrote:
>> I think we got a new release around the beginning of June, right?
>> You said that there were still issues (I can confirm).
>> If it can help, here is the output I see today of above scripts:
>>
>> $ ./tp.sh
> [...]
>>        0 [fifo_reader] diff 1806 C:\cygwin\bin\diff.exe: *** fatal
>> error - Can't update my handlers, Win32 error 87
> 
> Thanks for the report and the simple test case.  Obviously I still have more 
> work to do on this.

Hi Morten,

I've attempted to fix the bugs (see 
https://cygwin.com/pipermail/cygwin-patches/2020q3/010380.html).  With these 
patches installed, I no longer see a fatal error or hanging diff processes.  But 
your script still doesn't work as you expect.  On a typical run of the parallel 
part,  6 or 7 of the 10 diff processes see the FIFO t.pip as empty.

Here's a sample run under strace, so that I could see what was going on.  7 of 
the 10 diff processes saw t.pip as empty on this run.

$ strace -o tpip.sh.strace sh -c ./tpip.sh
       PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
      1307    1306    1307      10932  pty1      197609 17:50:12 /usr/bin/bash
     18426       1   18426       9360  cons0     197609 06:47:31 /usr/bin/sh
     18429   18426   18426      13900  cons0     197609 06:47:32 /usr/bin/ps
      1306       1    1306       3768  ?         197609 17:50:11 /usr/bin/mintty
     18424    1307   18424      21840  pty1      197609 06:47:31 /usr/bin/strace
result1 start
10
0
0
0
0
0
0
0
0
0
0
result1 end
0a1,2
 > line1
 > line2
0a1,2
 > line1
 > line2
0a1,2
 > line1
 > line2
0a1,2
 > line1
 > line2
0a1,2
 > line1
 > line2
0a1,2
0a1,2
 > line1
 > line1
 > line2
 > line2
result2 start
10
0
1
1
0
1
1
1
1
1
0
result2 end
       PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
     18480   18430   18426      15580  cons0     197609 06:47:33 /usr/bin/cp
      1307    1306    1307      10932  pty1      197609 17:50:12 /usr/bin/bash
     18484   18426   18426      21264  cons0     197609 06:47:44 /usr/bin/ps
     18430   18426   18426      23472  cons0     197609 06:47:32 /usr/bin/sh
     18426       1   18426       9360  cons0     197609 06:47:31 /usr/bin/sh
      1306       1    1306       3768  ?         197609 17:50:11 /usr/bin/mintty
     18424    1307   18424      21840  pty1      197609 06:47:31 /usr/bin/strace

I'm attaching your script for ease of reference, and I'm attaching an excerpt 
from the strace output, to which I've added a few comments.  The excerpt shows 
all open, close, read, and write system calls involving t.pip.

Here's a summary of what you can see from those system calls in the parallel 
part of the script.  In what follows, I've called the diff processes diff-1, 
diff-2,..., diff-10, and similarly for the cp processes (although there are only 
four of them).

1. cp-1 tries to open t.pip for writing and blocks.  It unblocks when diff-1 
opens t.pip for reading, and both processes run to completion as expected.

2. diff-2, diff-3, diff-4, and diff-5 try to open t.pip for reading, and they 
block until cp-2 opens it for writing.  Then cp-2 writes 12 bytes to t.pip and 
closes it, and the four diff processes all try to read.  diff-4 gets there first 
and reads the 12 bytes; it reads once more and sees EOF because there is no data 
available in the pipe and there are no writers open[1], so it considers those 12 
bytes to constitute the file t.pip.  It later exits with success.

diff-2, diff-3, and diff-5 all complete their reads before cp-3 opens t.pip. 
They see EOF for the same reason as above, so t.pip appears empty and they exit 
with failure.

3. diff-6, diff-7, diff-8, diff-9, and diff-10 try to open t.pip for reading, 
and they block until cp-3 opens it for writing.  Then cp-3 writes 12 bytes to 
t.pip and closes it, and the five diff processes all try to read.  diff-10 gets 
there first and reads 12 bytes followed by EOF; it later exits with success.

diff-6, diff-7, diff-8, and diff-9 all complete their reads before cp-4 opens 
t.pip.  They see EOF, so t.pip appears empty and they exit with failure.

4. cp-4 tries to open t.pip and blocks because there are no more diff processes.

I've run your script on Linux a few times, and it usually[2] behaves as you 
expect, with all diff processes succeeding.  For reasons I don't understand, the 
diff and cp processes apparently alternate most of the time on Linux, rather 
than having 4 or 5 diff processes lumped together between the cp processes as on 
Cygwin.

If someone can figure out the reason for the difference, and if it turns out to 
be related to the FIFO code, I could try to modify the code to make Cygwin 
behave more like Linux.

Ken

[1] From https://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html:

When attempting to read from an empty pipe or FIFO:

     * If no process has the pipe open for writing, read() shall return 0 to 
indicate end-of-file.

[2] But I did have one Linux run in which one of the ten diff processes saw an 
empty t.pip and failed as on Cygwin.
-------------- next part --------------
Run diff 10 times sequentially
1. cp 18431 and diff 18433
   37   15003 [main] cp 18431 open: open(t.pip, 0x10401)
   43   32609 [main] diff 18433 open: open(t.pip, 0x0)
103856  120245 [main] cp 18431 open: 4 = open(t.pip, 0x18401)
   74   34419 [main] diff 18433 open: 3 = open(t.pip, 0x8000)
   33  120667 [main] cp 18431 write: write(4, 0x800050000, 12)
   39  120706 [main] cp 18431 write: 12 = write(4, 0x800050000, 12)
   30  120827 [main] cp 18431 close: close(4)
   34  120910 [main] cp 18431 close: 0 = close(4)
   39   36282 [main] diff 18433 read: read(3, 0x800069FF0, 65536) blocking
   31   36378 [main] diff 18433 read: 12 = read(3, 0x800069FF0, 12)
   31   36409 [main] diff 18433 read: read(3, 0x800069FFC, 65524) blocking
   31   36475 [main] diff 18433 read: 0 = read(3, 0x800069FFC, 0)
  621   38274 [main] diff 18433 close: close(3)
  118   38392 [main] diff 18433 close: 0 = close(3)

2. diff 18434 and cp 18435
   44   32793 [main] diff 18434 open: open(t.pip, 0x0)
18300   34541 [main] cp 18435 open: 4 = open(t.pip, 0x18401)
   92   34663 [main] diff 18434 open: 3 = open(t.pip, 0x8000)
   33   35126 [main] cp 18435 write: write(4, 0x800050000, 12)
   36   35162 [main] cp 18435 write: 12 = write(4, 0x800050000, 12)
   34   35296 [main] cp 18435 close: close(4)
   35   35382 [main] cp 18435 close: 0 = close(4)
   39   36716 [main] diff 18434 read: read(3, 0x800069FF0, 65536) blocking
   33   36788 [main] diff 18434 read: 12 = read(3, 0x800069FF0, 12)
   33   36821 [main] diff 18434 read: read(3, 0x800069FFC, 65524) blocking
   31   36886 [main] diff 18434 read: 0 = read(3, 0x800069FFC, 0)
  734   38790 [main] diff 18434 close: close(3)
  134   38924 [main] diff 18434 close: 0 = close(3)

3. cp 18436 and diff 18437
   41   14772 [main] cp 18436 open: open(t.pip, 0x10401)
   44   32185 [main] diff 18437 open: open(t.pip, 0x0)
25768   41893 [main] cp 18436 open: 4 = open(t.pip, 0x18401)
   69   34057 [main] diff 18437 open: 3 = open(t.pip, 0x8000)
   43   42441 [main] cp 18436 write: write(4, 0x800050000, 12)
   46   42487 [main] cp 18436 write: 12 = write(4, 0x800050000, 12)
   35   42633 [main] cp 18436 close: close(4)
   35   42719 [main] cp 18436 close: 0 = close(4)
   36   36061 [main] diff 18437 read: read(3, 0x800069FF0, 65536) blocking
   29   36126 [main] diff 18437 read: 12 = read(3, 0x800069FF0, 12)
   32   36158 [main] diff 18437 read: read(3, 0x800069FFC, 65524) blocking
   30   36225 [main] diff 18437 read: 0 = read(3, 0x800069FFC, 0)
  646   38053 [main] diff 18437 close: close(3)
  121   38174 [main] diff 18437 close: 0 = close(3)

4. cp 18438 and diff 18439
   43   32153 [main] diff 18439 open: open(t.pip, 0x0)
24135   40474 [main] cp 18438 open: 4 = open(t.pip, 0x18401)
   77   33806 [main] diff 18439 open: 3 = open(t.pip, 0x8000)
   39   40861 [main] cp 18438 write: write(4, 0x800050000, 12)
   39   40900 [main] cp 18438 write: 12 = write(4, 0x800050000, 12)
   29   41048 [main] cp 18438 close: close(4)
   43   41151 [main] cp 18438 close: 0 = close(4)
   42   35669 [main] diff 18439 read: read(3, 0x800069FF0, 65536) blocking
   30   35729 [main] diff 18439 read: 12 = read(3, 0x800069FF0, 12)
   34   35763 [main] diff 18439 read: read(3, 0x800069FFC, 65524) blocking
   33   35831 [main] diff 18439 read: 0 = read(3, 0x800069FFC, 0)
  623   37617 [main] diff 18439 close: close(3)
  123   37740 [main] diff 18439 close: 0 = close(3)

5. cp 18440 and diff 18441
   37   15221 [main] cp 18440 open: open(t.pip, 0x10401)
   42   31887 [main] diff 18441 open: open(t.pip, 0x0)
24291   40881 [main] cp 18440 open: 4 = open(t.pip, 0x18401)
   38   33908 [main] diff 18441 open: 3 = open(t.pip, 0x8000)
   41   41256 [main] cp 18440 write: write(4, 0x800050000, 12)
   45   41301 [main] cp 18440 write: 12 = write(4, 0x800050000, 12)
   36   41445 [main] cp 18440 close: close(4)
   37   41531 [main] cp 18440 close: 0 = close(4)
   31   35775 [main] diff 18441 read: read(3, 0x800069FF0, 65536) blocking
   31   35847 [main] diff 18441 read: 12 = read(3, 0x800069FF0, 12)
   30   35877 [main] diff 18441 read: read(3, 0x800069FFC, 65524) blocking
   30   35940 [main] diff 18441 read: 0 = read(3, 0x800069FFC, 0)
  683   37830 [main] diff 18441 close: close(3)
  138   37968 [main] diff 18441 close: 0 = close(3)

6. cp 18442 and diff 18443
   35   15099 [main] cp 18442 open: open(t.pip, 0x10401)
   42   31183 [main] diff 18443 open: open(t.pip, 0x0)
23245   39624 [main] cp 18442 open: 4 = open(t.pip, 0x18401)
   77   32896 [main] diff 18443 open: 3 = open(t.pip, 0x8000)
   36   39964 [main] cp 18442 write: write(4, 0x800050000, 12)
   37   40001 [main] cp 18442 write: 12 = write(4, 0x800050000, 12)
   39   40141 [main] cp 18442 close: close(4)
   37   40233 [main] cp 18442 close: 0 = close(4)
   39   34681 [main] diff 18443 read: read(3, 0x800069FF0, 65536) blocking
   26   34744 [main] diff 18443 read: 12 = read(3, 0x800069FF0, 12)
   29   34773 [main] diff 18443 read: read(3, 0x800069FFC, 65524) blocking
   31   34839 [main] diff 18443 read: 0 = read(3, 0x800069FFC, 0)
  636   36716 [main] diff 18443 close: close(3)
  127   36843 [main] diff 18443 close: 0 = close(3)

7. cp 18444 and diff 18445
   40   14921 [main] cp 18444 open: open(t.pip, 0x10401)
   42   32057 [main] diff 18445 open: open(t.pip, 0x0)
24406   40658 [main] cp 18444 open: 4 = open(t.pip, 0x18401)
   77   33776 [main] diff 18445 open: 3 = open(t.pip, 0x8000)
   41   41052 [main] cp 18444 write: write(4, 0x800050000, 12)
   40   41092 [main] cp 18444 write: 12 = write(4, 0x800050000, 12)
   36   41229 [main] cp 18444 close: close(4)
   34   41313 [main] cp 18444 close: 0 = close(4)
   39   35627 [main] diff 18445 read: read(3, 0x800069FF0, 65536) blocking
   31   35695 [main] diff 18445 read: 12 = read(3, 0x800069FF0, 12)
   34   35729 [main] diff 18445 read: read(3, 0x800069FFC, 65524) blocking
   34   35798 [main] diff 18445 read: 0 = read(3, 0x800069FFC, 0)
  641   37591 [main] diff 18445 close: close(3)
  119   37710 [main] diff 18445 close: 0 = close(3)

8. cp 18446 and diff 18447
   43   14974 [main] cp 18446 open: open(t.pip, 0x10401)
   43   31763 [main] diff 18447 open: open(t.pip, 0x0)
24704   41011 [main] cp 18446 open: 4 = open(t.pip, 0x18401)
   90   33706 [main] diff 18447 open: 3 = open(t.pip, 0x8000)
   33   41543 [main] cp 18446 write: write(4, 0x800050000, 12)
   38   41581 [main] cp 18446 write: 12 = write(4, 0x800050000, 12)
   29   41714 [main] cp 18446 close: close(4)
   34   41793 [main] cp 18446 close: 0 = close(4)
   41   35699 [main] diff 18447 read: read(3, 0x800069FF0, 65536) blocking
   32   35763 [main] diff 18447 read: 12 = read(3, 0x800069FF0, 12)
   34   35797 [main] diff 18447 read: read(3, 0x800069FFC, 65524) blocking
   35   35866 [main] diff 18447 read: 0 = read(3, 0x800069FFC, 0)
  645   37695 [main] diff 18447 close: close(3)
  122   37817 [main] diff 18447 close: 0 = close(3)

9. cp 18448 and diff 18449
   49   31604 [main] diff 18449 open: open(t.pip, 0x0)
23502   39928 [main] cp 18448 open: 4 = open(t.pip, 0x18401)
   60   33367 [main] diff 18449 open: 3 = open(t.pip, 0x8000)
   35   40336 [main] cp 18448 write: write(4, 0x800050000, 12)
   36   40372 [main] cp 18448 write: 12 = write(4, 0x800050000, 12)
   29   40503 [main] cp 18448 close: close(4)
   36   40585 [main] cp 18448 close: 0 = close(4)
   37   35212 [main] diff 18449 read: read(3, 0x800069FF0, 65536) blocking
   25   35269 [main] diff 18449 read: 12 = read(3, 0x800069FF0, 12)
   29   35298 [main] diff 18449 read: read(3, 0x800069FFC, 65524) blocking
   30   35357 [main] diff 18449 read: 0 = read(3, 0x800069FFC, 0)
  626   37152 [main] diff 18449 close: close(3)
  118   37270 [main] diff 18449 close: 0 = close(3)

10. cp 18450 and diff 18451
   38   15186 [main] cp 18450 open: open(t.pip, 0x10401)
   44   32061 [main] diff 18451 open: open(t.pip, 0x0)
26172   42629 [main] cp 18450 open: 4 = open(t.pip, 0x18401)
   34   33628 [main] diff 18451 open: 3 = open(t.pip, 0x8000)
   34   42998 [main] cp 18450 write: write(4, 0x800050000, 12)
   43   43041 [main] cp 18450 write: 12 = write(4, 0x800050000, 12)
   37   43179 [main] cp 18450 close: close(4)
   35   43261 [main] cp 18450 close: 0 = close(4)
   36   35444 [main] diff 18451 read: read(3, 0x800069FF0, 65536) blocking
   30   35511 [main] diff 18451 read: 12 = read(3, 0x800069FF0, 12)
   35   35546 [main] diff 18451 read: read(3, 0x800069FFC, 65524) blocking
   30   35611 [main] diff 18451 read: 0 = read(3, 0x800069FFC, 0)
  696   37539 [main] diff 18451 close: close(3)
  159   37698 [main] diff 18451 close: 0 = close(3)

Run diff 10 times in parallel
a. cp 18452 has been blocking for a long time trying to open.
   diff 18459 opens; cp 18452 and diff 18459 quickly finish before the
   next diff or cp tries to open.
   41   19137 [main] cp 18452 open: open(t.pip, 0x10401)
   94   55546 [main] diff 18459 open: open(t.pip, 0x0)
351587  372145 [main] cp 18452 open: 4 = open(t.pip, 0x18401)
  142   59237 [main] diff 18459 open: 3 = open(t.pip, 0x8000)
  114  373531 [main] cp 18452 write: write(4, 0x800050000, 12)
   89  373620 [main] cp 18452 write: 12 = write(4, 0x800050000, 12)
  104  374064 [main] cp 18452 close: close(4)
   85  374284 [main] cp 18452 close: 0 = close(4)
  157   66100 [main] diff 18459 read: read(3, 0x800069FF0, 65536) blocking
  111   66329 [main] diff 18459 read: 12 = read(3, 0x800069FF0, 12)
   90   66419 [main] diff 18459 read: read(3, 0x800069FFC, 65524) blocking
   61   66584 [main] diff 18459 read: 0 = read(3, 0x800069FFC, 0)
 2233   72260 [main] diff 18459 close: close(3)
  262   72522 [main] diff 18459 close: 0 = close(3)

b. diff 18461, 18463, 18465, 18467 try to open.
   cp 18472 opens, unblocking the four diffs, which all try to read.
   diff 18465 reads successfully, the others see EOF, all before the
   next cp tries to open.
   [diff 18469 and 18471 try to open after cp 18472 has closed]
  141   66485 [main] diff 18461 open: open(t.pip, 0x0)
  137   78681 [main] diff 18463 open: open(t.pip, 0x0)
  116   83771 [main] diff 18465 open: open(t.pip, 0x0)
  141   85018 [main] diff 18467 open: open(t.pip, 0x0)
  115   31038 [main] cp 18472 open: open(t.pip, 0x10401)
  177   33462 [main] cp 18472 open: 4 = open(t.pip, 0x18401)
74078  157243 [main] diff 18463 open: 3 = open(t.pip, 0x8000)
44063  131988 [main] diff 18465 open: 3 = open(t.pip, 0x8000)
14695  103578 [main] diff 18467 open: 3 = open(t.pip, 0x8000)
   55  176955 [main] diff 18461 open: 3 = open(t.pip, 0x8000)
   94   34471 [main] cp 18472 write: write(4, 0x800050000, 12)
  105   34576 [main] cp 18472 write: 12 = write(4, 0x800050000, 12)
   95   34979 [main] cp 18472 close: close(4)
  134   35264 [main] cp 18472 close: 0 = close(4)
  203  139097 [main] diff 18465 read: read(3, 0x800069FF0, 65536) blocking
  212  110708 [main] diff 18467 read: read(3, 0x800069FF0, 65536) blocking
  252  164424 [main] diff 18463 read: read(3, 0x800069FF0, 65536) blocking
  262  184121 [main] diff 18461 read: read(3, 0x800069FF0, 65536) blocking
[  109   79383 [main] diff 18469 open: open(t.pip, 0x0)]
   97  139800 [main] diff 18465 read: 12 = read(3, 0x800069FF0, 12)
  130  139930 [main] diff 18465 read: read(3, 0x800069FFC, 65524) blocking
  160  185245 [main] diff 18461 read: 0 = read(3, 0x800069FF0, 0)
  176  140799 [main] diff 18465 read: 0 = read(3, 0x800069FFC, 0)
 2324  147324 [main] diff 18465 close: close(3)
  304  147628 [main] diff 18465 close: 0 = close(3)
  916  197811 [main] diff 18461 close: close(3)
  177  197988 [main] diff 18461 close: 0 = close(3)
   86  181046 [main] diff 18463 read: 0 = read(3, 0x800069FF0, 0)
[  111   76236 [main] diff 18471 open: open(t.pip, 0x0)]
 1247  192785 [main] diff 18463 close: close(3)
  344  193129 [main] diff 18463 close: 0 = close(3)
  122  143456 [main] diff 18467 read: 0 = read(3, 0x800069FF0, 0)
 1066  154923 [main] diff 18467 close: close(3)
  291  155214 [main] diff 18467 close: 0 = close(3)

c. diff 18469, 18471 have already tried to open.
   Now diffs 18474, 18476, 18477 also try.
   cp 18479 opens, unblocking the five diffs, which all try to read.
   diff 18477 reads successfully, the others see EOF.
  117   75684 [main] diff 18474 open: open(t.pip, 0x0)
   57   68789 [main] diff 18476 open: open(t.pip, 0x0)
   51   53390 [main] diff 18477 open: open(t.pip, 0x0)
   37   14787 [main] cp 18479 open: open(t.pip, 0x10401)
   98   16125 [main] cp 18479 open: 4 = open(t.pip, 0x18401)
 5873   60930 [main] diff 18477 open: 3 = open(t.pip, 0x8000)
41946  121620 [main] diff 18474 open: 3 = open(t.pip, 0x8000)
23125   93685 [main] diff 18476 open: 3 = open(t.pip, 0x8000)
65628  146078 [main] diff 18471 open: 3 = open(t.pip, 0x8000)
   47  175332 [main] diff 18469 open: 3 = open(t.pip, 0x8000)
   64   16848 [main] cp 18479 write: write(4, 0x800050000, 12)
   86   16934 [main] cp 18479 write: 12 = write(4, 0x800050000, 12)
   81   17241 [main] cp 18479 close: close(4)
   82   17392 [main] cp 18479 close: 0 = close(4)
   72   64964 [main] diff 18477 read: read(3, 0x800069FF0, 65536) blocking
   75  125673 [main] diff 18474 read: read(3, 0x800069FF0, 65536) blocking
   83   97764 [main] diff 18476 read: read(3, 0x800069FF0, 65536) blocking
   78  150184 [main] diff 18471 read: read(3, 0x800069FF0, 65536) blocking
   98  179463 [main] diff 18469 read: read(3, 0x800069FF0, 65536) blocking
   44   65392 [main] diff 18477 read: 12 = read(3, 0x800069FF0, 12)
   74   65466 [main] diff 18477 read: read(3, 0x800069FFC, 65524) blocking
   63  150850 [main] diff 18471 read: 0 = read(3, 0x800069FF0, 0)
   37   65904 [main] diff 18477 read: 0 = read(3, 0x800069FFC, 0)
  761   68071 [main] diff 18477 close: close(3)
  183   68254 [main] diff 18477 close: 0 = close(3)
  342  155780 [main] diff 18471 close: close(3)
  104  155884 [main] diff 18471 close: 0 = close(3)
   34  195050 [main] diff 18469 read: 0 = read(3, 0x800069FF0, 0)
  354  200567 [main] diff 18469 close: close(3)
  220  200787 [main] diff 18469 close: 0 = close(3)
   47  129311 [main] diff 18476 read: 0 = read(3, 0x800069FF0, 0)
   58  157414 [main] diff 18474 read: 0 = read(3, 0x800069FF0, 0)
  529  137014 [main] diff 18476 close: close(3)
  161  137175 [main] diff 18476 close: 0 = close(3)
  560  165180 [main] diff 18474 close: close(3)
  188  165368 [main] diff 18474 close: 0 = close(3)

d. cp 18480 tries to open but blocks because there are no more diff processes.
   35   15508 [main] cp 18480 open: open(t.pip, 0x10401)
-------------- next part --------------
#!/bin/sh

rm -f t.pip t.txt

mkfifo t.pip
printf "line1\nline2\n" >t.txt

ps

{
  while true ; do
    cp t.txt t.pip
  done
} &

rm -f t.rc.*
for rc in 0 1 2 3 4 5 6 7 8 9 ; do
  {
    diff t.pip t.txt
    echo $? >t.rc.$rc
  }
done
echo result1 start
ls  t.rc.* | wc -l
cat t.rc.*
echo result1 end

rm -f t.rc.*
for rc in 0 1 2 3 4 5 6 7 8 9 ; do
  {
    diff t.pip t.txt
    echo $? >t.rc.$rc
  } & # run the readers in parallel
done
sleep 10
echo result2 start
ls  t.rc.* | wc -l
cat t.rc.*
echo result2 end

ps


More information about the Cygwin mailing list