Re: problems with gawk 3.1.5-3 hanging -- more info

On Thu, 30 Mar 2006, David Carter wrote:

> Igor Peshansky wrote:
> > On Thu, 30 Mar 2006, David Carter wrote:
> > > It appears to me that, because gawk opens the file as O_TEXT, it is
> > > hanging while it waits for an LF char to follow the CR (which never
> > > comes). Does this sound likely to you?
> >
> > If this theory were true, "echo -ne 'aa\rb' | gawk '{print $0}'" would
> > hang.  It doesn't for me, even with textmode pipes...
> Yes, I realized this myself soon after posting. Your echo command
> doesn't hang for me either. As I said in my original post, this is one
> of those annoying bugs that if I try to make it hang interactively, it
> always works correctly (never hangs), but if I try to do it with my
> regular script, it (usually, but not always) hangs.  This is another
> clue that my initial "theory" was incorrect: if it were true, the
> program would hang regardless.
> Here's an example line, callable from a prompt, that usually hangs:
> $ rsync -Pv sourcefile rmachine:/rpath/ | \
>   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
> To test this, I recommend using a source/remote combination for rsync
> that will take about 30 seconds to a minute to complete. This will
> create enough output for gawk to replicate the issue.
> If this hangs (it may not hang the first time; give it 2 or 3 runs),
> you'll stop getting output to stdout and it will just sit there. If you
> go to another prompt to do a ps, you'll see that rsync is done running
> but gawk is still sitting there. CTRL+C in the window running the script
> does nothing. You need to kill the gawk process from another bash
> prompt.
> > Try saving the output of rsync to file and running gawk over that
> > separately...
> Good idea. Per your advice, I tried doing something like the following:
> $ rsync -Pv sourcefile rmachine:/rpath/ > rsync.out

I would at least try "$ rsync -Pv ... | cat > rsync.out", to make sure it
goes through a pipe first.

> $ cat rsync.out | \
>   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

Does "gawk 'BEGIN ...' < rsync.out" hang?
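
For reference, a minimal sketch of the ways of feeding gawk discussed so
far, using a synthetic stand-in for the rsync progress output so that no
remote machine is needed. The sample text and the AWK variable are
illustrative assumptions, and the fallback to plain awk assumes an
implementation that accepts a regular expression as RS (gawk and mawk do).
This only compares output; it is not expected to reproduce the hang:

```shell
# Synthetic stand-in for "rsync -Pv" output: progress updates separated
# by CR, ordinary lines ended by LF. (Made-up sample, not real rsync output.)
printf 'file.bin\r  10%%\r  55%%\r 100%%\ndone\n' > rsync.out

# Use gawk if present; otherwise fall back to awk (assumed to support regex RS).
AWK=$(command -v gawk || command -v awk)

# The same program as in the thread; only the source of stdin changes below.
prog='BEGIN { RS = "\r|\n" } { print $0; fflush() }'

out_cat=$(cat rsync.out | "$AWK" "$prog")     # stdin is a pipe fed by cat
out_redirect=$("$AWK" "$prog" < rsync.out)    # stdin is the file itself

if [ "$out_cat" = "$out_redirect" ]; then
    echo "same output from pipe and from file"
else
    echo "output differs between pipe and file"
fi
```

If the real pipeline hangs while both of these complete, that points at
how the data arrives (timing, pipe handling) rather than at its content.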

> Surprisingly, that code never hangs. Also, this never hangs:
> $ rsync -Pv sourcefile rmachine:/rpath/ | xxd | xxd -r | \
>   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
> However, this usually hangs:
> $ rsync -Pv sourcefile rmachine:/rpath/ | cat |
>   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

Sounds like it would also not hang if you added "nobinmode" to your CYGWIN
environment variable.

Also, does it help if you use the octal ASCII codes for \r and \n instead
(i.e., 'BEGIN { RS = "\015|\012" } ...')?
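
To spell that out: \015 is the octal code for CR and \012 the octal code
for LF, so both spellings should describe the same set of separators. A
small sketch to check that on a given awk, assuming a made-up three-record
input and an awk that accepts a regular expression as RS (gawk and mawk
do):

```shell
# Use gawk if present; otherwise fall back to awk (assumed to support regex RS).
AWK=$(command -v gawk || command -v awk)

# Split the same input with the named escapes and with the octal escapes.
by_name=$(printf 'aa\rbb\ncc\n' | "$AWK" 'BEGIN { RS = "\r|\n" } { print NR ": " $0 }')
by_octal=$(printf 'aa\rbb\ncc\n' | "$AWK" 'BEGIN { RS = "\015|\012" } { print NR ": " $0 }')

printf '%s\n' "$by_octal"

if [ "$by_name" = "$by_octal" ]; then
    echo "both RS spellings give identical records"
fi
```

If the two ever disagree, the escape handling in that particular build
would be worth a closer look.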

> > Also, if gawk really hangs, you can run it under strace to see exactly
> > what it was doing up to the hang (but please don't post the strace
> > output unless you're asked to do so by Corinna or CGF).
> I tried something like the following:

I'll let others try to figure out the strace output -- no ideas at the
moment.

> All of this makes me wonder if:
>   a) rsync is perhaps doing something with its stdout file descriptor
> that it shouldn't be doing, or that;
>   b) gawk is perhaps doing something with its stdin file descriptor that
> it shouldn't be doing.
> If a), then why doesn't it break when I just redirect the output of
> rsync to a file?

Because in one case the input comes from a pipe, and in the other from a
file.  Those are different.

> If b), then what is it about piping the output of rsync to gawk that is
> different (from gawk's point of view) than when I just save the rsync
> output to a file and then send the contents of the file to gawk?

Again, completely different mechanisms are invoked within Cygwin when
reading from a pipe and from a file.
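
A rough way to see the difference from the reading process's side is to
ask what kind of object is attached to stdin. This is only a sketch: it
assumes /dev/stdin is available (it is on Linux and under Cygwin), and
stdin_kind and sample.out are hypothetical names made up for the
illustration. It says nothing about what Cygwin does internally in each
case:

```shell
# Report what kind of object is attached to standard input.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        kind=pipe
    elif [ -f /dev/stdin ]; then
        kind=file
    else
        kind=other
    fi
    echo "$kind"
}

printf 'x\n' > sample.out

# Same data, two different ways of attaching it to stdin.
# The extra "cat > /dev/null" drains the pipe so the writer can finish cleanly.
kind_from_pipe=$(cat sample.out | { stdin_kind; cat > /dev/null; })
kind_from_file=$(stdin_kind < sample.out)

echo "cat sample.out | ...  -> $kind_from_pipe"
echo "... < sample.out      -> $kind_from_file"
```

The reader gets the same bytes either way, but through a different kind of
descriptor, which is why a bug can show up on one path and not the other.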

> And another thing...why would any of this make any difference if gawk
> opens the file as O_TEXT vs O_BINARY?

Again, no ideas yet.
