Re: questions about bug 4737 (fork is not async-signal-safe)

I think I found the corrupted _IO_list_all problem.

It has nothing to do with the earlier mentioned discussion on
"This is an application SW design problem which can cause deadlocks."

Nor has it anything to do with the use of fork from within a
signal handler.

The problem is dprintf/vdprintf.
If a multi-threaded application uses fork and dprintf (from different
threads at about the same time), the child process can crash
in fresetlockfiles.

dprintf adds a temporary struct _IO_FILE_plus (tmpfil), whose
_lock member is NULL, to the global _IO_list_all.
Here's the code I'm talking about:

int
_IO_vdprintf (d, format, arg)
     int d;
     const char *format;
     _IO_va_list arg;
{
  struct _IO_FILE_plus tmpfil;
  struct _IO_wide_data wd;
  int done;

#ifdef _IO_MTSAFE_IO
  tmpfil.file._lock = NULL;
#endif
  _IO_no_init (&tmpfil.file, _IO_USER_LOCK, 0, &wd, &_IO_wfile_jumps);
  _IO_JUMPS (&tmpfil) = &_IO_file_jumps;
  INTUSE(_IO_file_init) (&tmpfil);
#if  !_IO_UNIFIED_JUMPTABLES
  tmpfil.vtable = NULL;
#endif
  if (INTUSE(_IO_file_attach) (&tmpfil.file, d) == NULL)
    {
      INTUSE(_IO_un_link) (&tmpfil);
      return EOF;
    }
  tmpfil.file._IO_file_flags =
    (_IO_mask_flags (&tmpfil.file, _IO_NO_READS,
                     _IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING)
     | _IO_DELETE_DONT_CLOSE);

  done = INTUSE(_IO_vfprintf) (&tmpfil.file, format, arg);

  _IO_FINISH (&tmpfil.file);

  return done;
}

Once _IO_file_init returns, tmpfil has been added to _IO_list_all
and the list_all_lock has been released.
If another thread calls fork at this point (before tmpfil
has been removed from _IO_list_all), the child process
crashes in fresetlockfiles.
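
A minimal sketch of how this race could be exercised, assuming a
POSIX system with /dev/null. The names count_crashed_children and
writer_thread are made up for illustration; this is not the test
case that was actually used. One thread calls dprintf in a loop
while the caller forks repeatedly and counts children that die from
a signal. On a C library without the problem it should report zero;
on an affected one the window is tiny, so many iterations may be
needed and a crash is still not guaranteed.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static atomic_int stop_writer;

/* Writer thread (hypothetical name): calls dprintf in a tight loop so
   that fork in the other thread has a chance to land inside a call. */
static void *writer_thread(void *arg)
{
    int fd = *(int *)arg;
    while (!atomic_load(&stop_writer))
        dprintf(fd, "tick %d\n", 42);
    return NULL;
}

/* Forks `iterations` times while the writer runs. Returns the number of
   children killed by a signal, or -1 on a setup error. */
int count_crashed_children(int iterations)
{
    int fd = open("/dev/null", O_WRONLY);
    if (fd < 0)
        return -1;

    pthread_t tid;
    atomic_store(&stop_writer, 0);
    if (pthread_create(&tid, NULL, writer_thread, &fd) != 0) {
        close(fd);
        return -1;
    }

    int crashed = 0;
    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);           /* child got through fork without crashing */
        if (pid < 0)
            continue;           /* fork failed; skip this round */
        int status;
        if (waitpid(pid, &status, 0) == pid && WIFSIGNALED(status))
            crashed++;
    }

    atomic_store(&stop_writer, 1);
    pthread_join(tid, NULL);
    close(fd);
    return crashed;
}
```

Calling count_crashed_children(100000) from main and printing the
result is enough to try it; build with -pthread.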

It crashes because fresetlockfiles re-initializes the file locks
by writing through the _lock member (setting each lock to the
default "_IO_lock_initializer" value). But the _lock member of
the "struct _IO_FILE_plus" coming from dprintf is NULL.
Here's the code I'm talking about:

static void
fresetlockfiles (void)
{
  _IO_ITER i;

  for (i = _IO_iter_begin(); i != _IO_iter_end(); i = _IO_iter_next(i))
    _IO_lock_init (*((_IO_lock_t *) _IO_iter_file(i)->_lock));
}
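
The failure mode can be modelled outside glibc with a toy list.
Everything below (toy_lock_t, toy_file, toy_reset_locks) is a
hypothetical stand-in, not glibc code: an entry whose lock pointer
is NULL plays the role of dprintf's tmpfil. The sketch skips such
entries; a loop without that check would dereference NULL, which is
the crash described above. It is only an illustration of the idea,
not a proposed patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for _IO_lock_t. */
typedef struct {
    int state;                  /* 0 = freshly initialized */
} toy_lock_t;

/* Hypothetical stand-in for an entry in _IO_list_all. */
struct toy_file {
    toy_lock_t *lock;           /* NULL models dprintf's tmpfil */
    struct toy_file *next;
};

/* Re-initializes every lock in the list, skipping entries without one.
   Returns the number of locks that were re-initialized. */
int toy_reset_locks(struct toy_file *head)
{
    int count = 0;
    for (struct toy_file *f = head; f != NULL; f = f->next) {
        if (f->lock == NULL)
            continue;           /* an unguarded loop would write through NULL here */
        f->lock->state = 0;
        count++;
    }
    return count;
}
```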

The chance for this problem to occur is very small. Another
thread has to call fork after dprintf or vdprintf (_IO_vdprintf)
has added tmpfil to _IO_list_all and before it has removed it
again. This is a very tiny window.

I did check the source code of glibc-latest and it seems
the problem is still there.

Anyway, I could simply work around our problem by avoiding
dprintf (we now use sprintf + write(2)).
So now we can happily continue using glibc-2.7 on our
powerpc 32bit platform :-)
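
A minimal sketch of that kind of workaround, not our actual code.
The helper name fd_printf and the 512-byte buffer are assumptions,
and it uses vsnprintf rather than plain sprintf so that oversized
output is truncated instead of overflowing the buffer. The text is
formatted into a local buffer and handed to write(2), so dprintf is
never called.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: formats into a stack buffer and writes the
   result to fd with write(2). Output longer than the buffer is
   truncated. Returns the number of bytes written, or -1 on error. */
int fd_printf(int fd, const char *format, ...)
{
    char buf[512];
    va_list ap;

    va_start(ap, format);
    int len = vsnprintf(buf, sizeof buf, format, ap);
    va_end(ap);
    if (len < 0)
        return -1;
    if ((size_t)len >= sizeof buf)
        len = (int)sizeof buf - 1;      /* output was truncated */

    int written = 0;
    while (written < len) {
        ssize_t n = write(fd, buf + written, (size_t)(len - written));
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* retry an interrupted write */
            return -1;
        }
        written += (int)n;
    }
    return written;
}
```

A call such as fd_printf(fd, "state=%d\n", state) then replaces
dprintf(fd, "state=%d\n", state).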

Norbert van Bolhuis.
AimValley B.V.

