This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
questions about bug 4737 (fork is not async-signal-safe)
- From: Norbert van Bolhuis <nvbolhuis at aimvalley dot nl>
- To: libc-help at sourceware dot org
- Date: Tue, 10 May 2011 16:31:30 +0200
- Subject: questions about bug 4737 (fork is not async-signal-safe)
I have some questions regarding libc bug 4737, see:
http://sourceware.org/bugzilla/show_bug.cgi?id=4737
The reason I ask is that I'm running into the corrupted _IO_list_all
problem in some of our application threads that use system(3)
(which is implemented with fork + exec).
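For readers unfamiliar with that pattern, here is a minimal sketch of how a system(3)-like call can be built from fork + exec. It is not glibc's actual implementation (which also adjusts signal dispositions); the name run_command is hypothetical and used only for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not glibc code: run `command` via /bin/sh and
   return the wait status, or -1 if fork or waitpid fails. */
int run_command(const char *command)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child: replace the process image with the shell. */
        execl("/bin/sh", "sh", "-c", command, (char *) NULL);
        _exit(127);                     /* reached only if exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

The crash discussed here happens in the child, between fork and exec, while the atfork child handlers run.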
To be precise, the child process crashes in fresetlockfiles
on line 48:
42  static void
43  fresetlockfiles (void)
44  {
45    _IO_ITER i;
46
47    for (i = _IO_iter_begin(); i != _IO_iter_end(); i = _IO_iter_next(i))
48      _IO_lock_init (*((_IO_lock_t *) _IO_iter_file(i)->_lock));
49  }
"glibc-2.7/nptl/sysdeps/unix/sysv/linux/fork.c"
Others have seen this as well, see:
http://sourceware.org/ml/libc-help/2010-06/msg00027.html
http://sourceware.org/ml/libc-help/2010-07/msg00001.html
http://sourceware.org/ml/libc-help/2010-03/msg00008.html
http://plash.beasts.org/wiki/PlashIssues/ForkDeadlock
I believe it is caused by our application, which can stop
its threads asynchronously using pthread_cancel.
Note that libc implements pthread_cancel by sending
SIGCANCEL (signal number 32) with tkill.
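One possible way to keep a cancellation request from landing in the middle of system(3) is to disable cancellation around the call, as sketched below. Whether this avoids the crash described here is an assumption, not something confirmed; the wrapper name system_nocancel is hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical wrapper: run system(3) with cancellation disabled for the
   calling thread. A cancellation request that arrives meanwhile stays
   pending and is acted on only after the previous state is restored. */
int system_nocancel(const char *command)
{
    int old_state;
    int ignored;
    int rc;

    if (pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state) != 0)
        return -1;

    rc = system(command);

    /* Restore whatever cancel state the caller had before. */
    pthread_setcancelstate(old_state, &ignored);
    return rc;
}
```

A thread would then call system_nocancel("...") wherever it currently calls system("...").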
Does anyone know exactly why _IO_list_all can get corrupted?
Is it not allowed to pthread_cancel threads that use system(3)?
Why is bug 4737 still open?
(It almost looks as if it is not accepted that fork is not
async-signal-safe; at least my signal(7) man page says fork is
async-signal-safe.)
Can the same problem occur because of interrupts (instead of signals)?
I'd think that if an asynchronous hardware IRQ interrupts the atfork
prepare handlers and the scheduler decides to continue the child
process (so the parent process did not complete the atfork parent
handlers), the same problem would occur.
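To make the terms concrete, here is a minimal sketch of the prepare/parent/child handler mechanism using pthread_atfork. The lock, the counters, and the function fork_with_handlers are hypothetical and only illustrate the ordering; glibc's internal stdio handling around fork is a separate mechanism.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical state guarded across fork. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int prepare_calls, parent_calls, child_calls;

/* Runs in the parent before fork: take the lock so no other thread
   holds it at the moment the address space is copied. */
static void on_prepare(void) { pthread_mutex_lock(&demo_lock); prepare_calls++; }

/* Runs in the parent after fork: release the lock again. */
static void on_parent(void) { parent_calls++; pthread_mutex_unlock(&demo_lock); }

/* Runs in the child after fork: release the child's copy of the lock. */
static void on_child(void) { child_calls++; pthread_mutex_unlock(&demo_lock); }

/* Fork once with the handlers installed; return the child's wait status,
   or -1 on error. The child exits 0 if its child handler ran once. */
int fork_with_handlers(void)
{
    static int registered;
    if (!registered) {
        if (pthread_atfork(on_prepare, on_parent, on_child) != 0)
            return -1;
        registered = 1;
    }

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(child_calls == 1 ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```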
Is this bug fixed in the latest glibc, and/or does the following
commit fix the issue?
http://sourceware.org/git/?p=glibc.git;a=commit;h=9437b427cec6266abd303983848549a5c4ba0d0a
Thanks,
Norbert van Bolhuis.