This is sources Bugzilla
Bugzilla Version 2.17.5
Bugzilla Bug 6910
  getpid() wrong in child's signal handler after clone() Last modified: 2008-09-23 01:12
Bug#: 6910   Reporter: Michael Kerrisk <mtk.manpages@gmail.com>
Status: RESOLVED
Resolution: WONTFIX
Assigned To: Ulrich Drepper <drepper@redhat.com>

Attachment: clone_getpid_sighandler_bug.c (Test program, text/plain, created 2008-09-22 11:47)

Description:   Opened: 2008-09-22 11:44
As of glibc 2.8, glibc's caching of PIDs for getpid() means that if a signal is
delivered to the child soon after a clone() (i.e., before the child has had a
chance to update the cache), then a call to getpid() within the signal handler
in the child returns the wrong value.
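
To make the stale-cache window concrete, here is a toy model in C. It is
not glibc's implementation: the names toy_cached_pid, toy_getpid(),
toy_refresh_cache() and toy_stale_window_demo() are hypothetical, and the
cache is an ordinary static variable. The sketch only illustrates that a
value cached in the parent stays visible in a new child until the child
refreshes it.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Toy cache, NOT glibc's implementation: 0 means "not yet filled". */
static pid_t toy_cached_pid = 0;

/* Return the cached PID, asking the kernel only on first use. */
static pid_t toy_getpid(void)
{
    if (toy_cached_pid == 0)
        toy_cached_pid = (pid_t)syscall(SYS_getpid);
    return toy_cached_pid;
}

/* Overwrite the cache with the PID the kernel reports right now. */
static void toy_refresh_cache(void)
{
    toy_cached_pid = (pid_t)syscall(SYS_getpid);
}

/* Returns 1 if the child saw a stale value before refreshing and a
 * correct value afterwards, 0 otherwise, and -1 on error. */
int toy_stale_window_demo(void)
{
    (void)toy_getpid();                 /* fill the cache in the parent */

    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        pid_t real = (pid_t)syscall(SYS_getpid);
        /* The child inherited the parent's cached value. */
        int stale_before = (toy_getpid() != real);
        toy_refresh_cache();
        int fresh_after = (toy_getpid() == real);
        _exit((stale_before && fresh_after) ? 1 : 0);
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A signal handler that runs in the child between the creation of the child
and the refresh would observe the stale value, which is the situation
this report describes.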

To test this, the attached program creates a child process that continuously
sends a SIGQUIT signal to the process group.  Meanwhile the parent loops
creating children that sleep for a moment, and then terminate.  In that time,
the SIGQUIT handler will be invoked in the child.  If the getpid() cache has not
yet been updated, then it will (occasionally) happen that the values returned by
glibc's getpid() and a raw syscall(SYS_getpid) will not match.  When that
occurs, the child prints a message noting the mismatch.

If this program is invoked with any command-line argument, then it uses fork()
instead of clone().  This can be used to show that the problem does not occur
for fork().
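
The check performed in the handler can be sketched as follows. This is
not the attached test program: quit_handler() and run_fork_check() are
hypothetical names, the child signals itself with raise() rather than
receiving SIGQUIT from a separate sender process, and only the fork()
case is exercised, where no mismatch is expected. The handler records its
result in flags rather than printing, since printf() is not
async-signal-safe.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Flags written by the handler and read by the child afterwards. */
static volatile sig_atomic_t pid_mismatch = 0;
static volatile sig_atomic_t handler_ran = 0;

/* Hypothetical handler: compare the glibc getpid() result with the PID
 * reported directly by the kernel via a raw system call. */
static void quit_handler(int sig)
{
    (void)sig;
    pid_t libc_pid = getpid();
    pid_t kernel_pid = (pid_t)syscall(SYS_getpid);
    if (libc_pid != kernel_pid)
        pid_mismatch = 1;
    handler_ran = 1;
}

/* Fork a child, deliver SIGQUIT to it, and report what its handler saw.
 * Returns 0 (values matched), 1 (mismatch), or -1 on error. */
int run_fork_check(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = quit_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        if (sigaction(SIGQUIT, &sa, NULL) == -1)
            _exit(2);
        raise(SIGQUIT);                 /* handler runs before raise() returns */
        _exit(handler_ran ? (int)pid_mismatch : 2);
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return (code == 2) ? -1 : code;     /* 2 means the child could not run the check */
}
```

Calling run_fork_check() from main() and treating a return value of 1 as
a mismatch mirrors the comparison described above. Reproducing the
clone() case would additionally need a child stack and a concurrent
signal sender, which this sketch leaves out.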

------- Additional Comment #1 From Michael Kerrisk 2008-09-22 11:47 -------
Created an attachment (id=2959)
Test program

When running this program on glibc 2.8 on an i386 system, I see output such as
the following:

$ ./clone_getpid_sighandler_bug
Before clone getpid() = 1991
sigsender PID = 1993
getpid() mismatch (loop=2710): getpid()=1991; syscall(SYS_getpid)=4823
getpid() mismatch (loop=5383): getpid()=1991; syscall(SYS_getpid)=7504
getpid() mismatch (loop=5383): getpid()=1991; syscall(SYS_getpid)=7504

------- Additional Comment #2 From Ulrich Drepper 2008-09-22 23:56 -------
You cannot use clone this way.  In fact, nobody should use clone.  There are
assumptions made in the system about the way clone is used.  If you want to use
clone you have to do everything yourself, including preparing the thread descriptor.

------- Additional Comment #3 From Michael Kerrisk 2008-09-23 01:12 -------
(In reply to comment #2)
> You cannot use clone this way.  In fact, nobody should use clone.  There are
> assumptions made in the system about the way clone is used.  If you want to use
> clone you have to do everything yourself, including preparing the thread
> descriptor.

All of this does kind of beg the question: why does glibc provide a clone() 
wrapper then?
