This is the mail archive of the glibc-bugs@sources.redhat.com mailing list for the glibc project.



[Bug nptl/651] New: The getpid() cache (and TLS) is not invalidated after clone() calls


Version: Reproduced with every recent glibc we tried. We also have indirect
evidence that the bug has existed since Linux 2.6.0 was released: it affects
UML in a specific way, and that symptom was first reported as a UML bug against
NPTL glibc right after 2.6.0 came out.

Below you find the original description and a test program, from Bodo Stroesser 
( bstroesser (!) fujitsu-siemens (dot) com ).

When he talks about static and dynamic linking, he is referring to the fact
that most distros (even Fedora) stay 2.4-compatible by using LinuxThreads in
/lib (or /lib/i686) and for static linking, while the NPTL lib for 2.6 (dynamic
linking only) lives in /lib/tls.

"I wrote a small test program (attached). Please compile it with:

    gcc -static -o test_getpid_static test_getpid.c        and
    gcc -o test_getpid_nptl test_getpid.c

Using the two programs, you can see the following:
1) having linked my test with -static, each "getpid()" in the test results in a
    syscall (try "strace test_getpid_static")
2) linking without -static (I assume this means using NPTL), only the first
    getpid() does a syscall; I guess the further calls deliver a pid-value
    buffered in the lib! (try "strace test_getpid_nptl")
The test here requests and prints out its pid twice, then it exits. But strace
will show you two getpid()-calls only in case of the _static program.

3) the pid-history seen in 1 and 2 is used even for a child created with
    "clone()", regardless of which clone-flags are used! Try
    "test_getpid_static clone" and "test_getpid_nptl clone" to see what happens.
    After printing its pid twice, the program now creates a child via "clone()".
    The child requests its *real* pid via a "by-hand-syscall". Then it stops
    itself and is ptraced by the father, which prints out a message if the
    child does a *real* getpid()-syscall.
    Note: If you remove the two getpid()-calls at the beginning of main(), the
    child will work correctly even with NPTL, since there isn't yet a buffered
    pid ...

Summary: I guess, the behavior of NPTL is a bug.

Bodo"

I have to add something more:

*) Since glibc probably uses its TLS support to cache the getpid() value, I
guess that there will be problems with the whole TLS area itself (no separate
TLS area is set up for the child).

*) The test program behaves correctly with LinuxThreads, so NPTL *must* be fixed
to support it, even if the clone() code in glibc is supposed to simply wrap a
syscall. I reject in advance any argument about whether it is appropriate to add
TLS setup code to a syscall wrapper. If doing the TLS setup inside clone() is
too problematic, then the getpid() caching must be dropped from the lib (it
seems a micro-optimization anyway, and Linus Torvalds disliked it too).

If there is any heavy user of getpid() inside the threading code, a
cached_getpid() could be added for it (but you could hit this problem again with
clone(), especially if you export it to userspace; for cached_getpid() this
"bug" could simply be documented in the specification).

Also, I think the whole TLS area setup must be done by clone() for the child,
because a programmer using the raw clone() together with TLS has no simple way
to do it while staying glibc compatible: he can mimic the glibc behaviour to
achieve compatibility (not sure if this is indeed possible), but that would
break whenever glibc internals change.

The test program follows.

Thanks for the attention - good bye!

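#define _GNU_SOURCE             /* for the clone() declaration in <sched.h> */
#include <stdio.h>
#include <stdlib.h>             /* exit() */
#include <string.h>             /* strcmp() */
#include <signal.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <sys/ptrace.h>
#include <asm/ptrace.h>
#include <unistd.h>
#include <asm/unistd.h>
#include <errno.h>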

char childstack[ 8*1024];


/* Issue the getpid syscall by hand via 'int $0x80' (i386 only), bypassing glibc. */
int my_getpid()
{
        long res;
        __asm__ volatile ("int $0x80\n\t"
                                : "=a" (res)
                                : "0" (__NR_getpid));
        return res;
}


int
child_fn( void * unused)
{
        pid_t me = my_getpid();

        printf("Child: my PID via 'int $0x80' is %d\n", me);

        /* Let the parent trace us, then stop until it resumes us. */
        ptrace(PTRACE_TRACEME, 0, 0, 0);
        kill( me, SIGSTOP);

        printf("Child: my PID via 'getpid()' is %d\n", getpid());

        return 0;
}


int
main( int argc, char ** argv)
{
        int ret, status, suppress=0;
        pid_t child;

        /* Call getpid() twice so that NPTL has a buffered PID before clone(). */
        printf("Parent: my PID is %d\n", getpid());
        printf("Parent: my PID is %d\n", getpid());

        if ( argc < 2 || strcmp( argv[1], "clone") )
                return 0;

        child = clone( child_fn, childstack+8*1024-4, SIGCHLD, NULL);

        if ( child < 0 ) {
                perror("clone");
                exit(1);
        }
        printf("Parent: child's PID is %d\n", child);

        /* Wait until the child has stopped itself. */
        do {
                ret = waitpid(child, &status, WUNTRACED);
                if(ret < 0) {
                        perror("waitpid()");
                        exit(1);
                }
        } while(WIFSTOPPED(status) && (WSTOPSIG(status) == SIGVTALRM));

        if(!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGSTOP)) {
                printf("Parent: waitpid(): expected SIGSTOP, got status = %d\n", status);
                return 1;
        }

        /* Run the child from one syscall stop to the next. */
        while (1) {
                if ( ptrace( PTRACE_SYSCALL, child, (void *)0, 0) < 0 ) {
                        perror("ptrace( PTRACE_SYSCALL, child, 0, 0)");
                        exit(1);
                }
                ret = waitpid( child, &status, 0);
                if ( ret != child ) {
                        perror("waitpid");
                        return 1;
                }
                if ( WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP ) {
                        /* Read the syscall number from the child's saved registers. */
                        errno = 0;
                        ret = ptrace( PTRACE_PEEKUSER, child, (void *)(ORIG_EAX*4), NULL);
                        if ( errno ) {
                                perror("ptrace( PTRACE_PEEKUSER, child, ORIG_EAX, NULL)");
                                return 1;
                        }
                        if ( ret == __NR_getpid ) {
                                /* Each syscall stops twice (entry and exit); report only the entry. */
                                if ( (suppress ^= 1) )
                                        printf("Parent: Child does a getpid() syscall!\n");
                        }
                }
                else {
                        printf("Parent: Child's status is %x: exiting\n", status);
                        return (status != 0);
                }
        }
}

-- 
           Summary: The getpid() cache (and TLS) is not invalidated after
                    clone() calls
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
        AssignedTo: drepper at redhat dot com
        ReportedBy: blaisorblade_spam at yahoo dot it
                CC: glibc-bugs at sources dot redhat dot com


http://sources.redhat.com/bugzilla/show_bug.cgi?id=651


