Summary: | Cancel from printf not calling the cancel handler | ||
---|---|---|---|
Product: | glibc | Reporter: | Steven Munroe <sjmunroe> |
Component: | nptl | Assignee: | Ulrich Drepper <drepper.fsp> |
Status: | RESOLVED WORKSFORME | ||
Severity: | critical | CC: | amodra, bergner, glibc-bugs |
Priority: | P1 | Flags: | fweimer: security- |
Version: | 2.4 | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: | ||
Attachments: | C++ test case that demonstrates the problem; A C version of the test case.; glibc-bz2748.patch | ||
Description
Steven Munroe
2006-06-09 22:09:57 UTC
Created attachment 1073
C++ test case that demonstrates the problem
Compile with:
g++ -g -O0 thct_wrl2.C -lpthread -o thct_wrl2
Run with:
THCT_USE_CANCEL=1 ./thct_wrl2 4
So far I have verified this failure on recent GLIBC (cvs from 06/08/2006) for ia32 (i586), powerpc32 and powerpc64. We don't see the failure on X86_64. I don't have access to other platforms at this time. I don't see failures on any platform for glibc-2.3.3 or glibc-2.3.4. I have not looked at 2.3.5 or 2.3.6.
It looks like we get into vfprintf, which calls _pthread_cleanup_pop_restore(), which detects the deferred cancellation, and we fall into CANCELLATION_P (self). This ends up calling pthread_unwind, which (at least for powerpc) ends up in the libgcc unwind code. This is where things go badly.
Created attachment 1085
A C version of the test case.
Here is a C version of the test case with the problematic source extracted into
its own source file (bug.c). Compiling bug.c with -fexceptions is all that is
needed to recreate the problem. This does fail as a 32-bit x86 app as well as
32-bit and 64-bit ppc apps. With this test case, you no longer need to set the
env var.
linux% ./thct_wrl2 8
The code we're having problems with from bug.c is:
    void thd_thread_2 (unsigned int ndx)
    {
      pthread_cleanup_push ((void (*)(void*))thd_cleanup, &ndx);
      thread_body(ndx);
      pthread_cleanup_pop (1);
    }
This test case does seem to work with older glibcs (eg, 2.3.4).
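For reference, the behaviour the test expects can be sketched in isolation. This is a minimal illustration, not the attached test case: the names on_cancel, worker, cleanup_ran and run_cancel_demo are hypothetical, and pthread_testcancel() stands in for printf as an explicit cancellation point so the outcome does not depend on whether printf acts on the request. With deferred cancellation (the default), the handler registered by pthread_cleanup_push should run when the thread is cancelled.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical flag, set by the cleanup handler so the caller can check it. */
static int cleanup_ran;

/* Cleanup handler: records that it was invoked. */
static void on_cancel(void *arg)
{
    int *flag = arg;
    *flag = 1;
}

/* Thread body: loops until a pending cancellation request is acted on. */
static void *worker(void *arg)
{
    pthread_cleanup_push(on_cancel, arg);
    for (;;) {
        pthread_testcancel();   /* explicit cancellation point */
        sched_yield();
    }
    pthread_cleanup_pop(1);     /* never reached; must pair lexically with push */
    return NULL;
}

/* Returns 1 if the thread was cancelled and the handler ran, else 0. */
int run_cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;
    int rc;

    cleanup_ran = 0;
    rc = pthread_create(&tid, NULL, worker, &cleanup_ran);
    assert(rc == 0);
    rc = pthread_cancel(tid);
    assert(rc == 0);
    rc = pthread_join(tid, &result);   /* join orders the flag write before the read */
    assert(rc == 0);
    return result == PTHREAD_CANCELED && cleanup_ran == 1;
}
```

Calling run_cancel_demo() from main is expected to return 1 when the handler runs during cancellation. A return of 0 would mean the handler was skipped, which is the kind of failure reported here.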
The problem is that many functions don't have .eh_frame unwind info generated. There are two ways to solve this. One is to build the whole libc with -fasynchronous-unwind-tables (that's what Fedora 7 is doing, and what x86_64 and s390{,x} do by default). The other is to write a patch similar to the one I'll attach. That patch handles only the functions found in the backtrace where this was cancelled; the real patch would need to investigate all callers of cancellable functions and make sure that none of them is __THROW and that all of them are built with either -fexceptions or -fasynchronous-unwind-tables.
The important difference between the two is that with -fexceptions you don't get any unwind info if e.g. all callees are __THROW, while with -fasynchronous-unwind-tables you get it anyway.
FYI, the testcase is buggy: passing the address of an automatic variable as the last pthread_create argument and dereferencing it in the thread body has undefined behavior.
Created attachment 1654
glibc-bz2748.patch
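One common way to avoid the automatic-variable hazard noted above is to give each thread its own slot in storage that outlives the thread. The following is only a sketch under that assumption. The names NTHREADS, double_slot and run_index_demo are hypothetical and do not come from the attached test case.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4  /* hypothetical thread count */

/* Thread body: doubles the value in the slot it was handed. */
static void *double_slot(void *arg)
{
    unsigned int *slot = arg;
    *slot *= 2;
    return NULL;
}

/* Starts NTHREADS threads, each with its own slot, and returns the sum. */
unsigned int run_index_demo(void)
{
    /* Static storage: stays valid for the whole life of every thread. */
    static unsigned int slots[NTHREADS];
    pthread_t tids[NTHREADS];
    unsigned int sum = 0;

    for (unsigned int i = 0; i < NTHREADS; i++) {
        slots[i] = i;
        int rc = pthread_create(&tids[i], NULL, double_slot, &slots[i]);
        assert(rc == 0);
    }
    for (unsigned int i = 0; i < NTHREADS; i++) {
        int rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
        sum += slots[i];
    }
    return sum;
}
```

Each thread touches only its own slot, and the parent reads a slot only after joining that thread, so there is no sharing of a loop variable between the creator and the thread body.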
Alan, can you look at this issue for PPC32/64? Specifically, for missing or incomplete CFI impacting cancel, or for making -fasynchronous-unwind-tables the default for powerpc.
I don't see this problem anymore. Please retest and report.
No response.