Bug 14147

Summary: Async cancellation left active after longjmp out of signal handler
Product: glibc Reporter: Rich Felker <bugdal>
Component: nptl Assignee: Not yet assigned to anyone <unassigned>
Status: NEW ---    
Severity: normal CC: carlos, drepper.fsp
Priority: P2 Flags: fweimer: security-
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Bug Depends on: 12683    
Bug Blocks:    

Description Rich Felker 2012-05-23 19:50:22 UTC
If a signal handler interrupts a function which is async-signal-safe, it's valid to exit the signal handler with longjmp. Suppose the interrupted function is also a cancellation point. Because NPTL implements cancellation points by switching to async cancellation mode, invoking the syscall, and then switching back, the longjmp skips the switch back and the cancellation type is left as asynchronous, contrary to the expectations of a conforming application. Subsequent code that is not async-cancel-safe then runs with async cancellation enabled, possibly causing severe memory corruption when a cancellation request arrives.

This bug is related to bug #12683 (also reported by me), but I'm reporting it separately because it's not a rare race condition: it's deterministic breakage in a specific use case, with no race involved.

Fixing all of these issues requires abandoning the naive approach of wrapping syscalls in switches to/from async cancellation mode. Instead, the cancellation signal handler should inspect the interrupted context (via program counter comparison, either directly or using whatever fancy DWARF stuff is popular) to determine whether the interrupted thread was blocked at a cancellation point, and thus whether to act on cancellation.
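
The program counter comparison proposed above could look roughly like the following. This is only a sketch of the decision logic, not glibc code: struct cp_window and cancel_signal_should_act are hypothetical names, and a real implementation would take the window bounds from assembly labels placed around the syscall instruction and read the program counter from the ucontext passed to the signal handler, both of which are architecture-specific and omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the code region in which a thread counts
   as "blocked at a cancellation point". */
struct cp_window {
    uintptr_t begin; /* first instruction after the pending-cancel check */
    uintptr_t end;   /* address just past the syscall instruction */
};

/* Decide whether the cancellation signal handler should act now.
   pc is the interrupted program counter. Outside the window the request
   simply stays pending until the next cancellation point, so the thread
   never needs to be switched into async cancellation mode. */
static bool cancel_signal_should_act(const struct cp_window *w,
                                     uintptr_t pc, bool cancel_enabled)
{
    if (!cancel_enabled)
        return false; /* cancellation disabled: leave request pending */
    /* Half-open range: once pc has reached end, the syscall has already
       returned and its side effects must not be discarded. */
    return pc >= w->begin && pc < w->end;
}
```

With this shape, a longjmp out of a signal handler leaves no per-thread cancellation state to restore, because the decision depends only on where the thread was executing when the cancellation signal arrived.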
Comment 1 Carlos O'Donell 2014-01-10 20:25:03 UTC
I'm marking this as dependent on 12683 since a solution for 12683 should consider this bug.