Bug 29214

Summary: pthread_setcanceltype fails to set type
Product: glibc Reporter: Robin Gareus <robin>
Component: nptlAssignee: Adhemerval Zanella <adhemerval.zanella>
Status: RESOLVED FIXED    
Severity: normal CC: adhemerval.zanella, drepper.fsp
Priority: P2    
Version: 2.35   
Target Milestone: 2.36   
Host: Target:
Build: Last reconfirmed: 2022-05-31 00:00:00
Attachments: example code to reproduce the bug

Description Robin Gareus 2022-05-31 17:19:20 UTC
Created attachment 14127 [details]
example code to reproduce the bug

With glibc-2.35-69-g28ea43f8d6 (as currently shipped by archlinux and openSuSE) it is no longer possible to change the pthread cancel type to `PTHREAD_CANCEL_ASYNCHRONOUS'.

This leads to applications that rely on async cancellation to never terminate [1].

The problem was introduced by commit ba9c42ac0e265bf1e4ec1075fa20e7166fda8bfc, and reverting that fixes the issue [2].

Attached is a small test-tool to reproduce the bug.

--

[1] https://discourse.ardour.org/t/ardour-hanging-at-shutdown/107260 in this case JACK waits on a futex, just like the test-tool.

[2] The problem seems to be specific to the change in `pthread_setcanceltype', not the overall commit.
Comment 1 Adhemerval Zanella 2022-05-31 18:45:53 UTC
It affects master as well, I will take a look.
Comment 2 Adhemerval Zanella 2022-05-31 18:56:11 UTC
The problem seems not to be on pthread_setcanceltype, but rather on nptl/libc-clenaup.c that implements handlers used by printf.  If you remove the printf on the thread function the thread is cancelled as expected. Could you check if the following patch fixes it?

diff --git a/nptl/libc-cleanup.c b/nptl/libc-cleanup.c
index c4a83591bf..2ce59388d4 100644
--- a/nptl/libc-cleanup.c
+++ b/nptl/libc-cleanup.c
@@ -57,7 +57,8 @@ __libc_cleanup_pop_restore (struct _pthread_cleanup_buffer *buffer)
   THREAD_SETMEM (self, cleanup, buffer->__prev);
 
   int cancelhandling = atomic_load_relaxed (&self->cancelhandling);
-  if (cancelhandling & CANCELTYPE_BITMASK)
+  if (buffer->__canceltype != PTHREAD_CANCEL_DEFERRED
+      && (cancelhandling & CANCELTYPE_BITMASK) == 0)
     {
       int newval;
       do


I will send a patch along with a testcase.
Comment 3 Robin Gareus 2022-05-31 19:12:02 UTC
Yes, that patch fixes the issue. Confirmed with both the test-tool, as well as the "real world" issue with jackd + Ardour!

Thanks for the amazingly fast response and fix!
Comment 4 Adhemerval Zanella 2022-05-31 19:16:30 UTC
(In reply to Robin Gareus from comment #3)
> Yes, that patch fixes the issue. Confirmed with both the test-tool, as well
> as the "real world" issue with jackd + Ardour!
> 
> Thanks for the amazingly fast response and fix!

Well, I was the one that actually messed up the revert ;)
Comment 5 Robin Gareus 2022-06-01 01:41:19 UTC
> The problem seems not to be on pthread_setcanceltype, but rather on nptl/libc-clenaup.c that implements handlers used by printf.

The issue is also present with a direct call, without printf() or checking the return value.

Regardless, the patch fixes this as well.


> Well, I was the one that actually messed up the revert ;)

No shame in that, it happens to the best. 404656009b459658138ed1bd18f3c6cf3863e6a6 is a large complex commit as well.

It's really distros who are to blame for shipping somewhat random git non tagged git revision. Then again arch users asked for it :)

debian/sid, ubuntu use 2.35 tagged release and are not affected.

Keep up the good work!
Comment 6 Adhemerval Zanella 2022-06-08 12:36:28 UTC
Fixed on 2.36.