[PATCH 08/08] nptl: arm: Fix Race conditions in pthread cancellation (BZ#12683)
Adhemerval Zanella
adhemerval.zanella@linaro.org
Mon Sep 7 22:24:00 GMT 2015
On 07-09-2015 14:17, Phil Blundell wrote:
> On Mon, 2015-08-31 at 18:11 -0300, Adhemerval Zanella wrote:
>> This patch adds the ARM modifications required for the BZ#12683 fix.
>> It basically removes the enable_asynccancel/disable_asynccancel function
>> usage on code, provide a arch-specific symbol that contains global
>> markers to be used in SIGCANCEL handler.
>
> I looked at this in a bit more detail. Here are some further comments:
>
>> .fnstart; /* matched by the .fnend in UNDOARGS below. */ \
>> - DOCARGS_##args; /* save syscall args etc. around CENABLE. */ \
>> - CENABLE; \
>> - mov ip, r0; /* put mask in safe place. */ \
>> - UNDOCARGS_##args; /* restore syscall args. */ \
>> - ldr r7, =SYS_ify (syscall_name); \
>> - swi 0x0; /* do the call. */ \
>> - mov r7, r0; /* save syscall return value. */ \
>> - mov r0, ip; /* get mask back. */ \
>> - CDISABLE; \
>> - mov r0, r7; /* retrieve return value. */ \
>> - RESTORE_LR_##args; \
>
> After this change, DOCARGS(), UNDOCARGS() and RESTORE_LR() are all dead
> code. Please delete them.
I noted it on your first reply and I already removed them in my branch.
>
>> - UNDOARGS_##args; \
>> + push {r4, r5, lr}; \
>> + .save {r4, r5, lr}; \
>> + PSEUDO_CANCEL_BEFORE; \
>> + movw r0, SYS_ify (syscall_name); \
>
> This fails when compiling for non-thumb2. Please use "ldr r0, =..."
> instead.
>
I will change it
>> + PSEUDO_CANCEL_AFTER; \
>> + pop {r4, r5, pc}; \
>
Indeed 'pc' seems wrong, I think it should be 'lr' instead. I will change it.
> Popping {pc} here causes an immediate return, which means that the errno
> handling code which follows is never executed. Somewhat embarrassingly
> it seems that there is no existing glibc test which catches this
> failure. I've attached a proof of concept which demonstrates it, but I
> rather wonder whether we should extend the test harness so that
> some/most of the existing glibc tests are run both single-threaded (as
> now) and with an additional dummy thread created at startup in order to
> force this code down the multi-threaded path.
>
> Also note that the two testcases in the attached patch give slightly
> different results and I think they would continue to do so (in a
> different way) if the bug above was fixed. It's not entirely clear to
> me that this part of __syscall_cancel() from your other patch is
> correct:
>
> + /* If cancellation is not enabled, call the syscall directly. */
> + if (pd->cancelhandling & CANCELSTATE_BITMASK)
> + {
> + INTERNAL_SYSCALL_DECL (err);
> + result = INTERNAL_SYSCALL_NCS (nr, err, 6, a1, a2, a3, a4, a5, a6);
> + return INTERNAL_SYSCALL_ERROR_P (result, err) ? -result : result;
> + }
>
> p.
>
I will add your testcases on my patch to fix the NPLT tests for this mechanism
[1]. Also the inline cancellation call is fact wrong: it should only negate
the value depending if the architecture set the syscall return to be negate
in case of a error (powerpc for instance).
With these changes:
diff --git a/nptl/libc-cancellation.c b/nptl/libc-cancellation.c
index 5c7d357..9bb8716 100644
--- a/nptl/libc-cancellation.c
+++ b/nptl/libc-cancellation.c
@@ -50,7 +50,9 @@ __syscall_cancel (__syscall_arg_t nr, __syscall_arg_t a1,
{
INTERNAL_SYSCALL_DECL (err);
result = INTERNAL_SYSCALL_NCS (nr, err, 6, a1, a2, a3, a4, a5, a6);
- return INTERNAL_SYSCALL_ERROR_P (result, err) ? -result : result;
+ if (INTERNAL_SYSCALL_ERROR_P (result, err))
+ return -INTERNAL_SYSCALL_ERRNO (result, err);
+ return result;
}
/* Call the arch-specific entry points that contains the globals markers
diff --git a/sysdeps/unix/sysv/linux/arm/sysdep-cancel.h b/sysdeps/unix/sysv/linux/arm/sysdep-cancel.h
index 9f03bb5..6ed3feb 100644
--- a/sysdeps/unix/sysv/linux/arm/sysdep-cancel.h
+++ b/sysdeps/unix/sysv/linux/arm/sysdep-cancel.h
@@ -54,9 +54,9 @@
push {r4, r5, lr}; \
.save {r4, r5, lr}; \
PSEUDO_CANCEL_BEFORE; \
- movw r0, SYS_ify (syscall_name); \
+ ldr r0, =SYS_ify (syscall_name); \
PSEUDO_CANCEL_AFTER; \
- pop {r4, r5, pc}; \
+ pop {r4, r5, lr}; \
.fnend; \
cmn r0, $4096
I saw no regression on arm with both NPTL current tests and the ones
you have added. I also checked on other ports (x86, i386, powerpc64, and aarch64).
Thanks for the review.
[1] https://sourceware.org/ml/libc-alpha/2015-08/msg01287.html
More information about the Libc-alpha
mailing list