This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v3 18/21] nptl: s390: Fix Race conditions in pthread cancellation (BZ#12683)


On 10/17/19 9:46 PM, Adhemerval Zanella wrote:


On 17/10/2019 12:00, Stefan Liebler wrote:
On 10/17/19 3:53 PM, Adhemerval Zanella wrote:


On 16/10/2019 12:46, Stefan Liebler wrote:
Hi Adhemerval,

I've added some notes below to the s390-64 file, but the same also applies to s390-32. I've also attached a diff.

I've also noticed that a call starting from e.g. write () shuffles the argument registers at each level:
write (ARGS in r2-r4)
-> __syscall_cancel (r2=nr, ARGS in r3-r6 and two stack-slots)
--> __syscall_cancel_arch (r2=*ch, r3=nr, ARGS in r4-r6 and three stack-slots)
---> "syscall-instruction" (ARGS in r2-r7)

Just as a quick idea (I don't know if there are other limitations), those shuffling instructions could perhaps be omitted if the nr / ch arguments of the __syscall_cancel / __syscall_cancel_arch functions were the last arguments instead of the first ones.
I assume that other archs could also benefit from such an ordering.

Thanks, I have applied your changes.  Indeed, for some architectures
syscall_cancel.S might not be the most optimized one: I used the reference
C implementation as a base, and gcc might not generate the best code in
some cases.

The implementation in syscall_cancel.S contains just one of the three parts of the mentioned register move instructions. The others also need to be generated in write() and __syscall_cancel().
My point was that if we could keep the values passed in registers to e.g. write() in the same registers until the syscall is invoked in __syscall_cancel_arch, we wouldn't need to move them from register to register at all:
write (ARGS in r2-r4)
-> __syscall_cancel (ARGS in r2-r6 and first stack-slot, nr in second stack-slot)
--> __syscall_cancel_arch (ARGS in r2-r6 and first stack-slot, nr in second stack-slot, ch in third stack-slot)
---> "syscall-instruction" (ARGS in r2-r7)
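
The two layouts above can be compared with a small counting model. This is only a sketch of the argument, not a measurement of generated code: it assumes all six syscall argument slots are passed at every level, encodes locations with a made-up STACK() macro, and simply counts the arguments whose location changes between levels. All names here are hypothetical.

```c
#include <stddef.h>

/* Location encoding used by this sketch only: 2..7 mean registers r2..r7,
   STACK (n) means the n-th stack slot.  */
#define STACK(n) (100 + (n))
#define NARGS 6

/* Count the syscall arguments whose location differs between two levels.  */
static int
count_moves (const int from[NARGS], const int to[NARGS])
{
  int moves = 0;
  for (size_t i = 0; i < NARGS; i++)
    if (from[i] != to[i])
      moves++;
  return moves;
}

/* Current order: nr (and ch) come first and push the arguments along.  */
static const int cur_cancel[NARGS] = { 3, 4, 5, 6, STACK (0), STACK (1) };
static const int cur_arch[NARGS] = { 4, 5, 6, STACK (0), STACK (1), STACK (2) };

/* Proposed order: nr and ch come last, in the stack slots after the args.  */
static const int new_cancel[NARGS] = { 2, 3, 4, 5, 6, STACK (0) };
static const int new_arch[NARGS] = { 2, 3, 4, 5, 6, STACK (0) };

/* The syscall instruction expects the arguments in r2-r7.  */
static const int at_syscall[NARGS] = { 2, 3, 4, 5, 6, 7 };

/* Moves from __syscall_cancel down to the syscall, current order.  */
static int
current_total (void)
{
  return count_moves (cur_cancel, cur_arch)
	 + count_moves (cur_arch, at_syscall);
}

/* Moves from __syscall_cancel down to the syscall, proposed order.  */
static int
proposed_total (void)
{
  return count_moves (new_cancel, new_arch)
	 + count_moves (new_arch, at_syscall);
}
```

Under these assumptions current_total () is 12 and proposed_total () is 1. The one remaining move is the sixth argument, which still has to be loaded from its stack slot into r7; setting up nr for the syscall instruction is not counted in either case.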

Well I think it might be feasible, so for instance write would call:

ssize_t __libc_write (int fd, const void *buf, size_t nbytes)
\_ __syscall_cancel (fd, buf, nbytes, 0, 0, 0, __NR_write)
    \_ __syscall_cancel_arch (fd, buf, nbytes, 0, 0, 0, __NR_write, &pd->cancellation)
       ...
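
A minimal C sketch of that call chain, with nr and the cancellation pointer as trailing parameters. Everything prefixed demo_ or DEMO_ is a hypothetical stand-in and not the glibc code: the lowest level only records what it received instead of issuing a syscall, and DEMO_NR_WRITE is a placeholder value.

```c
#include <stddef.h>

/* Placeholder syscall number, for illustration only.  */
#define DEMO_NR_WRITE 4L

/* What the lowest level received; stands in for the real syscall.  */
struct demo_record
{
  long args[6];
  long nr;
  volatile int *ch;
};

static struct demo_record demo_last;

/* Stands in for the per-thread cancellation state (pd->cancellation).  */
static volatile int demo_cancelhandling;

/* Lowest level: the six syscall arguments come first, nr and ch last.  */
static long
demo_syscall_cancel_arch (long a1, long a2, long a3, long a4, long a5,
			  long a6, long nr, volatile int *ch)
{
  demo_last = (struct demo_record) { { a1, a2, a3, a4, a5, a6 }, nr, ch };
  return a3;	/* Pretend that all nbytes were written.  */
}

/* Middle level: forwards the arguments unchanged and appends ch.  */
static long
demo_syscall_cancel (long a1, long a2, long a3, long a4, long a5,
		     long a6, long nr)
{
  return demo_syscall_cancel_arch (a1, a2, a3, a4, a5, a6, nr,
				   &demo_cancelhandling);
}

/* Top level: fd, buf and nbytes keep their positions all the way down.  */
static long
demo_write (int fd, const void *buf, size_t nbytes)
{
  return demo_syscall_cancel (fd, (long) buf, (long) nbytes, 0, 0, 0,
			      DEMO_NR_WRITE);
}
```

With this ordering, fd, buf and nbytes occupy the same parameter positions at every level, so a compiler is free to leave them where the caller put them; whether it actually does depends on the ABI and the optimization level.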

I don't have a strong opinion here. Since the cancellation points
are potentially blocking syscalls, I would expect both the latency
of the syscall itself and the potentially unbounded time spent in
the kernel to overshadow any potential micro-optimization in
argument shuffling.
Sure. The syscall spends the most time.
Besides the timing aspect, this also affects the code size in various places. And if somebody reads / debugs the code starting from e.g. write until the syscall, the instruction sequence looks confusing. (Not only on s390)


Also, this patch has the advantage of removing the atomic operation
from __pthread_{enable,disable}_asynccancel, which on some
architectures is far more costly than register shuffling or spilling.

