This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] hppa: Optimize atomic_compare_and_exchange_val_acq


On 09/22/2016 10:14 AM, John David Anglin wrote:
> The attached patch replaces the conditional branch tests in
> atomic_compare_and_exchange_val_acq with conditional instruction
> nullification. This avoids the stalls associated with conditional
> branches and the resulting code is shorter. There are no branches in
> the fast path when the operation is successful.

Does this really make a measurable difference? The light-weight syscall
is probably the most costly part of this entire operation.

If you can show there is a measurable difference, I would be willing
to accept the removal of the deadlock looping (it becomes a SIGILL
and you have to look at the core file).
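
For anyone wanting to produce such numbers, here is a rough sketch of a
micro-benchmark. It is only an illustration under stated assumptions:
cas_val_acq and bench_cas are hypothetical names, and the compiler's
__atomic_compare_exchange_n builtin stands in for the glibc macro
atomic_compare_and_exchange_val_acq (same contract: store newval if
*mem equals oldval, return the previous value). What the builtin lowers
to depends on the target, so to compare the two hppa implementations
the loop body would have to call the real macro from atomic-machine.h
before and after the patch.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for atomic_compare_and_exchange_val_acq:
   store newval in *mem if *mem == oldval, and return the value *mem
   held before the call.  Built on the GCC/Clang __atomic builtin,
   not on the glibc macro itself.  */
static int
cas_val_acq (int *mem, int newval, int oldval)
{
  int expected = oldval;
  /* On failure the builtin writes the current *mem into expected;
     on success expected still equals the old value.  */
  __atomic_compare_exchange_n (mem, &expected, newval, 0,
                               __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  return expected;
}

/* Time `iterations` uncontended, always-successful exchanges and
   return the average cost in nanoseconds.  The final counter value
   is stored in *final_value so the caller can check that every
   exchange succeeded.  `iterations` must fit in an int.  */
static double
bench_cas (long iterations, int *final_value)
{
  struct timespec start, end;
  int counter = 0;

  assert (iterations > 0);

  clock_gettime (CLOCK_MONOTONIC, &start);
  for (long i = 0; i < iterations; i++)
    cas_val_acq (&counter, (int) i + 1, (int) i);
  clock_gettime (CLOCK_MONOTONIC, &end);

  *final_value = counter;
  double ns = (end.tv_sec - start.tv_sec) * 1e9
              + (end.tv_nsec - start.tv_nsec);
  return ns / iterations;
}
```

Calling bench_cas with a few million iterations from main and printing
the result gives a per-operation figure for the uncontended success
path only. The contended and failure paths would need separate threads
and are not covered by this sketch.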

> The change was intended as an optimization but tst-stack4 now passes.

This is a red flag for this patch.

Any idea what changed?
 
The tst-stack4 test creates a bunch of threads that all create their
own stacks and release them (placing them on the free stack list); the
stacks then get reused by new threads, all during dlopen/dlsym operations
which are growing the DTV. We try to catch a case where the DTV size is too
small and we overflow. I don't see how that could be related to what you
have here?

> 2016-09-22  John David Anglin  
> 
> 	* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Don't include
> 	abort-instr.h.
> 	(EFAULT): Remove conditional define.
> 	(ENOSYS): Likewise.
> 	(atomic_compare_and_exchange_val_acq): Use instruction nullification
> 	instead of conditional branch instructions.


-- 
Cheers,
Carlos.

