[PATCH v13 4/6] nptl: Introduce RSEQ_GETMEM_VOLATILE and RSEQ_SETMEM
Mathieu Desnoyers
mathieu.desnoyers@efficios.com
Wed Nov 20 14:12:23 GMT 2024
On 2024-11-19 18:45, Florian Weimer wrote:
> * Michael Jeanson:
>
>> Currently the only 64bit field is rseq_cs and we don't use it since there
>> are no rseq critical sections in glibc yet. For this patch series we could
>> simply _Static_assert when size == 8 and then add a proper implementation
>> when we do start using rseq_cs.
>
> The upper half of rseq_cs will be stuck at 0 on 32-bit platforms, right?
Correct.
> So there is no actual need for 64-bit atomics, it's just an oddity
> resulting from the programming interface.
Yes. In librseq, I work-around this oddity by defining the rseq_cs field as
follows to accommodate both 32-bit and 64-bit environments:
/*
* Restartable sequences rseq_cs field.
*
* Contains NULL when no critical section is active for the current
* thread, or holds a pointer to the currently active struct rseq_abi_cs.
*
* Updated by user-space, which sets the address of the currently
* active rseq_cs at the beginning of assembly instruction sequence
* block, and set to NULL by the kernel when it restarts an assembly
* instruction sequence block, as well as when the kernel detects that
* it is preempting or delivering a signal outside of the range
* targeted by the rseq_cs. Also needs to be set to NULL by user-space
* before reclaiming memory that contains the targeted struct rseq_abi_cs.
*
* Read and set by the kernel. Set by user-space with single-copy
* atomicity semantics. This field should only be updated by the
* thread which registered this data structure. Aligned on 64-bit.
*
* 32-bit architectures should update the low order bits of the
* rseq_cs field, leaving the high order bits initialized to 0.
*/
union {
__u64 ptr64;
/*
* The "arch" field provides architecture accessor for
* the ptr field based on architecture pointer size and
* endianness.
*/
struct {
#ifdef __LP64__
__u64 ptr;
#elif defined(__BYTE_ORDER) ? (__BYTE_ORDER == __BIG_ENDIAN) : defined(__BIG_ENDIAN)
__u32 padding; /* Initialized to zero. */
__u32 ptr;
#else
__u32 ptr;
__u32 padding; /* Initialized to zero. */
#endif
} arch;
} rseq_cs;
So we can use rseq_cs.ptr64 on both 32-bit and 64-bit architectures to initialize
to 0 before registration (no need for single-copy semantic).
We can use rseq_cs.arch.ptr upon entering rseq critical sections to set the
pointer to the rseq_cs descriptor. It performs a 64-bit store on 64-bit
architectures, and a 32-bit store to the LSB on 32-bit architectures.
So going back to the proposed patch here, my main concern is to falsely
claim that RSEQ_{GET,SET}MEM_VOLATILE provide single-copy atomicity
guarantees for 64-bit values on 32-bit architectures, which can lead to
subtle bugs in the future.
One approach we can take is to allow the following value size for each
macro:
32-bit:
RSEQ_{GET,SET}MEM(): accepted size (bytes): 1, 4, 8
RSEQ_{GET,SET}MEM_VOLATILE(): accepted size (bytes): 1, 4
64-bit:
RSEQ_{GET,SET}MEM(): accepted size (bytes): 1, 4, 8
RSEQ_{GET,SET}MEM_VOLATILE(): accepted size (bytes): 1, 4, 8
Both volatile and non-volatile APIs could use a common asm volatile
implementation under the hood: the only thing we really want to
specialize between volatile and non-volatile is the compile-time size
check.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
More information about the Libc-alpha
mailing list