[PATCH v13 4/6] nptl: Introduce RSEQ_GETMEM_VOLATILE and RSEQ_SETMEM

Mathieu Desnoyers mathieu.desnoyers@efficios.com
Wed Nov 20 14:12:23 GMT 2024


On 2024-11-19 18:45, Florian Weimer wrote:
> * Michael Jeanson:
> 
>> Currently the only 64bit field is rseq_cs and we don't use it since there
>> are no rseq critical sections in glibc yet. For this patch series we could
>> simply _Static_assert when size == 8 and then add a proper implementation
>> when we do start using rseq_cs.
> 
> The upper half of rseq_cs will be stuck at 0 on 32-bit platforms, right?

Correct.

> So there is no actual need for 64-bit atomics, it's just an oddity
> resulting from the programming interface.

Yes. In librseq, I work-around this oddity by defining the rseq_cs field as
follows to accommodate both 32-bit and 64-bit environments:

         /*
          * Restartable sequences rseq_cs field.
          *
          * Contains NULL when no critical section is active for the current
          * thread, or holds a pointer to the currently active struct rseq_abi_cs.
          *
          * Updated by user-space, which sets the address of the currently
          * active rseq_cs at the beginning of assembly instruction sequence
          * block, and set to NULL by the kernel when it restarts an assembly
          * instruction sequence block, as well as when the kernel detects that
          * it is preempting or delivering a signal outside of the range
          * targeted by the rseq_cs. Also needs to be set to NULL by user-space
          * before reclaiming memory that contains the targeted struct rseq_abi_cs.
          *
          * Read and set by the kernel. Set by user-space with single-copy
          * atomicity semantics. This field should only be updated by the
          * thread which registered this data structure. Aligned on 64-bit.
          *
          * 32-bit architectures should update the low order bits of the
          * rseq_cs field, leaving the high order bits initialized to 0.
          */
         union {
                 __u64 ptr64;

                 /*
                  * The "arch" field provides architecture accessor for
                  * the ptr field based on architecture pointer size and
                  * endianness.
                  */
                 struct {
#ifdef __LP64__
                         __u64 ptr;
#elif defined(__BYTE_ORDER) ? (__BYTE_ORDER == __BIG_ENDIAN) : defined(__BIG_ENDIAN)
                         __u32 padding;          /* Initialized to zero. */
                         __u32 ptr;
#else
                         __u32 ptr;
                         __u32 padding;          /* Initialized to zero. */
#endif
                 } arch;
         } rseq_cs;

So we can use rseq_cs.ptr64 on both 32-bit and 64-bit architectures to initialize
to 0 before registration (no need for single-copy semantic).

We can use rseq_cs.arch.ptr upon entering rseq critical sections to set the
pointer to the rseq_cs descriptor. It performs a 64-bit store on 64-bit
architectures, and a 32-bit store to the LSB on 32-bit architectures.

So going back to the proposed patch here, my main concern is to falsely
claim that RSEQ_{GET,SET}MEM_VOLATILE provide single-copy atomicity
guarantees for 64-bit values on 32-bit architectures, which can lead to
subtle bugs in the future.

One approach we can take is to allow the following value size for each
macro:

32-bit:

   RSEQ_{GET,SET}MEM():          accepted size (bytes): 1, 4, 8
   RSEQ_{GET,SET}MEM_VOLATILE(): accepted size (bytes): 1, 4

64-bit:

   RSEQ_{GET,SET}MEM():          accepted size (bytes): 1, 4, 8
   RSEQ_{GET,SET}MEM_VOLATILE(): accepted size (bytes): 1, 4, 8

Both volatile and non-volatile APIs could use a common asm volatile
implementation under the hood: the only thing we really want to
specialize between volatile and non-volatile is the compile-time size
check.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com



More information about the Libc-alpha mailing list