Bringing rseq back into glibc

Thu Nov 18 17:52:18 GMT 2021

----- On Nov 18, 2021, at 11:54 AM, Florian Weimer fweimer@redhat.com wrote:

> * Mathieu Desnoyers:
> 
>> ----- On Nov 18, 2021, at 5:17 AM, Florian Weimer fweimer@redhat.com wrote:

[...]

> 
>>> 3. Implement sched_getcpu on top of rseq.
>>> 
>>> 4. Add public symbols __rseq_abi_offset, __rseq_abi_size (currently 32
>>>   or 0), __rseq_abi_flags (currently 0).  __rseq_abi_offset is the
>>>   offset to add to the thread pointer (see __builtin_thread_pointer) to
>>>   get to the rseq area.  They will be public ABI symbols.  These
>>>   variables are initialized before user code runs, and changing them
>>>   results in undefined behavior.
>>
>> Works for me. So if the Linux kernel eventually implements something along
>> the lines of an extensible kTLS, we could use that underneath.
>>
>> Small bike-shedding comment: I wonder if we want those public glibc
>> symbols to be called "__rseq_abi_{offset,size,flags}", or if a name like
>> "__ktls_{offset,size,flags}" might be more appropriate and future-proof
>> from a glibc ABI standpoint ?
> 
> No, if the kTLS stuff arrives, it might have different sizes and
> offsets, and the rseq area is just a slice of that.  So the numbers
> could be different.  We could do things as you propose if rseq is
> guaranteed to be at the start of the kernel area, always, but do we know
> that yet?

You're right, we don't. So let's stick with __rseq_abi_.
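
For illustration, here is a minimal sketch of how a consumer might use the
proposed offset and size. Everything prefixed with demo_ is hypothetical: the
variables stand in for the proposed __rseq_abi_offset and __rseq_abi_size, the
struct is a local mirror of the 32-byte rseq area, and the thread pointer is
passed in as a plain argument (a real user would pass the result of
__builtin_thread_pointer). Treating a size of 0 as "no rseq area available" is
an assumption on my part, not something settled in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local mirror of the 32-byte rseq area. */
struct demo_rseq_area {
  uint32_t cpu_id_start;
  uint32_t cpu_id;      /* negative (as int32_t) when not usable */
  uint64_t rseq_cs;
  uint32_t flags;
} __attribute__((aligned(32)));

static_assert(sizeof(struct demo_rseq_area) == 32, "rseq area is 32 bytes");

/* Stand-ins for the proposed __rseq_abi_offset and __rseq_abi_size.
   In the proposal these would be set by glibc before user code runs. */
static ptrdiff_t demo_rseq_abi_offset;
static unsigned int demo_rseq_abi_size;

/* Fake thread block, used only to exercise the pointer arithmetic. */
struct demo_thread_block {
  char other_tls[64];
  struct demo_rseq_area rseq;
};

/* Thread pointer plus offset gives the rseq area.
   Assumption: size 0 means there is no usable area. */
static struct demo_rseq_area *demo_rseq_area_from_tp(void *tp)
{
  if (demo_rseq_abi_size == 0)
    return NULL;
  return (struct demo_rseq_area *) ((char *) tp + demo_rseq_abi_offset);
}

/* sched_getcpu-style read: returns the CPU number from the rseq area,
   or -1 so that the caller can take its non-rseq fallback path. */
static int demo_getcpu_from_area(void *tp)
{
  struct demo_rseq_area *area = demo_rseq_area_from_tp(tp);
  if (area == NULL)
    return -1;
  int32_t cpu = (int32_t) *(volatile uint32_t *) &area->cpu_id;
  return cpu >= 0 ? cpu : -1;
}
```

With the offset set to offsetof (struct demo_thread_block, rseq) and the size
set to 32, demo_getcpu_from_area returns whatever cpu_id the fake block holds;
with the size left at 0 it returns -1.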

> 
> Also, kTLS will likely be called something else to avoid confusion with
> Kernel Transport Layer Security.  That's another reason to stick with
> __rseq_.

Yep.

> 
>>> Steps 1 to 3 are backportable to previous glibc versions, especially to
>>> 2.34 with its integrated libpthread.
>>
>> So if we have an application or library already using rseq directly through
>> the system call, upgrading glibc may cause it to fail. Arguably, no new
>> symbols are exposed, so I guess it's OK with the backport guidelines.
>> My question here is: is it OK for a backported patch to break an
>> application which uses the Linux kernel system calls directly ?
> 
> It depends. 8-)
> 
> I think we can get away with it because shipping software for deployment
> on other people's system must have a fallback path for non-rseq mode
> outside of specialized environments.  For the (hopefully) rare
> exceptions, we'll provide the tunable setting.

Fair enough.
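
As a rough sketch of the kind of fallback path discussed above: an application
that registers rseq itself through the system call can treat any registration
failure as "use the non-rseq path". All app_ names below are hypothetical, the
struct is a local mirror of the 32-byte rseq area rather than the kernel
header, and the signature value is only an example. I would expect the
registration to fail once glibc has already registered rseq for the thread, at
which point the code simply falls back to sched_getcpu.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical local mirror of the 32-byte rseq area. */
struct app_rseq {
  uint32_t cpu_id_start;
  uint32_t cpu_id;
  uint64_t rseq_cs;
  uint32_t flags;
} __attribute__((aligned(32)));

static_assert(sizeof(struct app_rseq) == 32, "rseq area is 32 bytes");

/* Example signature value; an application picks its own. */
#define APP_RSEQ_SIG 0x53053053u

/* Per-thread area; it must stay valid for the lifetime of the thread. */
static __thread struct app_rseq app_rseq_area = { .cpu_id = (uint32_t) -1 };

/* 0 = not tried yet, 1 = registered by us, -1 = use the fallback path. */
static __thread int app_rseq_state;

static int app_rseq_try_register(void)
{
#ifdef SYS_rseq
  if (syscall(SYS_rseq, &app_rseq_area, sizeof app_rseq_area, 0,
              APP_RSEQ_SIG) == 0)
    return 1;
#endif
  /* Any failure (no kernel support, already registered by someone else,
     blocked by a sandbox, ...) selects the non-rseq mode. */
  return -1;
}

/* Returns the current CPU number, via rseq when our own registration
   succeeded and via sched_getcpu otherwise. */
int app_current_cpu(void)
{
  if (app_rseq_state == 0)
    app_rseq_state = app_rseq_try_register();
  if (app_rseq_state == 1)
    return (int) __atomic_load_n(&app_rseq_area.cpu_id, __ATOMIC_RELAXED);
  return sched_getcpu();
}
```

The point of the design is that the caller of app_current_cpu never needs to
know which mode is active, so a glibc upgrade that takes over the registration
only changes which branch is taken.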

> 
> We must have done it before with similar system calls (set_tid_address,
> set_robust_list).  But system call design tends to avoid creating new
> examples.  rseq is similar to set_tid_address and set_robust_list in
> that it more or less has to be this way, with the single-user property.
> (Supporting multiple users is undesirable from a performance/complexity
> perspective.)

Right.

> 
>>> Comments?  As I said, I'd like to bring these changes into glibc 2.35,
>>> hopefully in early December.
>>
>> I won't have time to do the implementation effort myself this time due to
>> other commitments, but I will try to free up some time for review. Feel
>> free to grab whatever code you feel is useful from my earlier rseq
>> integration patches (if any).
> 
> I plan to reuse the architecture-specific marker constants from your
> version at least.  That's already going to save a lot of work.  Thanks.

You're welcome. Let me know if I can be of further assistance.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com