This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 1/5] glibc: Perform rseq(2) registration at C startup and thread creation (v8)


----- On Apr 17, 2019, at 3:56 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Apr 17, 2019, at 12:17 PM, Joseph Myers joseph@codesourcery.com wrote:
> 
>> On Wed, 17 Apr 2019, Mathieu Desnoyers wrote:
>> 
>>> > +/* RSEQ_SIG is a signature required before each abort handler code.
>>> > +
>>> > +   It is a 32-bit value that maps to actual architecture code compiled
>>> > +   into applications and libraries. It needs to be defined for each
>>> > +   architecture. When choosing this value, it needs to be taken into
>>> > +   account that generating invalid instructions may have ill effects on
>>> > +   tools like objdump, and may also have impact on the CPU speculative
>>> > +   execution efficiency in some cases.  */
>>> > +
>>> > +#define RSEQ_SIG 0xd428bc00	/* BRK #0x45E0.  */
>>> 
>>> After further investigation, we should probably do the following
>>> to handle compiling with -mbig-endian on aarch64, which generates
>>> binaries with mixed code vs data endianness (little endian code,
>>> big endian data):
>> 
>> First, the comment on RSEQ_SIG should specify whether it is to be
>> interpreted in the code or the data endianness.
> 
> Right. The signature passed as argument to the rseq registration
> system call needs to be in data endianness (currently exposed kernel
> ABI).
> 
> Ideally for userspace, we want to define a signature in code endianness
> that happens to nicely match specific code patterns.
> 
>> 
>>> For ARM32, the situation is a bit more complex. Only armv6+
>>> generates mixed-endianness code vs data with -mbig-endian.
>>> Prior to armv6, the code and data endianness matches. Therefore,
>>> I plan to #ifdef the reversed endianness handling with:
>>> 
>>> #if __ARM_ARCH >= 6 && __ARM_BIG_ENDIAN
>>> 
>>> on arm32.
>> 
>> That doesn't work well because BE code (.o files) can be built for v5te
>> (for example) and used on a range of different architecture variants with
>> both BE32 and BE8 - the choice between BE32 and BE8 is a link-time choice,
>> not a compile-time choice.  So if the value for Arm is a compile-time
>> constant, it should also work for both BE32 and BE8.
> 
> Good to know! Then we need to be even more careful.
> 
>> 
>> In turn, that suggests to me that RSEQ_SIG should be defined to be a value
>> that is always in the code endianness (and whatever corresponding kernel
>> code handles RSEQ_SIG values should act accordingly on architectures where
>> the two endiannesses can differ).  If the kernel ABI is already fixed in a
>> way that prevents such a definition of RSEQ_SIG semantics as using code
>> endianness, a value should be chosen for Arm that works for both
>> endiannesses.
> 
> It might be tricky to pick a trap instruction that is a palindrome
> endianness-wise.
> 
>> 
>> (Also, installed glibc headers are supposed to work with older compilers,
>> and support for __ARM_ARCH was only added in GCC 4.8.  Before that you
>> need to test lots of separate macros for different architecture variants
>> to determine a version number.)
> 
> Good point!
> 
> Here is an alternative to the palindrome approach. I'm taking arm32
> as an example:
> 
> * We define RSEQ_SIG_CODE in code endianness, meant to be used with
>  .inst in rseq assembly:
> 
> #define RSEQ_SIG_CODE 0xe7f5def3
> 
> * We define RSEQ_SIG_DATA in data endianness:
> 
> #define RSEQ_SIG_DATA \
>        ({ \
>                int sig; \
>                asm volatile (  "b 2f\n\t" \
>                                ".arm\n\t" \
>                                "1: .inst 0xe7f5def3\n\t" \
>                                "2:\n\t" \
>                                "ldr %[sig], 1b\n\t" \
>                                : [sig] "=r" (sig)); \
>                sig; \
>        })
> 
> Technically, only glibc and early-adopter libraries wishing to
> register rseq need to use RSEQ_SIG_DATA. The RSEQ_SIG_CODE needs
> to be used from inline assembly to create the signatures before
> each abort handler.

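The approach above should work for the arm32 be8 vs be32 case: since
that choice is made at link time, RSEQ_SIG_DATA reads the signature word
back from the instruction stream at run time instead of hard-coding it.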
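
To illustrate why the run-time load gives the right answer in both cases,
here is a small host-side sketch (not part of the patch). It assumes the
usual layouts: BE32 keeps instruction words big-endian in memory, BE8 keeps
them little-endian, and data loads are big-endian in both. The names
code_layout, store_insn, load_be32 and sig_seen_as_data are made up for
this example.

```c
#include <assert.h>
#include <stdint.h>

/* How the linker leaves instruction words in memory (hypothetical names). */
enum code_layout {
    CODE_BE32, /* BE32: instructions stored big-endian, like data */
    CODE_BE8   /* BE8: instructions stored little-endian */
};

/* Store a 32-bit instruction word the way the given layout would. */
void store_insn(uint8_t mem[4], uint32_t insn, enum code_layout layout)
{
    assert(layout == CODE_BE32 || layout == CODE_BE8);
    for (int i = 0; i < 4; i++) {
        /* BE8 puts the least significant byte first, BE32 the most.  */
        int shift = (layout == CODE_BE8) ? 8 * i : 8 * (3 - i);
        mem[i] = (uint8_t) (insn >> shift);
    }
}

/* Model a big-endian 32-bit data load from memory.  */
uint32_t load_be32(const uint8_t mem[4])
{
    return ((uint32_t) mem[0] << 24) | ((uint32_t) mem[1] << 16)
         | ((uint32_t) mem[2] << 8) | (uint32_t) mem[3];
}

/* Value a big-endian data load would return for the signature word.  */
uint32_t sig_seen_as_data(uint32_t insn, enum code_layout layout)
{
    uint8_t mem[4];
    store_insn(mem, insn, layout);
    return load_be32(mem);
}
```

Under those assumptions, sig_seen_as_data(0xe7f5def3, CODE_BE32) gives
back 0xe7f5def3 and sig_seen_as_data(0xe7f5def3, CODE_BE8) gives
0xf3def5e7, so a single compile-time constant cannot cover both.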

For aarch64, I think we can simply do:

/*
 * aarch64 -mbig-endian generates mixed endianness code vs data:
 * little-endian code and big-endian data. Ensure the RSEQ_SIG signature
 * matches code endianness.
 */
#define RSEQ_SIG_CODE   0xd428bc00      /* BRK #0x45E0.  */

#ifdef __ARM_BIG_ENDIAN
#define RSEQ_SIG_DATA   0x00bc28d4      /* RSEQ_SIG_CODE, byte-swapped.  */
#else
#define RSEQ_SIG_DATA   RSEQ_SIG_CODE
#endif

#define RSEQ_SIG        RSEQ_SIG_DATA
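
As a sanity check on the two constants above, here is a minimal host-side
sketch (again, not part of the patch). It assumes the A64 BRK encoding is
0xd4200000 with the 16-bit immediate placed at bits [20:5]. The helper
names a64_brk, swap32 and rseq_sig_data are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Encode A64 "BRK #imm16": base opcode with imm16 at bits [20:5].  */
uint32_t a64_brk(uint16_t imm16)
{
    return UINT32_C(0xd4200000) | ((uint32_t) imm16 << 5);
}

/* Reverse the byte order of a 32-bit word.  */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & UINT32_C(0x0000ff00))
         | ((v << 8) & UINT32_C(0x00ff0000)) | (v << 24);
}

/* Data-endian view of a code-endian (little-endian) signature word.  */
uint32_t rseq_sig_data(uint32_t sig_code, int big_endian_data)
{
    return big_endian_data ? swap32(sig_code) : sig_code;
}
```

With these helpers, a64_brk(0x45e0) evaluates to 0xd428bc00, and
rseq_sig_data(0xd428bc00, 1) evaluates to 0x00bc28d4, which matches the
two values in the #ifdef above.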

Feedback is most welcome,

Thanks!

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

