RSEQ symbols: __rseq_size, __rseq_flags vs __rseq_feature_size
Mathieu Desnoyers
mathieu.desnoyers@efficios.com
Fri Sep 16 14:36:46 GMT 2022
Hi Florian,
I wanted to clarify by email what we each have in mind with respect
to exposing the RSEQ feature set available to the outside world
through libc symbols.
I have three possible approaches in mind, illustrated by the three
example functions below:
#include <stdint.h>
#undef likely
#define likely(x) __builtin_expect(!!(x), 1)
#undef __aligned
#define __aligned(x) __attribute__((__aligned__(x)))
#undef offsetof
#define offsetof(TYPE, MEMBER) __builtin_offsetof(TYPE, MEMBER)
#undef sizeof_field
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
#undef offsetofend
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
#define __RSEQ_FLAG_FEATURE_EXTENDED 0x2
#define __RSEQ_FLAG_FEATURE_VM_VCPU_ID 0x4
typedef uint32_t __u32;
typedef uint64_t __u64;
/* Original: size=32 bytes */
struct rseq_orig {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t padding[3];
} __aligned(32);
/* Extended */
struct rseq_ext {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	/* New */
	uint32_t node_id;
	uint32_t vm_vcpu_id;
	uint32_t padding[1];
} __aligned(32);
unsigned int __rseq_flags;
unsigned int __rseq_size;
unsigned int __rseq_feature_size;
/* A) Check extended feature flag and size. One mask and two comparisons. */
void fA(void)
{
	if (likely((__rseq_flags & __RSEQ_FLAG_FEATURE_EXTENDED)
			&& __rseq_size >= offsetofend(struct rseq_ext, vm_vcpu_id))) {
		/* Use rseq with vcpu_id. */
		asm volatile ("ud2\n\t");
	} else {
		/* Fallback. */
		asm volatile ("int3\n\t");
	}
}
/*
 * B) Check rseq feature size. Number of features limited only by the
 * range of uint32_t. One comparison.
 */
void fB(void)
{
	if (likely(__rseq_feature_size >= offsetofend(struct rseq_ext, vm_vcpu_id))) {
		/* Use rseq with vcpu_id. */
		asm volatile ("ud2\n\t");
	} else {
		/* Fallback. */
		asm volatile ("int3\n\t");
	}
}
/*
* C) Check only rseq flags. 32 features at most. One mask and one
* comparison.
*/
void fC(void)
{
	if (likely(__rseq_flags & __RSEQ_FLAG_FEATURE_VM_VCPU_ID)) {
		/* Use rseq with vcpu_id. */
		asm volatile ("ud2\n\t");
	} else {
		/* Fallback. */
		asm volatile ("int3\n\t");
	}
}
Here is the resulting objdump:
rseq-flags.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <fA>:
0: f6 05 00 00 00 00 02 testb $0x2,0x0(%rip) # 7 <fA+0x7>
7: 74 0f je 18 <fA+0x18>
9: 83 3d 00 00 00 00 1b cmpl $0x1b,0x0(%rip) # 10 <fA+0x10>
10: 76 06 jbe 18 <fA+0x18>
12: 0f 0b ud2
14: c3 retq
15: 0f 1f 00 nopl (%rax)
18: cc int3
19: c3 retq
1a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
0000000000000020 <fB>:
20: 83 3d 00 00 00 00 1b cmpl $0x1b,0x0(%rip) # 27 <fB+0x7>
27: 76 07 jbe 30 <fB+0x10>
29: 0f 0b ud2
2b: c3 retq
2c: 0f 1f 40 00 nopl 0x0(%rax)
30: cc int3
31: c3 retq
32: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
39: 00 00 00 00
3d: 0f 1f 00 nopl (%rax)
0000000000000040 <fC>:
40: f6 05 00 00 00 00 04 testb $0x4,0x0(%rip) # 47 <fC+0x7>
47: 74 07 je 50 <fC+0x10>
49: 0f 0b ud2
4b: c3 retq
4c: 0f 1f 40 00 nopl 0x0(%rax)
50: cc int3
51: c3 retq
I can think of 4 approaches that applications will use to detect
availability of their specific rseq feature for each rseq critical
section:
1) Dynamically check whether the feature is implemented at runtime
with conditional branches. Those using this approach will probably
not want the overhead of the two comparisons in approach (A)
above. Applications and libraries should probably keep their own copy
of the glibc symbol values for speed.
2) Implement the entire function as IFUNC and select whether a rseq or
non-rseq implementation should be used at C startup. The tradeoff
here is code size vs speed, and using IFUNC for things like malloc
may add additional constraints on the startup order.
3) Code rewrite (dynamic code patching) between rseq and non-rseq code.
This may be frowned upon in the security area and may not always be
possible depending on the context.
4) JIT compilation of specialized rseq vs non-rseq code. Not generally
available in C.
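As a rough illustration of approach (1) combined with check (B), a
library could cache the feature size in a variable of its own and test
it with a single comparison. This is only a sketch: the names
my_rseq_feature_size, my_rseq_init() and my_rseq_has_vm_vcpu_id() are
hypothetical, and how the value is obtained from libc is left out,
since __rseq_feature_size is only a proposal at this point.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Same layout as struct rseq_ext earlier in this email. */
struct rseq_ext {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;
	uint32_t vm_vcpu_id;
	uint32_t padding[1];
} __attribute__((__aligned__(32)));

/* Matches the "cmpl $0x1b ... jbe" pair in the objdump (28 == 0x1b + 1). */
static_assert(offsetofend(struct rseq_ext, vm_vcpu_id) == 28,
	      "vm_vcpu_id is expected to end at byte 28");

/* Hypothetical library-local copy of the feature size. */
static unsigned int my_rseq_feature_size;

/*
 * Hypothetical init hook: called once with whatever value libc
 * exposes (assumption: an __rseq_feature_size-like symbol).
 */
void my_rseq_init(unsigned int libc_feature_size)
{
	my_rseq_feature_size = libc_feature_size;
}

/* Fast path check: a single comparison against the local copy. */
int my_rseq_has_vm_vcpu_id(void)
{
	return my_rseq_feature_size >=
		offsetofend(struct rseq_ext, vm_vcpu_id);
}
```

A caller would run my_rseq_init() once at startup and then branch on
my_rseq_has_vm_vcpu_id() in front of each rseq critical section.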
I suspect that glibc may rely on approaches (1) and (2) depending on
the situation, and many applications may use approach (1) for
simplicity.
Ideally I would like to keep approach (1) fast, so I'd prefer to
keep the check to one single conditional branch. This eliminates
approach (A) and leaves approaches (B) and (C). Approach (B) has
the advantage of not limiting us to 32 features, but its downside
is that we need to introduce a new __rseq_feature_size symbol to
the libc ABI. Approach (C) has the advantage of using __rseq_flags
which is already exposed, but limits us to 32 features.
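To make the trade-off between (B) and (C) concrete, here is a small
sketch of both checks written as plain functions over values passed
in by the caller. The helper names feature_by_flag() and
feature_by_size() are hypothetical; only the flag value is taken from
the example above.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

#define __RSEQ_FLAG_FEATURE_VM_VCPU_ID 0x4

/* Each feature flag in approach (C) occupies exactly one bit. */
static_assert((__RSEQ_FLAG_FEATURE_VM_VCPU_ID &
	       (__RSEQ_FLAG_FEATURE_VM_VCPU_ID - 1)) == 0,
	      "feature flags are single bits");

/*
 * Approach (C): one bit per feature in a 32-bit flags word, so at
 * most 32 features can ever be described.
 */
int feature_by_flag(uint32_t flags, uint32_t feature_bit)
{
	return (flags & feature_bit) != 0;
}

/*
 * Approach (B): a feature is usable when the end offset of its field
 * lies within the feature size. The count of features is bounded by
 * the range of the size value, not by a number of bits.
 */
int feature_by_size(uint32_t feature_size, uint32_t field_end)
{
	return feature_size >= field_end;
}
```

With the struct rseq_ext layout above, vm_vcpu_id ends at byte 28, so
feature_by_size(size, 28) is the check from fB() and
feature_by_flag(flags, __RSEQ_FLAG_FEATURE_VM_VCPU_ID) is the check
from fC().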
Did you have in mind an approach like (A), (B) or (C) for exposing
the rseq feature set, or something else entirely?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com