Bug 28366 - Calling name() on a locale object (std::locale) with LD_AUDIT library loaded results in SIGSEGV on aarch64 platforms
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link
Version: 2.31
Importance: P2 normal
Target Milestone: 2.35
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 26643
Blocks:
Reported: 2021-09-21 21:51 UTC by Nathan Nye
Modified: 2022-03-25 16:40 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed: 2021-09-22 00:00:00


Description Nathan Nye 2021-09-21 21:51:30 UTC
A bug is preventing CLI utilities such as apt from being profiled via LD_AUDIT:

$ cat auditmin.c # Minimal LD_AUDIT library
unsigned int la_version(unsigned int version) { return version; }
$ gcc -shared -fPIC auditmin.c -o auditmin.so

$ cat crasher.cpp
#include<locale>
int main { std::locale("").name(); }
$ g++ crasher.cpp -o crasher

Without LD_AUDIT loaded (works as intended):
$ ./crasher
$

With LD_AUDIT loaded:
$ LD_AUDIT=$PWD/auditmin.so ./crasher
Segmentation fault (core dumped)
$

gdb reports that the crash occurs in std::locale::name, at line 133 of locale.cc.
Comment 1 Florian Weimer 2021-09-22 10:12:40 UTC
Which glibc/GCC version/distribution are you testing? I cannot reproduce this.
Comment 2 Nathan Nye 2021-09-22 20:10:56 UTC
(In reply to Florian Weimer from comment #1)
> Which glibc/GCC version/distribution are you testing? I cannot reproduce
> this.

Interesting! I also couldn't reproduce it on x86_64, so I guess that is what you are using. I'm using Ubuntu 20.04.3 on aarch64 (virtualized), which is where the bug is occurring. So this could be an ARM platform issue? My apologies for the typo in my original submission; I meant to write int main().

gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
libc6 2.31-0ubuntu9.2
Comment 4 Adhemerval Zanella 2021-10-06 18:29:43 UTC
(In reply to Nathan Nye from comment #2)
> (In reply to Florian Weimer from comment #1)
> > Which glibc/GCC version/distribution are you testing? I cannot reproduce
> > this.
> 
> Interesting! I also couldn't reproduce it on x86_64, I guessed you may be
> using that. I'm using Ubuntu 20.04.3 on aarch64 (virtualized) which is where
> the bug is occurring. So this could be an ARM platform issue? My apologies
> for the typo in my original submission, I meant to say int main().
> 
> gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> libc6 2.31-0ubuntu9.2

This seems to be the issue Ben Woodard has noted, although I can't make it fail consistently. When it does fail on aarch64, it really seems to be due to the 'x8' usage:

Program received signal SIGSEGV, Segmentation fault.
std::locale::name[abi:cxx11]() const (this=0xffffffffec60) at /home/adhemerval.zanella/toolchain/src/gcc/libstdc++-v3/src/c++98/locale.cc:133
133             __ret += _S_categories[0];
(gdb) bt
#0  std::locale::name[abi:cxx11]() const (this=0xffffffffec60) at /home/adhemerval.zanella/toolchain/src/gcc/libstdc++-v3/src/c++98/locale.cc:133
#1  0x0000fffff7fb5b20 in ?? ()
#2  0x0000fffff7c3a4cc in __GI___libc_malloc (bytes=1) at malloc.c:3206
#3  0x0000fffff7c3a4cc in __GI___libc_malloc (bytes=281474838073920) at malloc.c:3206
#4  0x0000fffff7fb5adc in ?? ()
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) disas
Dump of assembler code for function std::locale::name[abi:cxx11]() const:
   0x0000fffff7e489f0 <+0>:     stp     x29, x30, [sp, #-96]!
   0x0000fffff7e489f4 <+4>:     mov     x29, sp
   0x0000fffff7e489f8 <+8>:     stp     x19, x20, [sp, #16]
   0x0000fffff7e489fc <+12>:    mov     x19, x8
   0x0000fffff7e48a00 <+16>:    stp     x21, x22, [sp, #32]
   0x0000fffff7e48a04 <+20>:    add     x21, x8, #0x10
   0x0000fffff7e48a08 <+24>:    stp     x23, x24, [sp, #48]
   0x0000fffff7e48a0c <+28>:    mov     x23, x0
   0x0000fffff7e48a10 <+32>:    stp     x25, x26, [sp, #64]
   0x0000fffff7e48a14 <+36>:    stp     x27, x28, [sp, #80]
=> 0x0000fffff7e48a18 <+40>:    strb    wzr, [x8, #16]
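
For context on why x8 matters in the disassembly above: under the AArch64 procedure call standard, a function that returns a large aggregate in memory receives the address of the caller's result buffer in x8, which is consistent with name() storing through x8 at <+40>. A minimal C sketch of the same calling pattern follows. The names struct big_result and make_result are made up for illustration and do not appear in glibc or libstdc++.

```c
#include <string.h>

/* Hypothetical aggregate, larger than 16 bytes, so AAPCS64 returns it
   in memory: the caller allocates the buffer and passes its address
   in x8 (the indirect result location register).  */
struct big_result
{
  char text[32];
  unsigned long length;
};

/* On aarch64 the compiler writes the result through x8.  If x8 is
   clobbered between the call site and function entry (for example by
   a PLT trampoline that does not preserve it), these stores land at a
   bogus address.  */
struct big_result
make_result (const char *s)
{
  struct big_result r;
  memset (&r, 0, sizeof r);
  strncpy (r.text, s, sizeof r.text - 1);
  r.length = strlen (r.text);
  return r;
}
```

Compiling this for aarch64 and inspecting the output of objdump -d should show stores relative to x8 in make_result, similar to the strb at <+40>. The exact instructions depend on the compiler and optimization level.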
Comment 5 Nathan Nye 2021-10-07 15:40:36 UTC
I'm sharing the progress I've made so far on this issue.

It's most likely the same issue Ben Woodard linked (https://www.sourceware.org/bugzilla/show_bug.cgi?id=26643), but the patch doesn't cover this case. As Adhemerval Zanella found, when name() gets called, it stores through the address in $x8, which was previously overwritten with 0x7f7f7f7f7f7f7f7f by strcmp (strcmp.S) running inside the dynamic linker and never restored:

#0  strcmp () at ../sysdeps/aarch64/strcmp.S:174
#1  0x0000fffff7fd6140 in check_match (undef_name=undef_name@entry=0xaaaaaaaa04bd "_ZNKSt6locale4nameB5cxx11Ev", ref=ref@entry=0xaaaaaaaa03c0, version=version@entry=0xfffff7ff40d0, flags=flags@entry=1, 
    type_class=type_class@entry=1, sym=0xfffff7c5ab48, symidx=315, strtab=strtab@entry=0xfffff7c7c460 "", map=map@entry=0xfffff7ff69c0, versioned_sym=versioned_sym@entry=0xffffffffeae8, 
    num_versions=num_versions@entry=0xffffffffeae4) at dl-lookup.c:94
#2  0x0000fffff7fd65c8 in do_lookup_x (undef_name=undef_name@entry=0xaaaaaaaa04bd "_ZNKSt6locale4nameB5cxx11Ev", new_hash=new_hash@entry=718167616, old_hash=old_hash@entry=0xffffffffebb8, 
    ref=0xaaaaaaaa03c0, result=result@entry=0xffffffffebc8, scope=<optimized out>, i=1, version=version@entry=0xfffff7ff40d0, flags=flags@entry=1, skip=<optimized out>, skip@entry=0x0, 
    type_class=<optimized out>, type_class@entry=1, undef_map=undef_map@entry=0xfffff7fff200) at dl-lookup.c:436
#3  0x0000fffff7fd6e10 in _dl_lookup_symbol_x (undef_name=0xaaaaaaaa04bd "_ZNKSt6locale4nameB5cxx11Ev", undef_map=undef_map@entry=0xfffff7fff200, ref=ref@entry=0xffffffffecb0, 
    symbol_scope=0xfffff7fff598, version=0xfffff7ff40d0, type_class=type_class@entry=1, flags=1, skip_map=skip_map@entry=0x0) at dl-lookup.c:861
#4  0x0000fffff7fdb1e0 in _dl_profile_fixup (l=0xfffff7fff200, reloc_arg=<optimized out>, retaddr=187649984433000, regs=0xffffffffedc0, framesizep=0xffffffffecf8) at dl-runtime.c:257
#5  0x0000fffff7fe0fa0 in _dl_runtime_profile () at ../sysdeps/aarch64/dl-trampoline.S:221
#6  0x0000aaaaaaaa0b68 in main ()

I'm still searching for what is responsible for restoring $x8 in this instance. At the same time, I'm exploring a couple of workarounds for existing LD_AUDIT libraries that wouldn't require patching the dynamic linker itself:

1. Turning profiling off: la_objsearch gets called, but the rest of the RTLD_AUDIT interfaces such as la_symbind{32,64} don't get called. (Fail)

2. Setting the framesizep (stack frame size) to 0 in la_aarch64_gnu_pltenter: Neither the test case of this issue nor the simple one in the linked issue crashes, but this leads to some problems later on. (Fail)

It may be resolved through some combination of la_aarch64_gnu_pltenter and la_aarch64_gnu_pltexit restoring the $x8 register.
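
For readers unfamiliar with the hooks named above, here is a minimal sketch of an audit library that goes one step beyond auditmin.c: it requests symbol-binding callbacks and, in la_symbind64, asks the dynamic linker not to call the PLT enter/exit hooks for each bound symbol. It uses only the generic rtld-audit interface from <link.h>. This is an illustration of the API, not a verified workaround; nothing in this thread establishes that setting these flags avoids the crash on an unpatched glibc.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Handshake with the dynamic linker: accept the offered interface
   version, as auditmin.c does.  */
unsigned int
la_version (unsigned int version)
{
  return version;
}

/* Request symbol-binding callbacks for every object that is loaded.  */
unsigned int
la_objopen (struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
  (void) map;
  (void) lmid;
  (void) cookie;
  return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

/* Called when a symbol is bound on 64-bit targets.  */
uintptr_t
la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
              uintptr_t *defcook, unsigned int *flags,
              const char *symname)
{
  (void) ndx;
  (void) refcook;
  (void) defcook;
  fprintf (stderr, "bind: %s\n", symname);
  /* Ask for no pltenter/pltexit callbacks for this symbol.  */
  *flags |= LA_SYMB_NOPLTENTER | LA_SYMB_NOPLTEXIT;
  return sym->st_value;
}
```

It can be built and loaded the same way as auditmin.so in the description (gcc -shared -fPIC, then LD_AUDIT=...).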
Comment 6 Adhemerval Zanella 2021-10-07 19:03:54 UTC
(In reply to Nathan Nye from comment #5)
> I'm sharing the progress I've made so far on this issue.
> 
> It's most likely the same issue Ben Woodard linked
> (https://www.sourceware.org/bugzilla/show_bug.cgi?id=26643), but the patch
> doesn't cover this case. As Adhemerval Zanella found, when name() gets
> called, it tries to read the address at $x8 which was previously overwritten
> by the dynamic linker as 0x7f7f7f7f7f7f7f7f in strcmp.S and never restored:
> 
> #0  strcmp () at ../sysdeps/aarch64/strcmp.S:174
> #1  0x0000fffff7fd6140 in check_match
> (undef_name=undef_name@entry=0xaaaaaaaa04bd "_ZNKSt6locale4nameB5cxx11Ev",
> ref=ref@entry=0xaaaaaaaa03c0, version=version@entry=0xfffff7ff40d0,
> flags=flags@entry=1, 
>     type_class=type_class@entry=1, sym=0xfffff7c5ab48, symidx=315,
> strtab=strtab@entry=0xfffff7c7c460 "", map=map@entry=0xfffff7ff69c0,
> versioned_sym=versioned_sym@entry=0xffffffffeae8, 
>     num_versions=num_versions@entry=0xffffffffeae4) at dl-lookup.c:94
> #2  0x0000fffff7fd65c8 in do_lookup_x
> (undef_name=undef_name@entry=0xaaaaaaaa04bd "_ZNKSt6locale4nameB5cxx11Ev",
> new_hash=new_hash@entry=718167616, old_hash=old_hash@entry=0xffffffffebb8, 
>     ref=0xaaaaaaaa03c0, result=result@entry=0xffffffffebc8, scope=<optimized
> out>, i=1, version=version@entry=0xfffff7ff40d0, flags=flags@entry=1,
> skip=<optimized out>, skip@entry=0x0, 
>     type_class=<optimized out>, type_class@entry=1,
> undef_map=undef_map@entry=0xfffff7fff200) at dl-lookup.c:436
> #3  0x0000fffff7fd6e10 in _dl_lookup_symbol_x (undef_name=0xaaaaaaaa04bd
> "_ZNKSt6locale4nameB5cxx11Ev", undef_map=undef_map@entry=0xfffff7fff200,
> ref=ref@entry=0xffffffffecb0, 
>     symbol_scope=0xfffff7fff598, version=0xfffff7ff40d0,
> type_class=type_class@entry=1, flags=1, skip_map=skip_map@entry=0x0) at
> dl-lookup.c:861
> #4  0x0000fffff7fdb1e0 in _dl_profile_fixup (l=0xfffff7fff200,
> reloc_arg=<optimized out>, retaddr=187649984433000, regs=0xffffffffedc0,
> framesizep=0xffffffffecf8) at dl-runtime.c:257
> #5  0x0000fffff7fe0fa0 in _dl_runtime_profile () at
> ../sysdeps/aarch64/dl-trampoline.S:221
> #6  0x0000aaaaaaaa0b68 in main ()
> 
> I'm still searching for what is responsible for restoring $x8 in this
> instance. At the same time, I'm exploring a couple fixes for existing
> LD_AUDIT libraries that wouldn't require the linker itself to be patched:

If you check the patch [1], it extends the La_aarch64_regs to include 'x8',
which is saved and restored at _dl_profile_fixup.
> 
> 1. Turning profiling off: la_objsearch gets called, but the rest of the
> RTLD_AUDIT interfaces such as la_symbind{32,64} don't get called. (Fail)
> 
> 2. Setting the framesizep (stack frame size) to 0 in
> la_aarch64_gnu_pltenter: Neither the test case of this issue nor the simple
> one in the linked issue crashes, but this leads to some problems later on.
> (Fail)
> 
> It may be resolved through some combination of la_aarch64_gnu_pltenter and
> la_aarch64_gnu_pltexit restoring the $x8 register.

Could you check whether the patchset I posted fixes the issue you are seeing?
There is another issue that might interfere with locale usage within audit
modules [2], which the patchset also fixes.


[1] https://patchwork.sourceware.org/project/glibc/patch/20210730194715.881900-21-adhemerval.zanella@linaro.org/
[2] https://patchwork.sourceware.org/project/glibc/patch/20210730194715.881900-6-adhemerval.zanella@linaro.org/
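
To illustrate what saving and restoring x8 around the symbol lookup means, here is a toy model in plain C. It does not touch real registers and is not glibc code: struct toy_regs, clobbering_lookup and the two trampoline functions are hypothetical names, and the "without x8" variant is only a simplified stand-in for the behaviour described in this thread.

```c
#include <stdint.h>

/* Toy stand-in for a saved register block; NOT the real
   La_aarch64_regs layout.  */
struct toy_regs
{
  uint64_t x[9];   /* x0..x8; x8 holds the indirect result address */
};

/* Stand-in for the symbol lookup, which is free to use x8 as scratch.
   The backtrace in comment 5 shows strcmp leaving
   0x7f7f7f7f7f7f7f7f in it.  */
static void
clobbering_lookup (struct toy_regs *live)
{
  live->x[8] = 0x7f7f7f7f7f7f7f7fULL;
}

/* In this model, only x0..x7 are preserved across the lookup, so the
   callee sees a clobbered x8.  */
uint64_t
trampoline_without_x8 (struct toy_regs *live)
{
  uint64_t saved[8];
  for (int i = 0; i < 8; i++)
    saved[i] = live->x[i];
  clobbering_lookup (live);
  for (int i = 0; i < 8; i++)
    live->x[i] = saved[i];
  return live->x[8];
}

/* With x8 included in the saved block, the callee sees the caller's
   original result address again.  */
uint64_t
trampoline_with_x8 (struct toy_regs *live)
{
  struct toy_regs saved = *live;
  clobbering_lookup (live);
  *live = saved;
  return live->x[8];
}
```

In the model, trampoline_without_x8 hands back the poisoned value while trampoline_with_x8 hands back whatever the caller placed in x[8], which mirrors the difference between the crash and the patched behaviour reported here.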
Comment 7 Nathan Nye 2021-10-08 01:09:20 UTC
(In reply to Adhemerval Zanella from comment #6)

I had no idea the patch I was using (v2 from the Sourceware issue) was out of date! After applying both of the patches you linked (from Patchwork), the problem is fully resolved. Incredible! I'll keep looking for a way to provide backwards compatibility in existing LD_AUDIT libraries, which may be desirable for projects such as HPCToolkit. For now, it seems this issue is a duplicate of the existing issues opened by you and Ben. Thank you very much, Adhemerval.
Comment 8 Szabolcs Nagy 2022-03-25 16:40:54 UTC
I believe this is fixed in 2.35 after commit

https://sourceware.org/git/?p=glibc.git;a=commit;h=ce9a68c57c260c8417afc93972849ac9ad243ec4

elf: Fix runtime linker auditing on aarch64 (BZ #26643)