Bug 25772 - [2.27 regression] master branch change 4b246928bd triggering crashes with C++ iterators
Summary: [2.27 regression] master branch change 4b246928bd triggering crashes with C++...
Status: RESOLVED INVALID
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: 2.27
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-04-03 11:52 UTC by Mikko Rapeli
Modified: 2020-04-03 13:26 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mikko Rapeli 2020-04-03 11:52:45 UTC
Hi,

Using yocto based distro for an x86_target with gcc 7.4. Tried to update glibc from 2.27 release to tip of 2.27 master branch commit 8edc96aa3398b55e635607c9171c02aecdea357f.

Unfortunately this triggers crashes in some testing tools, which I was not expecting for a stable 2.27 update.

Bisected the crashes and commit triggering these is 4b246928bd7ffabad7303dffa84ee542450e56d9:

https://sourceware.org/git/?p=glibc.git;a=commit;h=4b246928bd7ffabad7303dffa84ee542450e56d9

commit 4b246928bd7ffabad7303dffa84ee542450e56d9
Author: DJ Delorie <dj@delorie.com>
Date:   Tue Nov 20 13:24:09 2018 -0500

    malloc: tcache double free check
    
    * malloc/malloc.c (tcache_entry): Add key field.
    (tcache_put): Set it.
    (tcache_get): Likewise.
    (_int_free): Check for double free in tcache.
    * malloc/tst-tcfree1.c: New.
    * malloc/tst-tcfree2.c: New.
    * malloc/Makefile: Run the new tests.
    * manual/probes.texi: Document memory_tcache_double_free probe.
    
    * dlfcn/dlerror.c (check_free): Prevent double frees.
    
    (cherry picked from commit bcdaad21d4635931d1bd3b54a7894276925d081d)
    
    malloc: tcache: Validate tc_idx before checking for double-frees [BZ #23907]
    
    The previous check could read beyond the end of the tcache entry
    array.  If the e->key == tcache cookie check happened to pass, this
    would result in crashes.
    
    (cherry picked from commit affec03b713c82c43a5b025dddc21bde3334f41e)

Reverting this change fixes the crashes.

Example C++ application which triggers this is:

// compile with: g++ -g -O0 test.cpp
#include <stdint.h>
#include <stdio.h>
#include <map>

#ifdef __cplusplus
extern "C" {
#endif

std::map<char*, char*> test_map __attribute__((init_priority(101)));

__attribute__((destructor(200)))
static void cleanup()
{   
    printf("cleanup\n");
    for(auto it = test_map.begin(); it != test_map.end(); it++)
    {   
        if(it->second) printf("second found\n");
    }
    printf("cleanup done\n");
}


int main(void) {
char *foo = "bar";
test_map[foo] = "foo";

// works always:
//cleanup();

return 0;
}

#ifdef __cplusplus
}
#endif

The crash looks like:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b00ba4 in std::_Rb_tree_increment(std::_Rb_tree_node_base*) () from /usr/lib/libstdc++.so.6
(gdb) bt full
#0  0x00007ffff7b00ba4 in std::_Rb_tree_increment(std::_Rb_tree_node_base*) () from /usr/lib/libstdc++.so.6
No symbol table info available.
#1  0x0000555555554e6b in std::_Rb_tree_iterator<std::pair<char* const, char*> >::operator++ (this=0x7fffffffeaf0)
    at /usr/include/c++/7.4.0/bits/stl_tree.h:295
        __tmp = {_M_node = 0x555555759010}
#2  0x0000555555554bbc in cleanup () at test.cpp:15
        it = {_M_node = 0x555555759010}
#3  0x00007ffff7de50e3 in _dl_fini () at /usr/src/debug/glibc/2.27-r0/git/elf/dl-fini.c:138
        array = 0x555555757db8
        i = <optimized out>
        l = 0x7ffff7ffe110
        maps = 0x7fffffffeb10
        i = <optimized out>
        l = <optimized out>
        nmaps = <optimized out>
        nloaded = <optimized out>
        ns = 0
        do_audit = <optimized out>
        __PRETTY_FUNCTION__ = "_dl_fini"
#4  0x00007ffff7136bf1 in __run_exit_handlers (status=0, listp=0x7ffff74ad718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true,
    run_dtors=run_dtors@entry=true) at /usr/src/debug/glibc/2.27-r0/git/stdlib/exit.c:108
        atfct = <optimized out>
        onfct = <optimized out>
        cxafct = <optimized out>
        f = <optimized out>
        new_exitfn_called = 7
        cur = 0x7ffff74aed80 <initial>
#5  0x00007ffff7136cea in __GI_exit (status=<optimized out>) at /usr/src/debug/glibc/2.27-r0/git/stdlib/exit.c:139
No locals.
#6  0x00007ffff7121abe in __libc_start_main (main=0x555555554bcd <main()>, argc=1, argv=0x7fffffffece8, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7fffffffecd8) at /usr/src/debug/glibc/2.27-r0/git/csu/libc-start.c:342
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 5016146419604639475, 93824992234032, 140737488350432, 0, 0, 1209704419353022195,
                1209685001327595251}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7fffffffecf8, 0x7ffff7ffe110}, data = {prev = 0x0, cleanup = 0x0,
              canceltype = -4872}}}
        not_first_call = <optimized out>
#7  0x0000555555554a5a in _start () at ../sysdeps/x86_64/start.S:120
No locals.
Comment 1 Mikko Rapeli 2020-04-03 11:59:47 UTC
I think the test code relies on untrue assumptions. I can fix them, but wanted to get your opinion on this too.
Comment 2 Andreas Schwab 2020-04-03 12:25:56 UTC
$ valgrind ./a.out
==16982== Memcheck, a memory error detector
==16982== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16982== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==16982== Command: ./a.out
==16982== 
cleanup
==16982== Invalid read of size 8
==16982==    at 0x400A68: cleanup (map.cc:17)
==16982==    by 0x400FE62: _dl_fini (dl-fini.c:139)
==16982==    by 0x5751137: __run_exit_handlers (exit.c:83)
==16982==    by 0x5751189: exit (exit.c:105)
==16982==    by 0x5739350: (below main) (libc-start.c:342)
==16982==  Address 0x5ae1ca8 is 40 bytes inside a block of size 48 free'd
==16982==    at 0x4C2FA0B: operator delete(void*) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16982==    by 0x401F89: __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<char const* const, char const*> > >::deallocate(std::_Rb_tree_node<std::pair<char const* const, char const*> >*, unsigned long) (new_allocator.h:125)
==16982==    by 0x401E36: std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<char const* const, char const*> > > >::deallocate(std::allocator<std::_Rb_tree_node<std::pair<char const* const, char const*> > >&, std::_Rb_tree_node<std::pair<char const* const, char const*> >*, unsigned long) (alloc_traits.h:462)
==16982==    by 0x4018C6: std::_Rb_tree<char const*, std::pair<char const* const, char const*>, std::_Select1st<std::pair<char const* const, char const*> >, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::_M_put_node(std::_Rb_tree_node<std::pair<char const* const, char const*> >*) (stl_tree.h:592)
==16982==    by 0x4011B7: std::_Rb_tree<char const*, std::pair<char const* const, char const*>, std::_Select1st<std::pair<char const* const, char const*> >, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::_M_drop_node(std::_Rb_tree_node<std::pair<char const* const, char const*> >*) (stl_tree.h:659)
==16982==    by 0x400EC1: std::_Rb_tree<char const*, std::pair<char const* const, char const*>, std::_Select1st<std::pair<char const* const, char const*> >, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::_M_erase(std::_Rb_tree_node<std::pair<char const* const, char const*> >*) (stl_tree.h:1858)
==16982==    by 0x400C9B: std::_Rb_tree<char const*, std::pair<char const* const, char const*>, std::_Select1st<std::pair<char const* const, char const*> >, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::~_Rb_tree() (stl_tree.h:949)
==16982==    by 0x402231: std::map<char const*, char const*, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::~map() (stl_map.h:294)
==16982==    by 0x5751137: __run_exit_handlers (exit.c:83)
==16982==    by 0x5751189: exit (exit.c:105)
==16982==    by 0x5739350: (below main) (libc-start.c:342)
==16982==  Block was alloc'd at
==16982==    at 0x4C2E94F: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16982==    by 0x401FD6: __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<char const* const, char const*> > >::allocate(unsigned long, void const*) (new_allocator.h:111)
==16982==    by 0x401E61: std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<char const* const, char const*> > > >::allocate(std::allocator<std::_Rb_tree_node<std::pair<char const* const, char const*> > >&, unsigned long) (alloc_traits.h:436)
==16982==    by 0x4019BE: std::_Rb_tree<char const*, std::pair<char const* const, char const*>, std::_Select1st<std::pair<char const* const, char const*> >, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::_M_get_node() (stl_tree.h:588)
==16982==    by 0x4012BC: std::_Rb_tree_node<std::pair<char const* const, char const*> >* std::_Rb_tree<char const*, std::pair<char const* const, char const*>, std::_Select1st<std::pair<char const* const, char const*> >, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::_M_create_node<std::piecewise_construct_t const&, std::tuple<char const* const&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple<char const* const&>&&, std::tuple<>&&) (stl_tree.h:642)
==16982==    by 0x4010A3: std::_Rb_tree_iterator<std::pair<char const* const, char const*> > std::_Rb_tree<char const*, std::pair<char const* const, char const*>, std::_Select1st<std::pair<char const* const, char const*> >, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<char const* const&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<char const* const, char const*> >, std::piecewise_construct_t const&, std::tuple<char const* const&>&&, std::tuple<>&&) (stl_tree.h:2398)
==16982==    by 0x400E36: std::map<char const*, char const*, std::less<char const*>, std::allocator<std::pair<char const* const, char const*> > >::operator[](char const* const&) (stl_map.h:493)
==16982==    by 0x400AC0: main (map.cc:26)
Comment 3 Carlos O'Donell 2020-04-03 12:30:02 UTC
(In reply to Mikko Rapeli from comment #0)
> std::map<char*, char*> test_map __attribute__((init_priority(101)));

Andreas' valgrind dump shows you that cleanup() runs after the object is destroyed.

Use of init_priority is a GNU C++ extension that allows you to control relative priority of initialization, and only that.

At the return of main() or at a call to exit() the objects are immediately destroyed in reverse order of initialization.

> __attribute__((destructor(200)))

The destructor attribute creates a relative priority ordering of destructor functions that are run at the return of main() or a call to exit().

The destructor is not related to C++ object destruction, and you have no guarantee of ordering relative to the language destruction of objects.

Thus you have no guarantee that your destructor runs before C++ object destructors are called.

The end result is that this code is invalid and cannot be relied upon to do what you think. I doubt it is related to the glibc changes you quoted.
Comment 4 Mikko Rapeli 2020-04-03 12:37:02 UTC
Oh, valgrind does find this!

I was running it with the 4b246928bd7ffabad7303dffa84ee542450e56d9 applied which results in crash and no findings...

Thanks!
Comment 5 Mikko Rapeli 2020-04-03 12:40:26 UTC
This issue was reproducible and thus I was able to bisect the glibc side changes so I'm sure 4b246928bd7ffabad7303dffa84ee542450e56d9 is started triggering the crashes, but the code is also broken and was relying on broken things...
Comment 6 Carlos O'Donell 2020-04-03 12:41:50 UTC
(In reply to Mikko Rapeli from comment #4)
> Oh, valgrind does find this!
> 
> I was running it with the 4b246928bd7ffabad7303dffa84ee542450e56d9 applied
> which results in crash and no findings...

It might be a secondary affect like changing the layout of memory.
Comment 7 Mikko Rapeli 2020-04-03 12:44:09 UTC
Agreed. Thanks for your time and explanations for this issue!
Comment 8 Adhemerval Zanella 2020-04-03 12:56:16 UTC
I think the referenced commit does not pinpoint a glibc issue, but rather your program is relying on specific undefined implementation detail.

Afaik gcc does not specify the order relation between functions with attribute destructor and destructor for classes with static storage duration [1] [2].

For static storage duration destructors, gcc will create functions (__static_initialization_and_destruction*) that will be called by init_array on __libc_csu_init to register constructors and destructors (in your example the std::map one). It is done as being called by atexit, and they functions registered are in reverse order (as for exit documentation).

The attribute destructor functions are called by __do_global_dtors_aux, registered by libgcc in DT_FINI_ARRAY section. It is called by _dl_fini (elf/dl-fini.c), and it is registered *after* libstc++ registers all is atexit functions (through __cxa_atexit).

So to summarize: the std::map destructor is called *before* cleanup function and thus the functions accesses an invalid object state.

[1] https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
[2] https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Attributes.html#C_002b_002b-Attributes