Bug 26387

Summary: Assertion `!frame_id_eq (*this_id, outer_frame_id)' failed triggered when backtracing a green thread
Product: gdb
Reporter: Botond Dénes <dns.botond>
Component: c++
Assignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---
Severity: normal
CC: simark, tromey
Priority: P2
Version: 9.1
Target Milestone: ---
Host:
Target:
Build:
Last reconfirmed:
Attachments: Source

Description Botond Dénes 2020-08-14 06:50:58 UTC
When backtracing inside a seastar::thread, a green thread implementation, gdb fails as below:

(gdb) bt
#0  thread_bar (hours=10) at exp.cpp:6
#1  0x0000000000426ef1 in thread_foo (hours=10) at exp.cpp:10
#2  0x0000000000426f0d in thread_main () at exp.cpp:15
#3  0x0000000000426f21 in operator() (__closure=0x600000084250) at exp.cpp:22
#4  0x00000000004287d1 in std::__invoke_impl<void, main(int, char**)::<lambda()>::<lambda()> >(std::__invoke_other, struct {...} &&) (__f=...) at /usr/include/c++/10/bits/invoke.h:60
#5  0x0000000000428210 in std::__invoke<main(int, char**)::<lambda()>::<lambda()> >(struct {...} &&) (__fn=...) at /usr/include/c++/10/bits/invoke.h:95
#6  0x0000000000427a92 in std::__apply_impl<main(int, char**)::<lambda()>::<lambda()>, std::tuple<> >(struct {...} &&, std::tuple<> &&, std::index_sequence) (__f=..., __t=...) at /usr/include/c++/10/tuple:1723
#7  0x0000000000427acc in std::apply<main(int, char**)::<lambda()>::<lambda()>, std::tuple<> >(struct {...} &&, std::tuple<> &&) (__f=..., __t=...) at /usr/include/c++/10/tuple:1734
#8  0x0000000000427b0f in seastar::futurize<void>::apply<main(int, char**)::<lambda()>::<lambda()> >(struct {...} &&, std::tuple<> &&) (func=..., args=...) at /home/bdenes/ScyllaDB/seastar/include/seastar/core/future.hh:1989
#9  0x000000000042722f in operator() (this=0x60000019b3a0) at /home/bdenes/ScyllaDB/seastar/include/seastar/core/thread.hh:259
#10 0x0000000000428db1 in seastar::noncopyable_function<void()>::direct_vtable_for<seastar::async<main(int, char**)::<lambda()>::<lambda()>, {}>::<lambda()> >::call(const seastar::noncopyable_function<void()> *) (
../../gdb/inline-frame.c:159: internal-error: void inline_frame_this_id(frame_info*, void**, frame_id*): Assertion `!frame_id_eq (*this_id, outer_frame_id)' failed.

The stacktrace should have one more frame, seastar::thread::main(), which is the entry point of the green thread, and which is annotated with `asm(".cfi_undefined rip")`, as follows:

```c++
void
thread_context::main() {
#ifdef __x86_64__
    // There is no caller of main() in this context. We need to annotate this frame like this so that
    // unwinders don't try to trace back past this frame.
    // See https://github.com/scylladb/scylla/issues/1909.
    asm(".cfi_undefined rip");
#elif defined(__PPC__)
    asm(".cfi_undefined lr");
#elif defined(__aarch64__)
    asm(".cfi_undefined x30");
#else
    #warning "Backtracing from seastar threads may be broken"
#endif
    _context.initial_switch_in_completed();
    if (group() != current_scheduling_group()) {
        yield();
    }
    try {
        _func();
        _done.set_value();
    } catch (...) {
        _done.set_exception(std::current_exception());
    }

    _context.final_switch_out();
}
```
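
For anyone who wants to experiment with the annotation without building seastar, here is a minimal sketch of a green thread whose entry point carries the same `.cfi_undefined` directive. It assumes Linux with glibc's `<ucontext.h>` and a compiler that emits CFI directives by default (the usual case for gcc and clang on x86_64). All names in it (`green_main`, `green_bar`, `run_green_thread`) are hypothetical and not taken from seastar or from the attached source, and it has not been checked whether this sketch triggers the assertion reported here.

```cpp
#include <ucontext.h>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a green thread; this is not seastar code.
static ucontext_t main_ctx;   // context of the regular thread
static ucontext_t green_ctx;  // context of the green thread
static int steps = 0;         // how far the green thread has progressed

static void green_bar() {
    ++steps;                             // a breakpoint here is inside the green thread
    swapcontext(&green_ctx, &main_ctx);  // yield back to the regular thread
    ++steps;                             // runs after the green thread is resumed
}

static void green_main() {
#ifdef __x86_64__
    // The entry point has no caller, so mark the return address as
    // undefined to tell unwinders to stop at this frame.
    asm(".cfi_undefined rip");
#elif defined(__aarch64__)
    asm(".cfi_undefined x30");
#endif
    green_bar();
    // Returning from here switches to uc_link, i.e. main_ctx.
}

int run_green_thread() {
    static std::vector<char> stack(64 * 1024);  // stack for the green thread
    steps = 0;

    int rc = getcontext(&green_ctx);
    assert(rc == 0);
    (void)rc;
    green_ctx.uc_stack.ss_sp = stack.data();
    green_ctx.uc_stack.ss_size = stack.size();
    green_ctx.uc_link = &main_ctx;       // where to go when green_main returns
    makecontext(&green_ctx, green_main, 0);

    swapcontext(&main_ctx, &green_ctx);  // run the green thread until it yields
    swapcontext(&main_ctx, &green_ctx);  // resume it until green_main returns
    return steps;
}
```

To try it under gdb, add a `main()` that calls `run_green_thread()`, build with `-O0 -g`, set a breakpoint in `green_bar` and run `bt`; the backtrace would be expected to end at `green_main`.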

gdb stacktrace:
(gdb) bt
#0  0x00007f701b4909e5 in raise () from /lib64/libc.so.6
#1  0x00007f701b479895 in abort () from /lib64/libc.so.6
#2  0x000055730247169d in dump_core() ()
#3  0x0000557302477365 in internal_vproblem(internal_problem*, char const*, int, char const*, __va_list_tag*) ()
#4  0x0000557302477581 in internal_verror(char const*, int, char const*, __va_list_tag*) ()
#5  0x00005573021e91a5 in internal_error(char const*, int, char const*, ...) ()
#6  0x0000557302256ee6 in inline_frame_this_id(frame_info*, void**, frame_id*) ()
#7  0x00005573021cecca in compute_frame_id(frame_info*) ()
#8  0x00005573021cf308 in get_prev_frame_if_no_cycle(frame_info*) ()
#9  0x00005573021d1780 in get_prev_frame_always(frame_info*) ()
#10 0x00005573021d2b4d in get_frame_unwind_stop_reason(frame_info*) ()
#11 0x0000557302167315 in dwarf2_frame_cfa(frame_info*) ()
#12 0x000055730216e3ed in dwarf_expr_context::execute_stack_op(unsigned char const*, unsigned char const*) ()
#13 0x000055730216e9f6 in dwarf_expr_context::execute_stack_op(unsigned char const*, unsigned char const*) ()
#14 0x000055730216f5a8 in dwarf_expr_context::eval(unsigned char const*, unsigned long) ()
#15 0x0000557302172598 in dwarf2_evaluate_loc_desc_full(type*, frame_info*, unsigned char const*, unsigned long, dwarf2_per_cu_data*, type*, long) ()
#16 0x0000557302173175 in locexpr_read_variable(symbol*, frame_info*) ()
#17 0x00005573023d90d4 in read_frame_arg(frame_print_options const&, symbol*, frame_info*, frame_arg*, frame_arg*) ()
#18 0x00005573023d9c35 in print_frame_args(frame_print_options const&, symbol*, frame_info*, int, ui_file*) ()
#19 0x00005573023dd7f9 in print_frame_info(frame_print_options const&, frame_info*, int, print_what, int, int) ()
#20 0x00005573023de898 in backtrace_command(char const*, int) ()
#21 0x0000557302101b4a in cmd_func(cmd_list_element*, char const*, int) ()
#22 0x00005573024352ed in execute_command(char const*, int) ()
#23 0x00005573021bdf25 in command_handler(char const*) ()
#24 0x00005573021bf321 in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) ()
#25 0x00005573021be852 in gdb_rl_callback_handler(char*) ()
#26 0x00007f701c1b90ae in rl_callback_read_char () from /lib64/libreadline.so.8
#27 0x00005573021bda46 in gdb_rl_callback_read_char_wrapper_noexcept() ()
#28 0x00005573021be705 in gdb_rl_callback_read_char_wrapper(void*) ()
#29 0x00005573021bd898 in stdin_event_handler(int, void*) ()
#30 0x00005573021bc6f6 in gdb_wait_for_event(int) [clone .part.0] ()
#31 0x00005573021bcad0 in gdb_do_one_event() ()
#32 0x00005573021bcba5 in start_event_loop() ()
#33 0x00005573022912cb in captured_command_loop() ()
#34 0x0000557302293b65 in gdb_main(captured_main_args*) ()
#35 0x0000557302039550 in main ()

I attached the gdb coredump and the application (with its source) that reproduces the bug. To build the application, first build seastar, then run:

$ g++ exp.cpp $(pkg-config --libs --cflags --static /home/bdenes/ScyllaDB/seastar/build/release/seastar.pc) -O0 -g -std=c++20 -Wfatal-errors -o exp

$ gdb ./exp
(gdb) b exp.cpp:11
(gdb) r
(gdb) bt
Comment 1 Botond Dénes 2020-08-14 06:57:44 UTC
Created attachment 12770 [details]
Source

The coredump and the executable are too large to attach unfortunately, so for now I'm attaching the source file only. Built on Fedora 32.
Comment 2 Simon Marchi 2020-08-14 11:27:45 UTC
This looks really similar to something I'm working on right now.  GDB currently does not support having a frame inlined into the outer frame.

There was a first version of a patch sent here:

https://sourceware.org/pipermail/gdb-patches/2020-March/166786.html

(the discussion continues in the following months; the web archive does not automatically link the threads together)

I have a patch series almost ready to send that might address your problem.
Comment 3 Simon Marchi 2020-08-14 11:28:47 UTC
(In reply to Botond Dénes from comment #1)
> Created attachment 12770 [details]
> Source
> 
> The coredump and the executable are too large to attach unfortunately, so
> for now I'm attaching the source file only. Built on Fedora 32.

Can you give the full compilation line you use?  As well as the compiler version?
Comment 4 Botond Dénes 2020-08-14 14:46:17 UTC
Fedora 32

$ gcc --version
gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1)

$ gdb --version
GNU gdb (GDB) Fedora 9.1-5.fc32

https://github.com/scylladb/seastar

$ git log -1 --oneline
1a4b3eb3 (HEAD -> master) sstring: mark str() and methods using it as noexcept

On Fedora 32 you can compile seastar by doing:
$ ./install-dependencies.sh
$ ./configure.py
$ ninja -C build/release

To build the application:
$ g++ exp.cpp $(pkg-config --libs --cflags --static /path/to/seastar/build/release/seastar.pc) -O0 -g -std=c++20 -Wfatal-errors -o exp

-std=c++20 is not required; seastar supports C++17 as well.
Comment 5 Tom Tromey 2022-04-02 16:33:17 UTC
See also the green thread discussion
https://sourceware.org/pipermail/gdb/2022-March/049959.html