Created attachment 12168 [details] An example of the bug and 2 illustrations $$$$$$$$$$$$$$$$$ $$ Environment $$__________________________________________________________ $$$$$$$$$$$$$$$$$ GDB version: 10.0.50.20200103-git Branch : master Head commit: b26a3d5 (Fix ld/PR25316 for the ia64 target by ...) Impacted : Since GDB 9.0 (does not happen in GDB 8.3.1) Host OS : Archlinux with kernel 4.19.92-1-lts (x86_64) Target : aarch64 (this problem can happen to ANY baremetal target) GDB server : QEMU's GDB stub Configure : --prefix=/path/to/install \ --target=aarch64-linux-gnu \ --disable-binutils \ --disable-ld \ --disable-gold \ --disable-gas \ --disable-sim \ --disable-grpof \ --disable-werror \ CXXFLAGS='-O0 -g3 -fpermissive -fvar-tracking-assignments' \ CFLAGS='-O0 -g3 -fvar-tracking-assignments' $$$$$$$$$$$$$$$$$$$$$$$$ $$ Steps to reproduce $$___________________________________________________ $$$$$$$$$$$$$$$$$$$$$$$$ To reproduce the problem, use the attached "tiny_example": 0) Make sure you have aarch64-linux toolchain and qemu-aarch64 You can try ANY other target as long as you adapt "tiny.s" 1) Edit the "config" section of the "makefile" to reflect your settings 2) Execute the following commands in 2 separate terminals > make stub # spawns qemu listening for gdb > make debug # runs gdb and attaches to qemu $$$$$$$$$$$$$$$$$$$$$$$$ $$ Problem (Analysis) $$___________________________________________________ $$$$$$$$$$$$$$$$$$$$$$$$ There is a problem in how GDB's assembly TUI handles baremetal programs comprise of only a few lines of assembly instructions. This problem leads to a segmentation fault in GDB. See the attached "gdb-lay-asm-crash.gif" for an illustration of this issue. However, working in source TUI leads to no trouble (see "gdb-lay-src-ok.gif"). But, where does this happen? It happens in gdb/tui/tui-disasm.c: 1 tui_disasm_window::addr_is_displayed (...) 2 { 3 ... 4 while (i < content.size () - threshold && !is_displayed) 5 { 6 is_displayed 7 = (content[i].line_or_addr.loa == LOA_ADDRESS 8 ... 9 } At line 7, "content[i]" is the cause of the crash. "i" is 0 but "content" has no elements. In other words, "content.size()" returns 0. At line 4, "threshold" is 2 that leads to an overflow in condition check of "i < content.size() - threshold". Else, the loop wouldn't be entered. Why "content.size ()" is 0 you may ask. That is because of this piece of code again in gdb/tui/tui-disasm.c: 1 tui_disasm_window::set_contents (...) { 2 ... 3 /* Window size, excluding highlight box. */ 4 max_lines = height - 2; 5 line_width = width - TUI_EXECINFO_SIZE - 2; 6 /* Get temporary table that will hold all strings (addr & insn). */ 7 std::vector<tui_asm_line> asm_lines (max_lines); 8 size_t addr_size = 0; 9 tui_disassemble (gdbarch, asm_lines, pc, 0, max_lines, &addr_size); 10 ... 11 content.resize (max_lines); 12 for (i = 0; i < max_lines; i++) { 13 tui_source_element *src = &content[i]; 14 ... 15 } At line 9, "tui_disassemble()" throws an exception without letting "content" to be filled at all. This exception is also the reason that the message "Cannot access memory at address ..." is printed in GDB (see attached GIF files). The exception is rooted in gdb/disasm.c: 1 gdb_disassembler::print_insn (CORE_ADDR memaddr, ...) { 2 m_err_memaddr = 0; 3 4 int length = gdbarch_print_insn (arch (), memaddr, &m_di); 5 6 if (length < 0) 7 memory_error (TARGET_XFER_E_IO, m_err_memaddr); 8 ... 9 return length; 10 } At line 4, "gdbarch_print_insn()" returns -1 which leads to execution of "memory_error()" at line 7, a.k.a. the infamous exception. "gdb_print_insn()" returns -1 because it is dealt an invalid PC address that it cannot fetch. This PC address is _just_ after the last instruction of the inferior program. ########################################################################### All of these happens because in "tui_disasm_window::set_contents()", the variable "max_lines" instructs "tui_disassemble()" to decode more than it should have. ########################################################################### Under some circumstances, GDB will ask QEMU again to fetch the invalid addresses. In this case, QEMU returns a stream of 0's. Although the output of decoding would be garbage, but GDB won't crash. $$$$$$$$$$$$$$$ $$ Solution? $$____________________________________________________________ $$$$$$$$$$$$$$$ There are 2 things to tackle here: 1. The overflow occuring in "tui_disasm_window::addr_is_displayed". That must be easily avoidable by converting "content.size() - threshold" to a signed result explicitly. I will soon submit this patch: diff --git a/gdb/tui/tui-disasm.c b/gdb/tui/tui-disasm.c index c72b50730b0..7a3b32696cd 100644 --- a/gdb/tui/tui-disasm.c +++ b/gdb/tui/tui-disasm.c @@ -349,10 +349,10 @@ bool tui_disasm_window::addr_is_displayed (CORE_ADDR addr) const { bool is_displayed = false; - int threshold = SCROLL_THRESHOLD; + int nr_of_lines = int(content.size()) - int(SCROLL_THRESHOLD); int i = 0; - while (i < content.size () - threshold && !is_displayed) + while (i < nr_of_lines && !is_displayed) { is_displayed = (content[i].line_or_addr.loa == LOA_ADDRESS 2. Calculation of "max_lines" in "tui_disasm_window::set_contents". Ideally, "max_lines" should be: max_lines = std::min<int>(height-2, number_of_instructions_in_elf); However, it's not trivial (to me) how to get the number of instructions that exist there. Any thought on that?
The master branch has been updated by Pedro Alves <palves@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=cbfa85811792ca8e96ace40bef0aaaf507e54bcc commit cbfa85811792ca8e96ace40bef0aaaf507e54bcc Author: Shahab Vahedi <shahab@synopsys.com> Date: Mon Jan 6 15:27:32 2020 +0100 GDB: Fix the overflow in addr/line_is_displayed() In tui_disasm_window::addr_is_displayed(), there can be situations where "content" is empty. For instance, it can happen when the "content" was not filled in tui_disasm_window::set_contents(), because tui_disassemble() threw an exception. Usually this exception is the result of fetching invalid PC addresses like the ones beyond the end of the program. Having "content.size ()" zero leads to an overflow in this condition check inside tui_disasm_window::addr_is_displayed(): int i = 0; while (i < content.size () - threshold ...) { ... content[i] ... } "threshold" is 2 and there are times that "content.size ()" is 0. This results into an overflow and the loop is entered whereas it should have been skipped. Finally, "content[i]" access leads to a segmentation fault. Same problem applies to tui_source_window::line_is_displayed(). The issue has been discussed at length in bug 25345: https://sourceware.org/bugzilla/show_bug.cgi?id=25345 This commit avoids the segmentation faults with an early check: if (content.size () < SCROLL_THRESHOLD) return false; Moreover, those functions have been overhauled to a leaner code. gdb/ChangeLog: 2020-01-06 Shahab Vahedi <shahab@synopsys.com> * tui/tui-disasm.c (tui_disasm_window::addr_is_displayed): Avoid overflow by an early check of content vs threshold. * tui/tui-source.c (tui_source_window::line_is_displayed): Likewise.
Created attachment 12175 [details] GDB crash after reaching end of line.
Created attachment 12176 [details] GDB after the fix
The fix for point 1 is in. I also implemented a solution for point 2 as suggested by Andrew: diff --git a/gdb/tui/tui-disasm.c b/gdb/tui/tui-disasm.c index 98c691f3387..7faaa45f039 100644 --- a/gdb/tui/tui-disasm.c +++ b/gdb/tui/tui-disasm.c @@ -226,7 +226,18 @@ tui_disasm_window::set_contents (struct gdbarch *arch, /* Get temporary table that will hold all strings (addr & insn). */ std::vector<tui_asm_line> asm_lines (max_lines); size_t addr_size = 0; - tui_disassemble (gdbarch, asm_lines, pc, 0, max_lines, &addr_size); + try + { + tui_disassemble (gdbarch, asm_lines, pc, 0, max_lines, &addr_size); + } + catch (const gdb_exception &except) + { + /* In cases where max_lines is asking tui_disassemble() to fetch + too much, like when PC goes past the valid address range, a + MEMORY_ERROR is thrown, but it is alright. */ + if (except.error != MEMORY_ERROR) + throw; + } /* Align instructions to the same column. */ insn_pos = (1 + (addr_size / tab_len)) * tab_len; It does work (see attachments: "GDB crash after reaching end of line" vs "GDB after the fix"). However, it moves on to another issue. GDB terminates when it reaches a state where there is no valid line in TUI anymore (as "GDB after the fix" illustrates).
seems related to https://sourceware.org/bugzilla/show_bug.cgi?id=9765
Created attachment 12178 [details] Fix 2: no highlights, no crash. This fix comes from the following changes: diff --git a/gdb/tui/tui-disasm.c b/gdb/tui/tui-disasm.c index 98c691f3387..3dfe70bc480 100644 --- a/gdb/tui/tui-disasm.c +++ b/gdb/tui/tui-disasm.c @@ -114,7 +114,18 @@ tui_disassemble (struct gdbarch *gdbarch, asm_lines[pos + i].addr_size = new_size; } - pc = pc + gdb_print_insn (gdbarch, pc, &gdb_dis_out, NULL); + try + { + pc = pc + gdb_print_insn (gdbarch, pc, &gdb_dis_out, NULL); + } + catch (const gdb_exception &except) + { + /* In cases where max_lines is asking tui_disassemble() to fetch + too much, like when PC goes past the valid address range, a + MEMORY_ERROR is thrown, but it is alright. */ + if (except.error != MEMORY_ERROR) + throw; + } asm_lines[pos + i].insn = std::move (gdb_dis_out.string ()); It does not highlight the empty lines anymore or crash when goes one window beyond the last instructions. Moreover, going to the top is fine too. The only downside is that last invalid address (the address just after the _last_ valid address) is repeated. If I try to remove it, the highlight issue comes back.
another attempt: https://sourceware.org/ml/gdb-patches/2020-01/msg00354.html
Created attachment 12199 [details] A simple x86_64 program to open under GDB
Created attachment 12200 [details] Going up when in garbage bytes
Created attachment 12201 [details] Using PageUp when reached EOF
Created attachment 12202 [details] ArrowUp near the begining
The fixes are in. gdb/tui: Prevent exceptions from trying to cross readline https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=2f267673f0fdee9287e6d404ecd4f2d29da0d2f2 gdb/tui: asm window handles invalid memory and scrolls better https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=733d0a679536628eb1be4b4b8aa6384de24ff1f1 This issue can be marked as resolved now.
yet one tiny issue: https://sourceware.org/ml/gdb-patches/2020-01/msg00780.html
This patch: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=42330a681af23a80d3e1f201d2a65886875e74bd Covers all the corner cases. This issue is completely resolved for good.