Bug 25345 - Tiny programs (few lines of assembly) cause GDB's TUI to crash
Summary: Tiny programs (few lines of assembly) cause GDB's TUI to crash
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: tui (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-01-04 11:26 UTC by Shahab
Modified: 2020-01-31 10:11 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
An example of the bug and 2 illustrations (761.21 KB, application/x-compressed-tar)
2020-01-04 11:26 UTC, Shahab
Details
GDB crash after reaching end of line. (53.99 KB, image/gif)
2020-01-08 15:43 UTC, Shahab
Details
GDB after the fix (94.48 KB, image/gif)
2020-01-08 15:43 UTC, Shahab
Details
Fix 2: no highlights, no crash. (732.28 KB, image/gif)
2020-01-09 11:47 UTC, Shahab
Details
A simple x86_64 program to open under GDB (12.31 KB, application/x-compressed-tar)
2020-01-14 13:34 UTC, Shahab
Details
Going up when in garbage bytes (753.53 KB, image/gif)
2020-01-14 13:36 UTC, Shahab
Details
Using PageUp when reached EOF (875.64 KB, image/gif)
2020-01-14 13:39 UTC, Shahab
Details
ArrowUp near the begining (1.71 MB, image/gif)
2020-01-14 13:40 UTC, Shahab
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Shahab 2020-01-04 11:26:16 UTC
Created attachment 12168 [details]
An example of the bug and 2 illustrations

$$$$$$$$$$$$$$$$$
$$ Environment $$__________________________________________________________
$$$$$$$$$$$$$$$$$

GDB version: 10.0.50.20200103-git
Branch     : master
Head commit: b26a3d5 (Fix ld/PR25316 for the ia64 target by ...)
Impacted   : Since GDB 9.0 (does not happen in GDB 8.3.1)

Host OS    : Archlinux with kernel 4.19.92-1-lts (x86_64)
Target     : aarch64 (this problem can happen to ANY baremetal target)
GDB server : QEMU's GDB stub
Configure  : --prefix=/path/to/install                                  \
             --target=aarch64-linux-gnu                                 \
             --disable-binutils                                         \
             --disable-ld                                               \
             --disable-gold                                             \
             --disable-gas                                              \
             --disable-sim                                              \
             --disable-grpof                                            \
             --disable-werror                                           \
             CXXFLAGS='-O0 -g3 -fpermissive -fvar-tracking-assignments' \
             CFLAGS='-O0 -g3 -fvar-tracking-assignments'

$$$$$$$$$$$$$$$$$$$$$$$$
$$ Steps to reproduce $$___________________________________________________
$$$$$$$$$$$$$$$$$$$$$$$$

To reproduce the problem, use the attached "tiny_example":

0) Make sure you have aarch64-linux toolchain and qemu-aarch64
   You can try ANY other target as long as you adapt "tiny.s"

1) Edit the "config" section of the "makefile" to reflect your settings

2) Execute the following commands in 2 separate terminals
   > make stub                  # spawns qemu listening for gdb
   > make debug                 # runs gdb and attaches to qemu

$$$$$$$$$$$$$$$$$$$$$$$$
$$ Problem (Analysis) $$___________________________________________________
$$$$$$$$$$$$$$$$$$$$$$$$

There is a problem in how  GDB's assembly TUI  handles baremetal programs
comprise of only a few lines of assembly instructions. This problem leads
to a segmentation fault in GDB.  See the attached "gdb-lay-asm-crash.gif"
for an illustration of this issue.  However, working in source TUI  leads
to no trouble (see "gdb-lay-src-ok.gif").

But, where does this happen? It happens in gdb/tui/tui-disasm.c:
 1 tui_disasm_window::addr_is_displayed (...)
 2 {
 3   ...
 4   while (i < content.size () - threshold && !is_displayed)
 5     {
 6       is_displayed
 7         = (content[i].line_or_addr.loa == LOA_ADDRESS
 8   ...
 9 }
At line 7, "content[i]" is the cause of the crash. "i" is 0 but "content"
has no elements. In other words, "content.size()" returns 0.
At line 4, "threshold" is 2 that leads to an overflow  in condition check
of "i < content.size() - threshold". Else, the loop wouldn't be entered.

Why "content.size ()" is 0 you may ask. That is because of this  piece of
code again in gdb/tui/tui-disasm.c:
 1 tui_disasm_window::set_contents (...) {
 2   ...
 3   /* Window size, excluding highlight box.  */
 4   max_lines = height - 2;
 5   line_width = width - TUI_EXECINFO_SIZE - 2;

 6   /* Get temporary table that will hold all strings (addr & insn).  */
 7   std::vector<tui_asm_line> asm_lines (max_lines);
 8   size_t addr_size = 0;
 9   tui_disassemble (gdbarch, asm_lines, pc, 0, max_lines, &addr_size);
10   ...
11   content.resize (max_lines);
12   for (i = 0; i < max_lines; i++) {
13       tui_source_element *src = &content[i];
14   ...
15 }
At line  9,  "tui_disassemble()"  throws  an  exception  without  letting
"content" to be filled at all. This exception is also the reason that the
message  "Cannot access memory at address ..."  is  printed  in GDB  (see
attached GIF files).

The exception is rooted in gdb/disasm.c:
 1 gdb_disassembler::print_insn (CORE_ADDR memaddr, ...) {
 2   m_err_memaddr = 0;
 3 
 4   int length = gdbarch_print_insn (arch (), memaddr, &m_di);
 5 
 6   if (length < 0)
 7     memory_error (TARGET_XFER_E_IO, m_err_memaddr);
 8   ...
 9   return length;
10 }
At line 4, "gdbarch_print_insn()" returns -1 which leads to execution  of
"memory_error()" at line 7, a.k.a. the infamous exception.

"gdb_print_insn()" returns -1 because it is dealt  an  invalid PC address
that  it  cannot fetch.  This  PC  address  is  _just_   after  the  last
instruction of the inferior program.

###########################################################################
 All of these happens because in "tui_disasm_window::set_contents()", the
 variable "max_lines"  instructs  "tui_disassemble()" to decode more than
 it should have.
###########################################################################

Under some circumstances, GDB will ask QEMU again to  fetch  the  invalid
addresses.  In this case,  QEMU  returns  a  stream of 0's.  Although the
output of decoding would be garbage, but GDB won't crash.


$$$$$$$$$$$$$$$
$$ Solution? $$____________________________________________________________
$$$$$$$$$$$$$$$

There are 2 things to tackle here:

1. The overflow occuring in "tui_disasm_window::addr_is_displayed".  That
   must be easily avoidable by converting "content.size() - threshold" to
   a signed result explicitly. I will soon submit this patch:
     
     diff --git a/gdb/tui/tui-disasm.c b/gdb/tui/tui-disasm.c
     index c72b50730b0..7a3b32696cd 100644
     --- a/gdb/tui/tui-disasm.c
     +++ b/gdb/tui/tui-disasm.c
     @@ -349,10 +349,10 @@ bool
      tui_disasm_window::addr_is_displayed (CORE_ADDR addr) const
      {
        bool is_displayed = false;
     -  int threshold = SCROLL_THRESHOLD;
     +  int nr_of_lines = int(content.size()) - int(SCROLL_THRESHOLD);
      
        int i = 0;
     -  while (i < content.size () - threshold && !is_displayed)
     +  while (i < nr_of_lines && !is_displayed)
          {
            is_displayed
      	= (content[i].line_or_addr.loa == LOA_ADDRESS

2. Calculation of "max_lines" in "tui_disasm_window::set_contents".
   Ideally, "max_lines" should be:
     max_lines = std::min<int>(height-2, number_of_instructions_in_elf);

However, it's not trivial (to me) how to get  the  number of instructions
that exist there. Any thought on that?
Comment 1 Sourceware Commits 2020-01-06 19:51:48 UTC
The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=cbfa85811792ca8e96ace40bef0aaaf507e54bcc

commit cbfa85811792ca8e96ace40bef0aaaf507e54bcc
Author: Shahab Vahedi <shahab@synopsys.com>
Date:   Mon Jan 6 15:27:32 2020 +0100

    GDB: Fix the overflow in addr/line_is_displayed()
    
    In tui_disasm_window::addr_is_displayed(), there can be situations
    where "content" is empty. For instance, it can happen when the
    "content" was not filled in tui_disasm_window::set_contents(),
    because tui_disassemble() threw an exception. Usually this exception
    is the result of fetching invalid PC addresses like the ones beyond
    the end of the program.
    
    Having "content.size ()" zero leads to an overflow in this condition
    check inside tui_disasm_window::addr_is_displayed():
    
      int i = 0;
      while (i < content.size () - threshold ...) {
        ... content[i] ...
      }
    
    "threshold" is 2 and there are times that "content.size ()" is 0.
    This results into an overflow and the loop is entered whereas it
    should have been skipped. Finally, "content[i]" access leads to
    a segmentation fault.
    
    Same problem applies to tui_source_window::line_is_displayed().
    
    The issue has been discussed at length in bug 25345:
      https://sourceware.org/bugzilla/show_bug.cgi?id=25345
    
    This commit avoids the segmentation faults with an early check:
    
      if (content.size () < SCROLL_THRESHOLD)
        return false;
    
    Moreover, those functions have been overhauled to a leaner code.
    
    gdb/ChangeLog:
    2020-01-06  Shahab Vahedi  <shahab@synopsys.com>
    
    	* tui/tui-disasm.c (tui_disasm_window::addr_is_displayed): Avoid
    	overflow by an early check of content vs threshold.
            * tui/tui-source.c (tui_source_window::line_is_displayed):
    	Likewise.
Comment 2 Shahab 2020-01-08 15:43:15 UTC
Created attachment 12175 [details]
GDB crash after reaching end of line.
Comment 3 Shahab 2020-01-08 15:43:51 UTC
Created attachment 12176 [details]
GDB after the fix
Comment 4 Shahab 2020-01-08 15:54:55 UTC
The fix for point 1 is in. I also implemented a solution for point 2 as suggested by Andrew:

diff --git a/gdb/tui/tui-disasm.c b/gdb/tui/tui-disasm.c
index 98c691f3387..7faaa45f039 100644
--- a/gdb/tui/tui-disasm.c
+++ b/gdb/tui/tui-disasm.c
@@ -226,7 +226,18 @@ tui_disasm_window::set_contents (struct gdbarch *arch,
   /* Get temporary table that will hold all strings (addr & insn).  */
   std::vector<tui_asm_line> asm_lines (max_lines);
   size_t addr_size = 0;
-  tui_disassemble (gdbarch, asm_lines, pc, 0, max_lines, &addr_size);
+  try
+    {
+      tui_disassemble (gdbarch, asm_lines, pc, 0, max_lines, &addr_size);
+    }
+  catch (const gdb_exception &except)
+    {
+      /* In cases where max_lines is asking tui_disassemble() to fetch
+	 too much, like when PC goes past the valid address range, a
+	 MEMORY_ERROR is thrown, but it is alright.  */
+      if (except.error != MEMORY_ERROR)
+	  throw;
+    }
 
   /* Align instructions to the same column.  */
   insn_pos = (1 + (addr_size / tab_len)) * tab_len;


It does work (see attachments: "GDB crash after reaching end of line" vs "GDB after the fix"). However, it moves on to another issue. GDB terminates when it reaches a state where there is no valid line in TUI anymore (as "GDB after the fix" illustrates).
Comment 5 Shahab 2020-01-08 16:23:03 UTC
seems related to https://sourceware.org/bugzilla/show_bug.cgi?id=9765
Comment 6 Shahab 2020-01-09 11:47:23 UTC
Created attachment 12178 [details]
Fix 2: no highlights, no crash.

This fix comes from the following changes:

diff --git a/gdb/tui/tui-disasm.c b/gdb/tui/tui-disasm.c
index 98c691f3387..3dfe70bc480 100644
--- a/gdb/tui/tui-disasm.c
+++ b/gdb/tui/tui-disasm.c
@@ -114,7 +114,18 @@ tui_disassemble (struct gdbarch *gdbarch,
 	  asm_lines[pos + i].addr_size = new_size;
 	}
 
-      pc = pc + gdb_print_insn (gdbarch, pc, &gdb_dis_out, NULL);
+      try
+	{
+	  pc = pc + gdb_print_insn (gdbarch, pc, &gdb_dis_out, NULL);
+	}
+      catch (const gdb_exception &except)
+	{
+	  /* In cases where max_lines is asking tui_disassemble() to fetch
+	     too much, like when PC goes past the valid address range, a
+	     MEMORY_ERROR is thrown, but it is alright.  */
+	  if (except.error != MEMORY_ERROR)
+	    throw;
+	}
 
       asm_lines[pos + i].insn = std::move (gdb_dis_out.string ());


It does not highlight the empty lines anymore or crash when goes one window beyond the last instructions. Moreover, going to the top is fine too.

The only downside is that last invalid address (the address just after the _last_ valid address) is repeated. If I try to remove it, the highlight issue comes back.
Comment 7 Shahab 2020-01-14 13:19:35 UTC
another attempt:

https://sourceware.org/ml/gdb-patches/2020-01/msg00354.html
Comment 8 Shahab 2020-01-14 13:34:56 UTC
Created attachment 12199 [details]
A simple x86_64 program to open under GDB
Comment 9 Shahab 2020-01-14 13:36:36 UTC
Created attachment 12200 [details]
Going up when in garbage bytes
Comment 10 Shahab 2020-01-14 13:39:22 UTC
Created attachment 12201 [details]
Using PageUp when reached EOF
Comment 11 Shahab 2020-01-14 13:40:53 UTC
Created attachment 12202 [details]
ArrowUp near the begining
Comment 12 Shahab 2020-01-24 09:27:26 UTC
The fixes are in.

gdb/tui: Prevent exceptions from trying to cross readline
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=2f267673f0fdee9287e6d404ecd4f2d29da0d2f2

gdb/tui: asm window handles invalid memory and scrolls better
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=733d0a679536628eb1be4b4b8aa6384de24ff1f1

This issue can be marked as resolved now.
Comment 13 Shahab 2020-01-24 11:14:55 UTC
yet one tiny issue:
https://sourceware.org/ml/gdb-patches/2020-01/msg00780.html
Comment 14 Shahab 2020-01-31 10:11:57 UTC
This patch:

https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=42330a681af23a80d3e1f201d2a65886875e74bd

Covers all the corner cases. This issue is completely resolved for good.