Bug 20948 - --write option to GDB causes segmentation fault
Summary: --write option to GDB causes segmentation fault
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: 8.3
Assignee: Not yet assigned to anyone
URL:
Keywords:
: 23256 (view as bug list)
Depends on:
Blocks:
 
Reported: 2016-12-07 20:53 UTC by Catherine Moore
Modified: 2018-09-24 12:23 UTC (History)
9 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Full reproduction example (170 bytes, text/plain)
2017-05-31 12:20 UTC, jbm
Details
gdb-write.patch (486 bytes, patch)
2018-09-11 22:47 UTC, Jozef Lawrynowicz
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Catherine Moore 2016-12-07 20:53:12 UTC
A segmentation fault occurs when opening an executable file for writing.  Although this problem was encountered with a mips-sde-elf target, it can be reproduced for other targets.  It can also be reproduced using the command "set write on".

It looks like the segmentation fault was introduced with the addition of compressed debug sections in bfd probably because it causes extra fiddling with the strtab during the writing of an executable.

To reproduce:

$ cat null.c
int
main ()
{
}

$mips-sde-elf-gcc null.o -e main -o null.x

$mips-sde-elf-gdb -quiet --write null.x
Reading symbols from /scratch/cmoore/2016.11-mips-elf/test/null.x...(no debugging symbols found)...done.
(gdb) quit
Segmentation fault (core dumped)


 gdb -quiet ./mips-sde-elf-gdb
Reading symbols from ./mips-sde-elf-gdb...done.
(gdb) set prompt (top)
(top) run --write -quiet ./null.x
Starting program: /scratch/cmoore/2016.11-mips-elf/test/mips-sde-elf-gdb --write -quiet ./null.x
Reading symbols from /scratch/cmoore/2016.11-mips-elf/test/null.x...(no debugging symbols found)...done.
(gdb) quit

Program received signal SIGSEGV, Segmentation fault.
0x0000000000783a52 in _bfd_elf_strtab_finalize (tab=0x0) at /scratch/cmoore/2016.11-mips-elf/obj/gdb-src-2016.11-999999-mips-sde-elf-x86_64-linux-gnu/bfd/elf-strtab.c:341
341       amt = tab->size * sizeof (struct elf_strtab_hash_entry *);
(top) p tab
$1 = (struct elf_strtab_hash *) 0x0
(top) bt 2
#0  0x0000000000783a52 in _bfd_elf_strtab_finalize (tab=0x0) at /scratch/cmoore/2016.11-mips-elf/obj/gdb-src-2016.11-999999-mips-sde-elf-x86_64-linux-gnu/bfd/elf-strtab.c:341
#1  0x0000000000759c90 in _bfd_elf_assign_file_positions_for_non_load (abfd=0xe97770) at /scratch/cmoore/2016.11-mips-elf/obj/gdb-src-2016.11-999999-mips-sde-elf-x86_64-linux-gnu/bfd/elf.c:5840
(More stack frames follow...)
(top) up 1
#1  0x0000000000759c90 in _bfd_elf_assign_file_positions_for_non_load (abfd=0xe97770) at /scratch/cmoore/2016.11-mips-elf/obj/gdb-src-2016.11-999999-mips-sde-elf-x86_64-linux-gnu/bfd/elf.c:5840
5840      _bfd_elf_strtab_finalize (elf_shstrtab (abfd));
(top) p abfd->tdata.elf_obj_data->o->shstrtab_section
$4 = 0x0
(top)
Comment 1 HectorOron 2016-12-14 09:05:59 UTC
Note, at Debian we got same bug report:
  https://bugs.debian.org/835439
Comment 2 jbm 2017-05-31 12:20:36 UTC
Created attachment 10078 [details]
Full reproduction example
Comment 3 jbm 2017-05-31 12:22:13 UTC
Comment on attachment 10078 [details]
Full reproduction example

This indeed seems to be a problem in libbfd. I was encountered the problem while using the library. 

The segfault seems to occur when opening a file for both reading and writing.

Minimal example:

 $ cat bfdbug.c
#include<bfd.h>

// Error checks omitted for clarity, see attachment
int main(int argc, char **argv) {
  bfd *abfd = bfd_fopen(argv[1], "default", "r+", -1);
  bfd_check_format(abfd, bfd_object);
  bfd_close(abfd);
}

 $ gcc bfdbug.c -g -lbfd -o bfdbug
 $ cp bfdbug test.elf
 $ gdb bfdbug
GNU gdb (Ubuntu 7.12.50.20170314-0ubuntu1) 7.12.50.20170314-git
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bfdbug...done.
(gdb) run test.elf
Starting program: /home/bachme/bt-cgp/cgp/bfdbug test.elf

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b1f391 in _bfd_elf_strtab_finalize () from /usr/lib/x86_64-linux-gnu/libbfd-2.28-system.so
(gdb) where
#0  0x00007ffff7b1f391 in _bfd_elf_strtab_finalize () from /usr/lib/x86_64-linux-gnu/libbfd-2.28-system.so
#1  0x00007ffff7b01a17 in _bfd_elf_write_object_contents () from /usr/lib/x86_64-linux-gnu/libbfd-2.28-system.so
#2  0x00007ffff7adff87 in bfd_close () from /usr/lib/x86_64-linux-gnu/libbfd-2.28-system.so
#3  0x0000555555554856 in main (argc=2, argv=0x7fffffffe588) at bfdbug.c:7
(gdb)

When using "w" or "r" instead of "r+" as the mode argument to bfd_fopen (or using bfd_openr/bfd_openw instead), no error ocurrs (even when adding symbols to the symtab / strtab) before closing.

Binutils version:
 $ objdump -V
GNU objdump (GNU Binutils for Ubuntu) 2.28
Copyright (C) 2017 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
Comment 4 jbm 2017-06-01 10:14:34 UTC
This is a null-ptr dereference in bfd/elf-strtab.c:367, resulting from a null-ptr stringtab extracted in bfd/elf.c:6272.

   elf_shstrtab(abfd) == abfd->tdata.elf_obj_data->o->strtab_ptr == NULL

I don't know any BFD internals, so I sadly can't investigate this further.
Comment 5 Blauhirn 2017-11-03 19:22:43 UTC
For the real desperate like me, here is a bash workaround.

gdb_write() {
    local x=$(mktemp)
    echo "set prompt (gdb-wrapper)
    handle SIGINT noprint pass
    break bfd_close
    commands
        return (int) 1
        continue
    end
    run -q --write $@
    quit" >|$x
    gdb -q gdb -x $x
    rm $x
}

Call like `gdb_write myprogram`.
This makes gdb ignore the code inside `bfd_close`. I have no idea what damage this might inflict, but the `--write` option does its job this way.
Comment 6 Jozef Lawrynowicz 2018-03-08 22:27:44 UTC
Proposed patch in https://sourceware.org/ml/gdb-patches/2018-03/msg00172.html
Comment 7 Pedro Alves 2018-06-04 10:59:21 UTC
*** Bug 23256 has been marked as a duplicate of this bug. ***
Comment 8 Jozef Lawrynowicz 2018-06-04 18:47:54 UTC
My revised patch which fixes the core BFD issue is here: https://sourceware.org/ml/binutils/2018-03/msg00228.html
Comment 9 Pedro Alves 2018-06-05 10:28:02 UTC
Is the copyright issue resolved?  The patch is too big to go in without an assignment.
Comment 10 Jozef Lawrynowicz 2018-06-06 10:11:50 UTC
On 05/06/18 11:28, palves at redhat dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=20948
>
> Pedro Alves <palves at redhat dot com> changed:
>
>             What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                   CC|                            |palves at redhat dot com
>
> --- Comment #9 from Pedro Alves <palves at redhat dot com> ---
> Is the copyright issue resolved?  The patch is too big to go in without an
> assignment.
>
Yes it's resolved, I've got a copyright assignment on file for Binutils 
and GDB now.
Comment 11 Stefan Markovic 2018-09-05 14:47:20 UTC
Hi Jozef,

(In reply to Jozef Lawrynowicz from comment #8)
> My revised patch which fixes the core BFD issue is here:
> https://sourceware.org/ml/binutils/2018-03/msg00228.html

When I apply this patch (to freshly cloned GDB master branch - latest commit is from Aug 29th), I still have some issues. Here's the detailed report:

$cat test.c
int main(int argc, char const *argv[])
{
	/* code */
	return 0;
}

$gcc -g test.c

$./buildGDB/gdb/gdb -write a.out

GNU gdb (GDB) 8.2.50.20180829-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...done.
(gdb) q

Here, gdb exits normally. BUT, when I pass same a.out file to gdb again(with or without -write option)  I get:

Reading symbols from a.out...BFD: a.out: invalid string offset 1953392942 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 778400879 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 771778675 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1936876918 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1634493810 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1702112768 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1634887263 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1633645673 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1680736370 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1633955328 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1600615778 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1650811950 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1680736357 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 4194872 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 596 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 4194968 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 696 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 4195128 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 832 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 4195192 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 936 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 4195328 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1396 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 4195716 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 1464 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 6295064 >= 548 for section `.strtab'
BFD: a.out: invalid string offset 3616 >= 548 for section `.strtab'
done.

Also, if I pass .o file to gdb with built with Your patch applied, I still get segmentation fault and few assertions fails:

$gcc -g test.c -c

$./buildGDB/gdb/gdb -write -quiet test.o
Reading symbols from test.o...done.
(gdb) q
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf.c:5766
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf.c:5766
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf.c:5766
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf.c:5766
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf-strtab.c:280
Segmentation fault (core dumped)

$gdb buildGDB/gdb/gdb
(gdb) set prompt (TOP_GDB) 
(TOP_GDB) run -write test.o
(gdb) q
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf.c:5766
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf.c:5766
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf.c:5766
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf.c:5766
BFD: BFD (GNU Binutils) 2.31.51.20180829 assertion fail ../../bfd/elf-strtab.c:280

Program received signal SIGSEGV, Segmentation fault.
_bfd_elf_strtab_offset (tab=0xfcede0, idx=44) at ../../bfd/elf-strtab.c:283
283	  BFD_ASSERT (entry->refcount > 0);
(TOP GDB) p entry->refcount 
Cannot access memory at address 0x6924000064726f93
Comment 12 Stefan Markovic 2018-09-06 09:18:24 UTC
I forgot to mention that entire toolchain that I'm using is built with Your bfd changdes, not just GDB.
Comment 13 Jozef Lawrynowicz 2018-09-07 18:02:04 UTC
Thanks for the info.
I wasn't able to reproduce your issue on x86_64-pc-linux-gnu.

gcc -g empty.c
gdb -q --write -ex q a.out
Reading symbols from a.out...done.
gdb -q --write -ex q a.out
Reading symbols from a.out...done.

No assertion fails for me.
However, if I compare the "readelf -a" ouput of a.out, before and after
invoking GDB, there is clearly an issue.

The following segments are missing from a.out after invoking gdb with --write:
06     .dynamic                             
07     .note.ABI-tag                        
08     .eh_frame_hdr                        
09                                          
10     .init_array .fini_array .dynamic .got

Could you post the "gcc -v" output, here's mine:
        ...
        Target: x86_64-pc-linux-gnu
        Configured with: ../configure --enable-languages=c,c++ --disable-nls
        Thread model: posix
        gcc version 8.1.0 (GCC) 

Could you also attach the linked executable you get from "gcc -g test.c".

---

I was actually able to reproduce the assertion failure at elf.c:5766 when
building test.c for the msp430-elf target.

The assertion fails because hdr->bfd_section->filepos has been
changed since opening the BFD, but hdr->sh_offset has not. However, another
copy of the section header for this section did have sh_offset updated to the
correct value.

The issue is that there are two copies of the Elf_Internal_Shdr for each
section. One is accessible via "elf_section_data(sec)->this_hdr", the other in
"elf_elfsections(abfd)" (you have to iterate to find the correct header for the
section first).

When linking files, these headers have the same address, i.e. the
"elf_elfsections(abfd)" headers point to the corresponding
"elf_section_data(sec)->this_hdr". But when opening a BFD for update, these
have different addresses, so updating the values in one does not update the
other.

---

Regarding object files, the documentation (https://sourceware.org/gdb/onlinedocs/gdb/Patching.html#Patching)
says that the --write is for executable and core files only.
So I would say trying to use --write with a relocatable object file is maybe
not supposed to work, but at this point I don't have an opinion on whether
it fundamentally can or can't work.
After the issues with using --write on executable files is
fixed, if there are still problems with object files, I will take a look.
Comment 14 Jozef Lawrynowicz 2018-09-10 11:11:13 UTC
Ok, I reproduced the "invalid string offset" errors.

Empty sections cause problems when re-opening a linked executable as their file offset is not usually the same as what it would be if they had size > 0. In my case, multiple empty, but not SHT_NOBITS, sections shared the same file offset after linking.
This results in the section to segment mapping changing after an invocation of --write.

The reason this causes these errors is because the most of the actual contents of the file have not been updated to reflect changes in the metadata. For example, if the segment map changes so that there is an additional segment, then the file offsets of all the sections will increase as there is extra data in the program header. But the contents of the executable file have not been moved to reflect the changed offsets, which results in the symbol/string table looking a mess.

So we need to "slurp" the contents of the BFD when it is first opened, and then write it back at the end.

Then there is the question of whether the executable file should should be allowed to change after it is opened for update, but no explicit modifications are made. The BFD library code uses the section headers to rebuild the metadata such as the segment mapping. Perhaps when opening an executable for update it should use the program headers to ensure that the layout of the executable does not change.

After opening an executable for update, then closing it without making any changes, I would have expected the ELF data in the before and after executables to be identical, but maybe this is not how it's supposed to work.
Comment 15 Jozef Lawrynowicz 2018-09-11 22:47:56 UTC
Created attachment 11237 [details]
gdb-write.patch

Further investigation into what can actually be modified when a BFD is
opened for update, revealed that the fix for this is quite simple.

When a BFD is opened for update (i.e. with "r+" flags, or indirectly via
"gdb --write"), abfd->output_has_begun is set to TRUE, which heavily restricts
what you can do to the BFD.

- New sections cannot be created
- The size of sections cannot be modified

This means the ELF header, program headers and sections headers will not
change, so they do not need to be written in _bfd_elf_write_object_contents,
as the original versions already exist in the BFD.

Even though the patch modifies write_object_contents to exit before section
contents are written to the output BFD, any changes to contents made using
bfd_set_section_contents or by setting memory in gdb --write have already been
written, so are present in the executable after closing the BFD.

The patch is attached as gdb-write.patch.

I tested it on current binutils-gdb master for x86_64-pc-linux-gnu and
msp430-elf in a couple of ways.
- Build the attached "modify-bfd.c", and run it with an ELF file as an
  argument, check the string FOOBAR appears in the .comment section.
- Run "gdb --write -batch -ex "set {int}main = 0x1234" FILE", then check the
  contents of memory at main are 0x1234
- Run "gdb --write -ex q FILE", verify the before and after versions of FILE
  are identical.

I used the set of resulting executable and object files left over from the
linker testsuite as input.
Comment 16 Stefan Markovic 2018-09-12 13:10:23 UTC
(In reply to Jozef Lawrynowicz from comment #15)
> Created attachment 11237 [details]
> gdb-write.patch
> 
> Further investigation into what can actually be modified when a BFD is
> opened for update, revealed that the fix for this is quite simple.
> 
> When a BFD is opened for update (i.e. with "r+" flags, or indirectly via
> "gdb --write"), abfd->output_has_begun is set to TRUE, which heavily
> restricts
> what you can do to the BFD.
> 
> - New sections cannot be created
> - The size of sections cannot be modified
> 
> This means the ELF header, program headers and sections headers will not
> change, so they do not need to be written in _bfd_elf_write_object_contents,
> as the original versions already exist in the BFD.
> 
> Even though the patch modifies write_object_contents to exit before section
> contents are written to the output BFD, any changes to contents made using
> bfd_set_section_contents or by setting memory in gdb --write have already
> been
> written, so are present in the executable after closing the BFD.
> 
> The patch is attached as gdb-write.patch.
> 
> I tested it on current binutils-gdb master for x86_64-pc-linux-gnu and
> msp430-elf in a couple of ways.
> - Build the attached "modify-bfd.c", and run it with an ELF file as an
>   argument, check the string FOOBAR appears in the .comment section.
> - Run "gdb --write -batch -ex "set {int}main = 0x1234" FILE", then check the
>   contents of memory at main are 0x1234
> - Run "gdb --write -ex q FILE", verify the before and after versions of FILE
>   are identical.
> 
> I used the set of resulting executable and object files left over from the
> linker testsuite as input.

This makes sense and it works.

I have also tested this patch for x86_64 on master binutils-gdb and for nanomips-elf (without any other of Your previous patches applied) and I can confirm that this one entirely solves this issue. Even with relocatable object files regardless whether that is supposed to work or not.

Thank You for Your time and help.
Comment 17 Jozef Lawrynowicz 2018-09-13 10:57:08 UTC
Thanks for testing, I've submitted it here https://sourceware.org/ml/binutils/2018-09/msg00107.html
Comment 18 Sourceware Commits 2018-09-24 12:23:06 UTC
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=db72737006fc383cb8838bf7f3dc8e641e60c38f

commit db72737006fc383cb8838bf7f3dc8e641e60c38f
Author: Jozef Lawrynowicz <jozef.l@mittosystems.com>
Date:   Tue Sep 11 22:56:36 2018 +0100

    Fix PR gdb/20948: --write option to GDB causes segmentation fault
    
    When opening a BFD for update, as gdb --write does, modifications to
    anything but the contents of sections is restricted.
    
    Do not try to write back any ELF headers in this case.
    
    bfd/ChangeLog
    2018-09-24  Jozef Lawrynowicz  <jozef.l@mittosystems.com>
    
    	PR gdb/20948
    	* elf.c (_bfd_elf_write_object_contents): Return from function
    	early if abfd->direction == both_direction.
    
    gdb/testsuite/ChangeLog
    2018-09-24  Jozef Lawrynowicz  <jozef.l@mittosystems.com>
    
    	PR gdb/20948
    	* gdb.base/write_mem.exp: New test.
    	* gdb.base/write_mem.c: Likewise.
Comment 19 Tom Tromey 2018-09-24 12:23:47 UTC
I believe the patch fixed this, so I am closing.