[PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections

Sun May 3 19:06:46 GMT 2020

On Sun, 3 May 2020 04:07:33 -0700
"H.J. Lu" <hjl.tools@gmail.com> wrote:

> On Sun, May 3, 2020 at 12:25 AM Kevin Buettner via Gdb-patches
> <gdb-patches@sourceware.org> wrote:
> >
> > On Sun, 29 Mar 2020 14:18:46 +0100
> > Pedro Alves <palves@redhat.com> wrote:
> >  
> > > Removing the bfd hack alone fixes your new test for me.
> > >  
> > > > But, due to the way that the target
> > > > strata are traversed when attempting to access memory, the
> > > > non-SEC_HAS_CONTENTS sections will be read as zeroes from the
> > > > process_stratum (which in this case is the core file stratum) without
> > > > first checking the file stratum, which is where the data might actually
> > > > be found.  
> > >
> > > I've applied your patch #1 only, and ran the corefile.exp test, but
> > > it still passes cleanly for me.  I don't see any "print coremaker_ro"
> > > FAIL here.  :-/  That makes it a bit harder for me to understand all
> > > of this.  I'm on Fedora 27.  
> >
> > I'm still working through the rest of your comments, but I have
> > figured out what's going on with Fedora 27, so I'll address that now.
> >
> > I've tested with Fedora 27, 28, 29, 31, and 32.  I am able to confirm
> > the lack of regression with only patch #1 applied using F27 and F28.
> > F29 onwards show the regression.  (I didn't test with F30, but I assume
> > that it too shows the regression.)
> >
> > I ended up using F28 and F29 to try to figure out what's going on.
> >
> > There's not much difference in the kernel versions:
> >
> > [kev@f28-1 gdb]$ uname -a
> > Linux f28-1 5.0.16-100.fc28.x86_64 #1 SMP Tue May 14 18:22:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> >
> > [kev@f29-efi-1 gdb]$ uname -a
> > Linux f29-efi-1 5.0.17-200.fc29.x86_64 #1 SMP Mon May 20 15:39:10 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> >
> > The gcc versions seem to be identical:
> >
> > [kev@f28-1 gdb]$ gcc --version
> > gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)
> > Copyright (C) 2018 Free Software Foundation, Inc.
> > This is free software; see the source for copying conditions.  There is NO
> > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> >
> > [kev@f29-efi-1 gdb]$ gcc --version
> > gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)
> > Copyright (C) 2018 Free Software Foundation, Inc.
> > This is free software; see the source for copying conditions.  There is NO
> > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> >
> > There is a slight difference in the binutils versions:
> >
> > [kev@f28-1 gdb]$ ld --version | head -1
> > GNU ld version 2.29.1-23.fc28
> >
> > [kev@f29-efi-1 gdb]$ ld --version | head -1
> > GNU ld version 2.31.1-25.fc29
> >
> > I wondered at first if there was some difference between the way that
> > F28 and F29 kernels made core dumps.  I ran the F28 binary on F29
> > and vice versa and found that this didn't make any difference.  The
> > results were the same.  I.e. the F28 binary failed to show the problem
> > even when run/dumped using F29.  Likewise, the F29 binary
> > showed the problem when run/dumped using F28.  So, it seemed likely
> > that the problem was intrinsic to the binary.
> >
> > Looking at the output of...
> >
> > readelf -a testsuite/outputs/gdb.base/corefile/corefile
> >
> > ...on F28 and F29 revealed the following:
> >
> > F28:
> >
> >   [Nr] Name              Type             Address           Offset
> >        Size              EntSize          Flags  Link  Info  Align
> > ...
> >   [11] .init             PROGBITS         00000000004005d0  000005d0
> >        0000000000000017  0000000000000000  AX       0     0     4
> >   [12] .plt              PROGBITS         00000000004005f0  000005f0
> >        00000000000000a0  0000000000000010  AX       0     0     16
> >   [13] .text             PROGBITS         0000000000400690  00000690
> >        0000000000000381  0000000000000000  AX       0     0     16
> >   [14] .fini             PROGBITS         0000000000400a14  00000a14
> >        0000000000000009  0000000000000000  AX       0     0     4
> >   [15] .rodata           PROGBITS         0000000000400a20  00000a20
> >        0000000000000067  0000000000000000   A       0     0     8
> >
> > F29:
> >
> >
> >   [Nr] Name              Type             Address           Offset
> >        Size              EntSize          Flags  Link  Info  Align
> > ...
> >   [11] .init             PROGBITS         0000000000401000  00001000
> >        000000000000001b  0000000000000000  AX       0     0     4
> >   [12] .plt              PROGBITS         0000000000401020  00001020
> >        00000000000000a0  0000000000000010  AX       0     0     16
> >   [13] .text             PROGBITS         00000000004010c0  000010c0
> >        0000000000000395  0000000000000000  AX       0     0     16
> >   [14] .fini             PROGBITS         0000000000401458  00001458
> >        000000000000000d  0000000000000000  AX       0     0     4
> >   [15] .rodata           PROGBITS         0000000000402000  00002000
> >        0000000000000067  0000000000000000   A       0     0     8
> >
> > The thing to observe here is that F28's .rodata address is 0x400a20.
> > Observe, too, that the addresses for .text and .fini aren't that far
> > away.
> >
> > The address for .rodata on F29 is 0x402000.  It's aligned on a 4K
> > boundary, which separates it considerably from the sections preceding
> > it.
> >
> > Checking the kernel sources, I found that PAGE_SIZE and ELF_EXEC_PAGESIZE
> > are 4096 for the architecture in question.  (Actually most (maybe all?) have
> > this setting.)  These values are used to determine ELF_MIN_ALIGN in
> > fs/binfmt_elf.c.
> >
> > Moving onto the corefiles, I see:
> >
> > F28:
> >
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> > ...
> >   LOAD           0x0000000000002000 0x0000000000400000 0x0000000000000000
> >                  0x0000000000001000 0x0000000000001000  R E    0x1000
> >
> >
> > F29:
> >
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> > ...
> >   LOAD           0x0000000000002000 0x0000000000400000 0x0000000000000000
> >                  0x0000000000001000 0x0000000000001000  R      0x1000
> >   LOAD           0x0000000000003000 0x0000000000401000 0x0000000000000000
> >                  0x0000000000000000 0x0000000000001000  R E    0x1000
> >   LOAD           0x0000000000003000 0x0000000000402000 0x0000000000000000
> >                  0x0000000000000000 0x0000000000001000  R      0x1000
> >
> > The thing to observe here is that, on F29, the code sections (.init
> > through .fini) and .rodata end up in separate headers, each with a
> > FileSiz of 0.  On F28, a single header describes .init, .plt, .text,
> > .fini, and .rodata.
> >  
> 
> It has nothing to do with kernel.   It is a linker feature:
> 
>      'separate-code'
>      'noseparate-code'
>           Create separate code 'PT_LOAD' segment header in the object.
>           This specifies a memory segment that should contain only
>           instructions and must be in wholly disjoint pages from any
>           other data.  Don't create separate code 'PT_LOAD' segment if
>           'noseparate-code' is used.
> 
> [hjl@gnu-cfl-2 ~]$ ld --help | grep separate
>   -z separate-code            Create separate code program header (default)
>   -z noseparate-code          Don't create separate code program header
> [hjl@gnu-cfl-2 ~]$

Thanks for this info.  I had been wondering what changed between
binutils 2.29 and 2.31 to cause the difference in behavior described
in my earlier post.  With the info you provided above, I found it in
bfd/ChangeLog-2018:

2018-02-27  H.J. Lu  <hongjiu.lu@intel.com>

	* config.in: Regenerated.
	* configure: Likewise.
	* configure.ac: Add --enable-separate-code.
	(DEFAULT_LD_Z_SEPARATE_CODE): New AC_DEFINE_UNQUOTED.  Default
	to 1 for Linux/x86 targets,
	* elf64-x86-64.c (ELF_MAXPAGESIZE): Set to 0x1000 if
	DEFAULT_LD_Z_SEPARATE_CODE is 1.

Also, regarding version numbers, I see this:

2018-06-24  Nick Clifton  <nickc@redhat.com>

	2.31 branch created.

So 2.31 would have been the first release in which -z separate-code
was the default for Linux/x86.  As noted earlier, F28 used version
2.29.1 while F29 used version 2.31.1.
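
For anyone who wants to check a core file for this without eyeballing
readelf output, here is a rough sketch (not part of the patch series)
that lists the PT_LOAD headers whose FileSiz is 0 but whose MemSiz is
not.  Those are the ranges whose contents are absent from the core
file.  It assumes a little-endian ELF64 image and does not handle the
extended program header count (e_phnum of 0xffff) that very large core
files use.  The names load_segments and segments_without_contents are
made up for illustration.

```python
import struct
from typing import List, NamedTuple

PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4


class LoadSegment(NamedTuple):
    vaddr: int
    filesz: int
    memsz: int
    flags: str  # readelf-style, e.g. "R E"


def load_segments(data: bytes) -> List[LoadSegment]:
    """Return the PT_LOAD program headers of a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    # In the ELF64 header, e_phoff lives at byte 32, and e_phentsize and
    # e_phnum at bytes 54 and 56.
    (phoff,) = struct.unpack_from("<Q", data, 32)
    phentsize, phnum = struct.unpack_from("<HH", data, 54)
    segments = []
    for i in range(phnum):
        # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
        # p_filesz, p_memsz, p_align.
        p_type, p_flags, _off, vaddr, _paddr, filesz, memsz, _align = (
            struct.unpack_from("<IIQQQQQQ", data, phoff + i * phentsize)
        )
        if p_type != PT_LOAD:
            continue
        flags = "".join(
            ch if p_flags & bit else " "
            for ch, bit in (("R", PF_R), ("W", PF_W), ("E", PF_X))
        )
        segments.append(LoadSegment(vaddr, filesz, memsz, flags))
    return segments


def segments_without_contents(data: bytes) -> List[LoadSegment]:
    """PT_LOAD segments that occupy memory but have no bytes in the file."""
    return [s for s in load_segments(data) if s.filesz == 0 and s.memsz > 0]
```

Going by the program headers quoted above, running
segments_without_contents over the bytes of the F29 core file should
report at least the segments at 0x401000 and 0x402000, while the F28
segment at 0x400000 would not be reported since its FileSiz is 0x1000.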

Kevin