Bug 28873 - Implement eu-readelf -D
Summary: Implement eu-readelf -D
Status: RESOLVED FIXED
Alias: None
Product: elfutils
Classification: Unclassified
Component: tools
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Di Chen
URL:
Keywords:
Depends on: 28928
Blocks:
Reported: 2022-02-08 14:41 UTC by Di Chen
Modified: 2023-05-19 15:33 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
readelf: display dynamic symtab without section headers (3.20 KB, patch)
2023-01-16 13:30 UTC, Di Chen

Description Di Chen 2022-02-08 14:41:02 UTC
Currently, eu-readelf relies on section headers to find and print symbol
information: readelf.c:handle_symtab() calls gelf_getshdr() to fetch the section
headers and obtain details such as section size and type.
 
This task will add new options to eu-readelf (-D and --use-dynamic).
 
The goal is to print the symbols by looking them up through the program headers
and the dynamic table instead: PT_DYNAMIC->DT_* for DT_{GNU_,}HASH, DT_SYMTAB,
DT_STRSZ, DT_VERNEED, DT_VERSYM (for version information), etc.
 
previous discussion: https://bugzilla.redhat.com/show_bug.cgi?id=444621
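
A minimal sketch of the lookup described above, assuming the PT_DYNAMIC contents have already been read into an array of 64-bit entries. The `dyn_entry` and `dyn_info` types and the `collect_dyn_info` helper are hypothetical names used for illustration only; they are not elfutils code. The tag values are the standard gABI/GNU ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Elf64_Dyn so the sketch does not depend on <elf.h>.  */
typedef struct
{
  int64_t d_tag;
  uint64_t d_val;		/* Also used for pointer-valued tags.  */
} dyn_entry;

/* Dynamic tag values (gABI plus GNU extensions).  */
enum
{
  TAG_NULL = 0, TAG_HASH = 4, TAG_STRTAB = 5, TAG_SYMTAB = 6,
  TAG_STRSZ = 10, TAG_SYMENT = 11,
  TAG_GNU_HASH = 0x6ffffef5, TAG_VERSYM = 0x6ffffff0,
  TAG_VERNEED = 0x6ffffffe
};

/* Hypothetical summary of the tags needed to print dynamic symbols.
   A field left at 0 means the tag was not present.  */
typedef struct
{
  uint64_t hash, gnu_hash, strtab, symtab, strsz, syment, versym, verneed;
} dyn_info;

/* Walk at most N entries, stopping at the first DT_NULL.  */
dyn_info
collect_dyn_info (const dyn_entry *dyn, size_t n)
{
  dyn_info info = { 0 };
  assert (dyn != NULL || n == 0);

  for (size_t i = 0; i < n && dyn[i].d_tag != TAG_NULL; i++)
    switch (dyn[i].d_tag)
      {
      case TAG_HASH: info.hash = dyn[i].d_val; break;
      case TAG_GNU_HASH: info.gnu_hash = dyn[i].d_val; break;
      case TAG_STRTAB: info.strtab = dyn[i].d_val; break;
      case TAG_SYMTAB: info.symtab = dyn[i].d_val; break;
      case TAG_STRSZ: info.strsz = dyn[i].d_val; break;
      case TAG_SYMENT: info.syment = dyn[i].d_val; break;
      case TAG_VERSYM: info.versym = dyn[i].d_val; break;
      case TAG_VERNEED: info.verneed = dyn[i].d_val; break;
      default: break;		/* Tags not needed for symbol printing.  */
      }
  return info;
}
```

Note that the values collected this way are addresses, so a real implementation would still have to translate them to file offsets via the PT_LOAD segments before reading the tables.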
Comment 1 Di Chen 2022-03-31 13:07:48 UTC
This bug depends on Bug 28928 because when calling
`$ eu-readelf -d --use-dynamic {FILE}`
eu-readelf needs the number of dynamic section entries, which involves Bug 28928.
Comment 2 Di Chen 2022-08-08 09:33:45 UTC
commit 369c021c6eedae3665c1dbbaa4fc43afbbb698f4
Author: Di Chen <dichen@redhat.com>
Date:   Thu Apr 28 19:55:33 2022 +0800

    readelf: Support --dynamic with --use-dynamic
    
    Currently, eu-readelf is using section headers to dump the dynamic
    segment information (print_dynamic -> handle_dynamic).
    
    This patch adds new options to eu-readelf (-D, --use-dynamic)
    for (-d, --dynamic).
    
    https://sourceware.org/bugzilla/show_bug.cgi?id=28873
    
    Signed-off-by: Di Chen <dichen@redhat.com>
Comment 3 Di Chen 2022-08-08 09:38:44 UTC
Finished the first part of the task: eu-readelf can now dump the dynamic segment information using the dynamic program header, e.g.

```
$ ./src/readelf -Dd ~/test/eu-readelf-no-shdr 

Dynamic segment contains 26 entries:
 Addr: 0x0000000000474e00  Offset: 0x073e00
  Type              Value
  NEEDED            Shared library: [libdw.so.1]
  NEEDED            Shared library: [libelf.so.1]
  NEEDED            Shared library: [libc.so.6]
  INIT              0x0000000000404000
  FINI              0x000000000043d4a8
  INIT_ARRAY        0x0000000000474df0
  INIT_ARRAYSZ      8 (bytes)
  FINI_ARRAY        0x0000000000474df8
  FINI_ARRAYSZ      8 (bytes)
  GNU_HASH          0x00000000004003a0
  STRTAB            0x00000000004016b0
  SYMTAB            0x00000000004003f0
  STRSZ             3086 (bytes)
  SYMENT            24 (bytes)
  DEBUG             
  PLTGOT            0x0000000000475000
  PLTRELSZ          4560 (bytes)
  PLTREL            RELA
  JMPREL            0x00000000004026a0
  RELA              0x0000000000402610
  RELASZ            144 (bytes)
  RELAENT           24 (bytes)
  VERNEED           0x0000000000402450
  VERNEEDNUM        3
  VERSYM            0x00000000004022be
  NULL              

```
Comment 4 Di Chen 2022-08-08 09:42:08 UTC
Update:

I am working on the second part of the task: making eu-readelf dump symbol information via the dynamic program header, e.g. $ eu-readelf -Ds {FILE}
Comment 5 Di Chen 2023-01-16 13:30:16 UTC
Created attachment 14600 [details]
readelf: display dynamic symtab without section headers

```
# failed to print symtab of a binary which has section headers removed.
$ ./src/readelf -s ~/test/a.out

# it works with "-D" for a binary which has section headers removed.
$ ./src/readelf -Ds ~/test/a.out 
    0: 0000000000000000      0 NOTYPE  LOCAL  DEFAULT    UNDEF 
    1: 0000000000000000      0 FUNC    GLOBAL DEFAULT    UNDEF __libc_start_main@GLIBC_2.34 (2)
    2: 0000000000000000      0 NOTYPE  WEAK   DEFAULT    UNDEF __gmon_start__
```

It works well for binaries in which the STRTAB section comes right after the SYMTAB section.
```
$ readelf -Dd ~/test/a.out 

Dynamic section at offset 0x2e60 contains 20 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x401000
 0x000000000000000d (FINI)               0x401124
 0x0000000000000019 (INIT_ARRAY)         0x403e50
 0x000000000000001b (INIT_ARRAYSZ)       8 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x403e58
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x000000006ffffef5 (GNU_HASH)           0x4003c0
 0x0000000000000005 (STRTAB)             0x400428
 0x0000000000000006 (SYMTAB)             0x4003e0
 0x000000000000000a (STRSZ)              55 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000015 (DEBUG)              0x0
 0x0000000000000007 (RELA)               0x400488
 0x0000000000000008 (RELASZ)             48 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffe (VERNEED)            0x400468
 0x000000006fffffff (VERNEEDNUM)         1
 0x000000006ffffff0 (VERSYM)             0x400460
 0x0000000000000000 (NULL)               0x0
```
This is because I use the offset difference between SYMTAB and STRTAB to compute the number of symbol table entries.
```
  size_t syments = ((offs[i_strtab] - offs[i_symtab]) /
    gelf_fsize(ebl->elf, ELF_T_SYM, 1, EV_CURRENT));
```
Comment 6 Di Chen 2023-01-16 13:39:37 UTC
[Follow the last comment]

For a binary whose SYMTAB and STRTAB are in a different order (e.g. reversed) or have another section in between, e.g.

```
$ readelf -Dd /usr/local/go/bin/go

Dynamic section at offset 0x9e5220 contains 19 entries:
  Tag        Type                         Name/Value
 0x0000000000000004 (HASH)               0xb3c3c0
 0x0000000000000006 (SYMTAB)             0xb3c880
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000005 (STRTAB)             0xb3c660
 0x000000000000000a (STRSZ)              531 (bytes)
 0x0000000000000007 (RELA)               0xb3bfd8
 0x0000000000000008 (RELASZ)             24 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x0000000000000003 (PLTGOT)             0xde5100
 0x0000000000000015 (DEBUG)              0x0
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000006ffffffe (VERNEED)            0xb3c360
 0x000000006fffffff (VERNEEDNUM)         2
 0x000000006ffffff0 (VERSYM)             0xb3c300
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000002 (PLTRELSZ)           768 (bytes)
 0x0000000000000017 (JMPREL)             0xb3bff0
 0x0000000000000000 (NULL)               0x0

```

This breaks the syments (number of symbol table entries) calculation.
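
To make the failure concrete, here is a small worked sketch using the SYMTAB/STRTAB values from the two dumps above. It uses the addresses shown by readelf and assumes both tables live in the same segment, so that address differences equal offset differences. The helper name `syments_from_gap` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Entry count implied by the "STRTAB directly follows SYMTAB"
   assumption, or -1 when STRTAB lies before SYMTAB and the unsigned
   subtraction would wrap around.  */
int64_t
syments_from_gap (uint64_t symtab, uint64_t strtab, uint64_t syment)
{
  assert (syment != 0);
  if (strtab < symtab)
    return -1;
  return (int64_t) ((strtab - symtab) / syment);
}
```

For a.out, `syments_from_gap (0x4003e0, 0x400428, 24)` gives (0x48 / 24) = 3, which matches the three symbols printed. For the go binary, STRTAB (0xb3c660) is below SYMTAB (0xb3c880), so the helper returns -1; the unchecked `size_t` expression would instead yield a huge bogus count. Even when the order is right, any data placed between the two tables inflates the result.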
Comment 7 Aaron Merey 2023-02-03 18:24:40 UTC
(In reply to Di Chen from comment #5)
> Because I use offset difference between SYMTAB and STRTAB to get the symbol
> table entry number.
> ```
>   size_t syments = ((offs[i_strtab] - offs[i_symtab]) /
>     gelf_fsize(ebl->elf, ELF_T_SYM, 1, EV_CURRENT));
> ```
>
> For a binary with SYMTAB STRTAB having different order, like upside down, or
> having other section in between. eg.
> [...]
> It will mess up the syments (symbol table entry number) calculation.

This raises an interesting question: how do you calculate the number of symbols in .dynsym without using section headers?

I figured there'd be some kind of "DT_SYMTABNUM" value somewhere but unfortunately the answer doesn't appear to be so straightforward.

Judging from the binutils readelf source code you need to use information in the .hash and .gnu.hash sections to calculate the number of entries. 

To complicate things even more, a binary can contain either .hash or .gnu.hash or both and computing the number of .dynsym entries is different in each case. See binutils/readelf.c:get_num_dynamic_syms, you may need to implement some of this in your patch.
Comment 8 Mark Wielaard 2023-02-04 17:02:58 UTC
(In reply to Aaron Merey from comment #7) 
> This raises an interesting question: how do you calculate the number of
> symbols in .dynsym without using section headers?
> 
> I figured there'd be some kind of "DT_SYMTABNUM" value somewhere but
> unfortunately the answer doesn't appear to be so straightforward.

It has been proposed, but not (yet) adopted:
https://groups.google.com/g/generic-abi/c/9L03yrxXPBc
(sorry, a Google Groups link; there should be a normal archive, but I cannot find it right now). If that were adopted and linkers generated it, then this question would indeed have a simple answer. Sadly, it isn't :{

> Judging from the binutils readelf source code you need to use information in
> the .hash and .gnu.hash sections to calculate the number of entries. 
> 
> To complicate things even more, a binary can contain either .hash or
> .gnu.hash or both and computing the number of .dynsym entries is different
> in each case. See binutils/readelf.c:get_num_dynamic_syms, you may need to
> implement some of this in your patch.

If there is a .hash section then it is fairly easy: the table starts with two words, nbucket and nchain, and nchain is the number of symbols the hash/symbol table describes.
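
A minimal sketch of the .hash case, assuming the DT_HASH table has already been mapped as an array of 32-bit words in host byte order (a few 64-bit targets use 64-bit hash entries, which this does not handle). In the SysV layout the table is `nbucket, nchain, bucket[nbucket], chain[nchain]`, and `nchain` equals the number of symbol table entries. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HASH points at the start of the DT_HASH table and NWORDS is the
   number of readable 32-bit words.  Returns the symbol count (nchain),
   or 0 if the table is truncated.  */
uint32_t
dynsym_count_from_sysv_hash (const uint32_t *hash, size_t nwords)
{
  assert (hash != NULL || nwords == 0);
  if (nwords < 2)
    return 0;

  uint32_t nbucket = hash[0];
  uint32_t nchain = hash[1];

  /* Sanity check: header + buckets + chains must fit in the table.  */
  if ((uint64_t) nbucket + nchain + 2 > nwords)
    return 0;
  return nchain;
}
```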

If it is a .gnu.hash section then sadly you have to parse and go through the whole hashtable and count.
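
A rough sketch of the .gnu.hash case, again assuming the table is available as 32-bit words in host byte order. It follows the commonly documented GNU hash layout (four header words, bloom filter, buckets, chain) and is not a copy of the elfutils or binutils code; the function name and the `is_elf64` parameter are hypothetical. The idea is to find the highest symbol index referenced by any bucket, then follow that chain until the entry with the low bit set, which marks the end of a chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TAB points at the start of the DT_GNU_HASH table, NWORDS is the
   number of readable 32-bit words.  Returns the number of dynamic
   symbols, or 0 if the table is truncated or malformed.  */
uint32_t
dynsym_count_from_gnu_hash (const uint32_t *tab, size_t nwords, int is_elf64)
{
  assert (tab != NULL || nwords == 0);
  if (nwords < 4)
    return 0;

  uint32_t nbuckets = tab[0];
  uint32_t symoffset = tab[1];	/* Index of the first hashed symbol.  */
  uint32_t bloom_words = tab[2];
  /* tab[3] is the bloom shift; it is not needed for counting.  */

  /* Bloom filter words are 64-bit for ELFCLASS64, 32-bit otherwise.  */
  uint64_t buckets_at = 4 + (uint64_t) bloom_words * (is_elf64 ? 2 : 1);
  uint64_t chain_at = buckets_at + nbuckets;
  if (chain_at > nwords)
    return 0;

  /* Highest symbol index that starts a chain.  */
  uint32_t max_sym = 0;
  for (uint32_t i = 0; i < nbuckets; i++)
    if (tab[buckets_at + i] > max_sym)
      max_sym = tab[buckets_at + i];

  /* No hashed symbols: only the unhashed ones before symoffset exist.  */
  if (max_sym < symoffset)
    return symoffset;

  /* Walk the last chain; the low bit marks its final entry.  */
  for (uint64_t i = chain_at + (max_sym - symoffset); i < nwords;
       i++, max_sym++)
    if (tab[i] & 1)
      return max_sym + 1;

  return 0;			/* Ran off the end of the table.  */
}
```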

There is an implementation already in elfutils, but it is a bit hidden and obscure if you don't know what you are looking for. Search for "Figure out the size of the symbol table" in libdwfl/dwfl_module_getdwarf.c.
Comment 9 Mark Wielaard 2023-04-18 19:51:31 UTC
commit 4d8de4b2fa05495d69d09e1a3d335f24d6bf33ee
Author: Di Chen <dichen@redhat.com>
Date:   Mon Mar 27 10:01:05 2023 +0800

    readelf: display dynamic symtab without section headers
    
    This commit adds a new option "-D/--use-dynamic" to support printing the
    dynamic symbol table from the PT_DYNAMIC segment. By using the
    PT_DYNAMIC segment, eu-readelf can go through the contents of dynamic
    section entries and the values of each tag. From that, we can get the
    address and size of the dynamic symbol table, the address of the string
    table, etc.
    
    By using the new option "-D/--use-dynamic", eu-readelf can list the
    symbols without section headers.
    
    Example:
      $ ./src/readelf -Ds a.out
          0: 0000000000000000      0 NOTYPE  LOCAL  DEFAULT    UNDEF
          1: 0000000000000000      0 FUNC    GLOBAL DEFAULT    UNDEF __libc_start_main@GLIBC_2.34 (2)
          2: 0000000000000000      0 NOTYPE  WEAK   DEFAULT    UNDEF __gmon_start__
    
    https://sourceware.org/bugzilla/show_bug.cgi?id=28873
    
    Signed-off-by: Di Chen <dichen@redhat.com>