Bug 23595 - "simple objcopy of executable" failure for msp430-elf with -mlarge
Summary: "simple objcopy of executable" failure for msp430-elf with -mlarge
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils (show other bugs)
Version: 2.31
: P2 normal
Target Milestone: 2.32
Assignee: Alan Modra
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-08-30 13:20 UTC by Jozef Lawrynowicz
Modified: 2018-09-04 12:37 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Last reconfirmed: 2018-09-03 00:00:00


Attachments
testprog (48.22 KB, application/x-executable)
2018-08-30 13:20 UTC, Jozef Lawrynowicz
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jozef Lawrynowicz 2018-08-30 13:20:48 UTC
Created attachment 11217 [details]
testprog

The attached "testprog" was built by the testsuite. "copyprog" was generated by
"objcopy testprog copyprog".
These commands can be used to replicate how the testsuite builds the program:

msp430-elf-gcc -mlarge -c -msim -o testglue.o /usr/share/dejagnu/testglue.c
msp430-elf-gcc -mlarge -c -msim -o testprog.o \
  binutils-gdb/binutils/testsuite/binutils-all/testprog.c
msp430-elf-gcc -mlarge testprog.o testglue.o -Wl,-wrap,exit -Wl,-wrap,_exit \
  -Wl,-wrap,main -Wl,-wrap,abort -msim -Lbinutils-gdb/build-msp430/ld -g -lm -o testprog
msp430-elf-objcopy testprog copyprog

Trimmed readelf -lS output is below

testprog
--
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 2] .rodata           PROGBITS        00002000 000134 000090 00   A  0   0  2
  [ 7] .bss              NOBITS          0000063e 000136 00002c 00  WA  0   0  2
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000508 0x00000508 0x00134 0x00162 RW  0x4
  LOAD           0x000134 0x00002000 0x00002000 0x00090 0x00090 R   0x4

copyprog
--
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 2] .rodata           PROGBITS        00002000 000138 000090 00   A  0   0  2
  [ 7] .bss              NOBITS          0000063e 000136 00002c 00  WA  0   0  2
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000508 0x00000508 0x00136 0x00162 RW  0x4
  LOAD           0x000138 0x00002000 0x00002000 0x00090 0x00090 R   0x4

testprog and copyprog
---
 Section to Segment mapping:
  Segment Sections...
   00     .bss
   01     .rodata

Incorrect values in copyprog:
- File offset for .rodata (0x134 -> 0x138)
- File size of the first segment (0x134 -> 0x136)
- File offset of second segment (0x134 -> 0x138)

Since the first segment contains the program headers and the SHT_NOBITS section
".bss" only, it contains no loadable sections. This means the segment is marked
as "no_contents", (see elf.c:5450) and any increase of the file offset (to make
the segment obey the gABI requirement for VMA page alignemnt), is then taken
away from the file offset counter (off), which is used to place subsequent
sections/segments.
In this case, the offset of .bss is adjusted from 0x134 to 0x136 in the linker,
then the offset counter is decremented back to 0x134 (because the segment has
no_contents), so the next section (in the next segment), .rodata, is placed at
0x134.
When objcopy comes to assign_file_positions_for_load_sections, it thinks the
size of the program headers (m->header_size) is 0x136, so .bss will be
considered for placement at offset 0x136, not 0x134 as we get in the linker.
It can be placed at 0x136 without any off_adjust value required, so .rodata is
considered for placement at 0x136.

The header size is originally set in elf.c:7351 (copy_elf_program_header).
  map->header_size = lowest_section->vma - segment->p_vaddr

When segment->p_vaddr is finalized (in the linker, bfd/elf.c:5506), it uses the
file offset counter, which includes the added alignment (off_adjust)
required for .bss.
In this case, the above calculation is essentially
  map->header_size = lowest_section->vma - (lowest_section->vma - bss->sh_offset)
  map->header_size = bss->sh_offset

So this calculation is innacurate, as it assumes the lowest_section VMA is
immediately after the program headers, when in fact the lowest_section may have
been aligned, and so does not start immediately after the headers.

If the first segment containing the program headers does have contents, then this bug
doesn't occur. Even if the header_size is technically innacurate, it doesn't
matter because there is no off_adjust value that the linker used to change the
offset of the second segment, unbeknownst to objcopy, so it is fine to use this
header_size value as the starting point for the file offset of sections.
Comment 1 Jozef Lawrynowicz 2018-08-30 13:23:52 UTC
Proposed patch, lightly tested without regressions in the binutils, gas, ld testsuites for msp430-elf/-mlarge and x86_64-pc-linux-gnu.

diff --git a/bfd/elf.c b/bfd/elf.c
index b8860c4..14612bd 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -7235,6 +7235,7 @@ copy_elf_program_header (bfd *ibfd, bfd *obfd)
       Elf_Internal_Shdr *this_hdr;
       asection *first_section = NULL;
       asection *lowest_section;
+      bfd_boolean no_contents = TRUE;
 
       /* Compute how many sections are in this segment.  */
       for (section = ibfd->sections, section_count = 0;
@@ -7246,6 +7247,8 @@ copy_elf_program_header (bfd *ibfd, bfd *obfd)
            {
              if (first_section == NULL)
                first_section = section;
+             if (elf_section_type (section) != SHT_NOBITS)
+               no_contents = FALSE;
              section_count++;
            }
        }
@@ -7342,8 +7345,15 @@ copy_elf_program_header (bfd *ibfd, bfd *obfd)
        }
 
       if (map->includes_filehdr && lowest_section != NULL)
-       /* We need to keep the space used by the headers fixed.  */
-       map->header_size = lowest_section->vma - segment->p_vaddr;
+       {
+         /* We need to keep the space used by the headers fixed.
+            If no_contents is true, get a more accurate value for header_size
+            by checking the p_filesz of the segment.  */
+         if (no_contents)
+           map->header_size = segment->p_filesz;
+         else
+           map->header_size = lowest_section->vma - segment->p_vaddr;
+       }
 
       if (!map->includes_phdrs
          && !map->includes_filehdr
Comment 2 Alan Modra 2018-09-03 03:33:03 UTC
Patch looks good.  I'm going to apply it with a change to use file offset rather than vma here, since you have dealt with the nobits case.
Comment 3 Sourceware Commits 2018-09-03 08:36:01 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=2542e49e21e6b7270f72c7be0fc4ff1a986371da

commit 2542e49e21e6b7270f72c7be0fc4ff1a986371da
Author: Jozef Lawrynowicz <jozef.l@mittosystems.com>
Date:   Mon Sep 3 11:04:05 2018 +0930

    PR23595, simple objcopy of executable failure for msp430-elf
    
    VMA of the first section in the segment containing the ELF file header
    (and possibly section headers too) can't be used to reliably find the
    size of the headers plus padding.  What's really needed is sh_offset
    of the first section assuming it has contents (vma does have a
    relationship to sh_offset, but is only guaranteed in demand paged
    executables).
    
    If the first section is SHT_NOBITS and it hasn't been converted to
    have file contents by the existence of a following SHT_PROGBITS
    section in the same segment, the sh_offset value also isn't reliable.
    
    	PR 23595
    	elf.c (copy_elf_program_header): When first segment contains
    	only the headers and SHT_NOBITS sections, use segment p_filesz
    	to calculate header and padding size.  Use filepos of the first
    	section otherwise.
Comment 4 Alan Modra 2018-09-03 08:53:59 UTC
Patch applied
Comment 5 Jozef Lawrynowicz 2018-09-04 12:37:26 UTC
Thanks!