Bug 23699

Summary: ihex output fails for mipsel-elf-objcopy
Product: binutils Reporter: rhn <sowaac.rhn>
Component: binutilsAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: nickc
Priority: P2    
Version: 2.31   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: Linker script used to create the un-ihex-able elf file
Workaround checking for signedness
Detect and reject 64-bit addresses.

Description rhn 2018-09-22 12:58:53 UTC
Created attachment 11268 [details]
Linker script used to create the un-ihex-able elf file

Trying to convert elf to ihex files fails:

$ mipsel-elf-gcc -nostdlib -T./link.ld ./main.c -o out.elf
$ mipsel-elf-objcopy -O ihex out.elf out.hex
mipsel-elf-objcopy: out.hex: address 0xffffffff9d005a00 out of range for Intel Hex file
mipsel-elf-objcopy:out.hex: bad value

The reason is that ihex only supports 32-bit values, and the elf32-littlempis bfd sign-extends vma addresses, causing them to go out of range.

The behaviour can be tested with the source file and attached linker script:

```
int main()
{
    return 0;
}
```
Comment 1 rhn 2018-09-22 13:10:16 UTC
Created attachment 11269 [details]
Workaround checking for signedness

This is the best workaround I could come up with (based on Pinguino.cc own workaround).

It has a flaw that by excluding the range 0x7fffffff..0xffffffff, it will break non-sign-extended (source) targets. a better solution would let the ihex printer know whether the original addresses were sign-extended.

The bfd_target struct could be extended, to carry some signedness about the input bfd.

Or, the ihex target could get a backend_data structure (currently null), copying the input backend.

Or, the ihex backend could get a special treatment somewhere inside copy_file. It's likely to be many places, and it likely would have to indicate that it needs such a treatment in a special way.
Comment 2 Nick Clifton 2018-10-01 16:33:03 UTC
Hi rhn,

(In reply to rhn from comment #1)
> Created attachment 11269 [details]
> Workaround checking for signedness

[Sorry I have not had time to look at this, but...]

> The bfd_target struct could be extended, to carry some signedness about the
> input bfd.


If the bfd_target's flavour is bfd_target_elf_flavour then you can
cast bfd target's backend field to a struct elf_backend_data pointer
(or use the xvec_get_elf_backend_data macro).  Then you can access the
sign_extend_vma field, which will be set for targets like the MIPS which
use signed address ranges.

Cheers
  Nick
Comment 3 rhn 2018-10-01 19:32:53 UTC
Hi Nick,

the target bfd is actually bfd_target_ihex_flavor, which is at the core of the issue.

One of the solutions I was thinking about was to create an ihex backend with only the signedness information in it. That would still require ihex-specific changes in the copy procedure somewhere though. I couldn't easily identify where those would be needed, but I'm happy to dig deeper if you think it's the way forward.

Thanks for taking a look,
rhn
Comment 4 Nick Clifton 2018-10-02 14:51:18 UTC
Hi rhn,

(In reply to rhn from comment #3)

> the target bfd is actually bfd_target_ihex_flavor, which is at the core of
> the issue.

Presumably this is the output bfd.  Do you not have access to 
the input bfd as well ?  For example, could you put a check in
ihex_set_section_contents() ?  (Following the asection->owner 
pointer).  Then you can make sure that the content conversion
is sign-extended as appropriate...

Cheers
  Nick
Comment 5 rhn 2018-10-02 20:27:21 UTC
Yes, the ihex bfd is the output bfd. Thank you for pointing me the right direction with ihex_set_section_contents().

It turns out that the last time the input section/bfd may be access is in copy_section(), when bfd_set_section_contents() is called. This, in turn, calls ihex_set_section_contents(), where no access to the input bfd is possible.

Seeing that the signature for both _set_section_contents() function contains input bfd data:

bfd_boolean
bfd_set_section_contents (bfd *abfd,
			  sec_ptr section,
			  const void *location,
			  file_ptr offset,
			  bfd_size_type count)

, then I think adding a new parameter `bool sign_extend_vma` might do the job, if inelegantly. The address would be set correctly inside ihex_set_section_contents() in this scenario.

What do you think about that?

Cheers,
rhn
Comment 6 Nick Clifton 2018-10-03 11:04:00 UTC
Hi rhn,

(In reply to rhn from comment #5)

> bfd_set_section_contents (bfd *abfd,
> 			  sec_ptr section,
> 			  const void *location,
> 			  file_ptr offset,
> 			  bfd_size_type count)
> 
> , then I think adding a new parameter `bool sign_extend_vma` might do the
> job, if inelegantly. The address would be set correctly inside
> ihex_set_section_contents() in this scenario.
> 
> What do you think about that?

Not a good idea IMHO, because it will mean changing a lot of code in order
to fix a very minor bug.  (I apologise if you do not consider this bug to
be minor, but from the perspective of all of the architectures and file 
formats supported by the BFD library, mips+ihex is a rather obscure 
combination).

I am beginning to think that the original patch might actually be the best
approach.  Perhaps with a small modification so that it only checks and
masks out the top 32-bits of the address.  Ie it ignores the sign bit in
bit-31 and clips all addresses to 32-bits, issuing a warning only if the
upper 32 bits are not identical.

Cheers
  Nick
Comment 7 rhn 2018-10-03 13:27:44 UTC
Created attachment 11288 [details]
Detect and reject 64-bit addresses.

Hi Nick,

I do not have emotional attachment to bugs, no need to apologise. I trust that you know whether this one highlights a bigger issue in the architecture or can be fixed locally.

Here's a new version of the patch. It does two things to determine whether the address was 64-bit:
- checks whether top 32 bits are the same
- removes addresses starting with top 32 bit set but bit 31 unset, as they couldn't have come from valid 32-bit addresses: sign extension would set the top to 0s, no sign extension they would have to be 0s already

This strategy will leave false negatives, since both 0x00000000ffffffff and 0xffffffffffffffff are always valid addresses now.

Since seeing impossible addresses means that 64-bit code has been provided, I decided to keep the error instead of the warning.

There's another possible way to tackle this. The code could match against bfd->arch_info.arch, and figure out sign extension behaviour from that somehow? Or even embed that flag in bfd_arch_info?
Comment 8 rhn 2018-10-23 16:18:20 UTC
Hi,

I just wanted to check if there's anything else I can do to help fix this bug.
Comment 9 Nick Clifton 2018-10-24 07:55:33 UTC
Hi,

> I just wanted to check if there's anything else I can do to help fix this bug.

No - just keep pinging me.  I am actually on vacation at the moment,
but I will keep this email in my queue and get back tot he problem once I
return to work.

Cheers
  Nick
Comment 10 Sourceware Commits 2018-11-09 14:10:55 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d2eb0fb5a04e2e2ad374f56c7107f2421cdd93c9

commit d2eb0fb5a04e2e2ad374f56c7107f2421cdd93c9
Author: rhn <sowaac.rhn@porcupinefactory.org>
Date:   Fri Nov 9 14:09:44 2018 +0000

    Stop corruption of ihex output shen addresses are sign extended.
    
    	PR 23699
    	* ihex.c (ihex_write_object_contents): Check for sign extended
    	addresses that cannot be supported in the ihex format.
Comment 11 Nick Clifton 2018-11-09 14:14:08 UTC
Hi rhn,

  Right - sorry for the delay.  I have now checked in your patch from comment
  #7, which I think is probably the best solution that we will get for this
  problem.  Thanks very much for persisting with this issue.

Cheers
  Nick
Comment 12 rhn 2018-11-09 21:02:19 UTC
Thank you for letting this change go through! I can't wait to test out a new release :)
Comment 13 Sourceware Commits 2019-01-08 12:02:07 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a9859e01726d085db79cff88550fdb38e2434e42

commit a9859e01726d085db79cff88550fdb38e2434e42
Author: Alan Modra <amodra@gmail.com>
Date:   Tue Jan 8 22:21:57 2019 +1030

    PR24065, 32-bit objcopy fails with 64-bit address ... out of range
    
    	PR 23699
    	PR 24065
    	* ihex.c (ihex_write_object_contents): Properly check 32-bit
    	address range.