Bug 25202 - objcopy --verilog-data-width doesn't respect target's endianness
Summary: objcopy --verilog-data-width doesn't respect target's endianness
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils (show other bugs)
Version: 2.33
: P2 normal
Target Milestone: ---
Assignee: Nick Clifton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-11-18 11:21 UTC by Olof Kindgren
Modified: 2024-02-27 22:25 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2022-11-03 00:00:00


Attachments
Proposed patch (2.92 KB, patch)
2022-11-03 13:50 UTC, Nick Clifton
Details | Diff
Proposed patch (3.03 KB, patch)
2022-11-03 16:24 UTC, Nick Clifton
Details | Diff
Proposed patch (1.75 KB, patch)
2022-11-08 11:41 UTC, Nick Clifton
Details | Diff
Proposed patch (2.05 KB, patch)
2022-11-21 12:13 UTC, Nick Clifton
Details | Diff
Proposed patch (2.29 KB, patch)
2022-11-30 11:25 UTC, Nick Clifton
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Olof Kindgren 2019-11-18 11:21:18 UTC
I just tried binutils 2.33.1 compiled for riscv32-unknown-elf to test https://sourceware.org/bugzilla/show_bug.cgi?id=19921. Unfortunately it doesn't match what I was expecting to see.

Problem is that it will always use big endian output since (I think) the bfd_target for verilog has BFD_ENDIAN_UNKNOWN. I was expecting it to use the endianness of the source architecture.

I got around it by setting target byte order of the verilog_vec bfd_target to BFD_ENDIAN_LITTLE but I don't have enough experience with binutils to see where the correct fix should be.
Comment 1 Henrik Brix Andersen 2020-08-03 19:52:38 UTC
I can confirm this bug. I too was trying out the --verilog-data-width option, here on arm-zephyr-eabi. This is a little endian target, but the Verilog memory hex dump produced is in big-endian format.
Comment 2 Alex Underwood 2021-10-21 16:45:32 UTC
Hello,

I can also confirm this bug (still present in 2.37.50), and because the verilog output functionality is relevant to my work I figured I would take a crack at fixing it.

As Olof said, the issue does seem to be because the verilog bfd_target uses BFD_ENDIAN_UNKNOWN, so the little-endianness check in the verilog_write_record function always fails and falls through to big-endian.  That being said, verilog_write_record does support big- and little-endian, it just never reaches the little-endian section.

Seeing that bfd_targets are constants and passing the input bfd's endianness to the verilog handler would be a huge code change for what only a single format needs, would the best way to fix this be to create 2 verilog bfd formats?  For example, a verilog-be and verilog-le format that use BFD_ENDIAN_BIG and BFD_ENDIAN_LITTLE respectively.  This would leave it up to the user to specify the correct endianness for their given input bfd but at the same time I feel would help to alleviate an issue I've seen in several RISC-V projects in particular that take advantage of this bug and use objcopy as a way of switching endianness rather than just compiling the original files with the correct endianness setting at the GCC level.

If that sounds like a reasonable approach, I'll get going on a patch to set that up.  Though, that does leave the question of what to do with the current "verilog" target if it ends up split into 2 separate targets.
Comment 3 Liwei.Ma 2022-01-28 04:25:56 UTC
Hi, Alex,

I also confirm this bug, while my version seems to be stuck at little endian. I am using ubuntu 20.04.3 package, aarch64-linux-gnu-objcopy (2.34), aarch64-linux-gnu-gcc (9.3.0).

And I would report another issue that, the Verilog hex address is for Verilog readmemh, not for OS byte based address. When the -verilog-width=N(byte), the address should be `address / N`. For example, when Verilog-width=8(byte), and the Verilog mem being `reg[63:0] mem[depth]`, the addresses in hex file should be divided by 8.

Another convenience issue is for objcopy add an option of `base address`. This should be implemented in readmemh, but it is more convenient here too. The current hex output uses the address in a whole space, while Verilog mem is usually small. It is convenient to subtract a base address when output Verilog hex file. For example, the object address is at 0x004000b0, and option `base address` is set 0x00400000, then the output hex file start at 0x00b0. A small Verilog memory can hold the object data.

I am willing to offer some help in the development given me some basic education of objectcopy and detailed tasks.

Best regards,
Liwei Ma.

IMPORTANT NOTICE: The contents of this email and any attachments may be privileged and confidential. If you are not the intended recipient, please delete the email immediately. It is strictly prohibited to disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. (c)Arm Technology (China) Co., Ltd copyright and reserve all rights. ????????????????????????????????????????????????????????????????????????????????????????????????????????????????? (c)???????????? ????????????
Comment 4 Gökçe Aydos 2022-11-02 15:07:53 UTC
I had the same problem like Olof while copying a little-riscv32 a verilog target:


```
$ riscv64-elf-objcopy my.elf -O verilog --verilog-data-width=4 my.hex
$ head -2 my.hex
@80000000
B7C0EDFE 9380D0EA 37E176FF 130161F5
```

I used a oneliner as a workaround:
```
sed -E 's/([^@].)(..)(..)(..)([ $])/\4\3\2\1\5/g' my.hex > my-le.hex
```

> Seeing that bfd_targets are constants and passing the input bfd's endianness to the verilog handler would be a huge code change for what only a single format needs, would the best way to fix this be to create 2 verilog bfd formats? ...

I do not have any experience of binutils source code, but I read in the manual:

> If the input format has an endianness (some formats do not), objcopy can only copy the inputs into file formats that have the same endianness

So if the output format cannot dynamically change endianness, then Alex's proposal makes sense.

Is someone working on this?
Comment 5 Nick Clifton 2022-11-03 13:38:55 UTC
(In reply to gökçe from comment #4)
 
> Is someone working on this?

Sorry - I managed to drop this PR from my queue.  I am working on a patch now.

Cheers
  Nick
Comment 6 Nick Clifton 2022-11-03 13:50:23 UTC
Created attachment 14432 [details]
Proposed patch

Please can somebody try out this patch ?

It *might* work, but all this endian-ness stuff is doing my head in.

The patch adds a new command line option to objcopy:

  --verilog-data-endianness=input

which tells the verilog converter to use the endian-ness of the input file, instead of the default endian-ness.  (Which is set to 'unknown' and by the logic in bfd/verilog.c:verilog_write_record() equates to 'big-endian').

The patch also adds --verilog-data-endianness=big and --verilog-data-endianness=little although I am not sure if these will actually be useful.
Comment 7 Nick Clifton 2022-11-03 16:24:27 UTC
Created attachment 14433 [details]
Proposed patch

Oops - forgot to include the new test.  Here is the updated patch.
Comment 8 Gökçe Aydos 2022-11-05 21:44:15 UTC
Thanks for the fast response :)

```
--- /dev/null	2022-11-03 08:18:45.269001160 +0000
+++ 	2022-11-03 13:29:21.965562457 +0000
```

I suppose the testname is missing Nick.
Comment 9 Gökçe Aydos 2022-11-06 08:17:37 UTC
I tested the patch (without the test). My remarks:

1) AFAIK memory addresses are byte addresses in gcc. If --verilog-data-width is greater than 1, then each word in the pattern file becomes something else than a byte. From the 2017 Systemverilog LRM:

> As the file is read, each number encountered is assigned to a successive word element of the memory.

This means for example:
```
$ riscv64-elf-objcopy my.elf -O verilog --verilog-data-width=4 my.hex --verilog-data-endianness=input
$ head -2 my.hex
@80000000
FEEDC0B7 EAD08093 FF76E137 F5610113
```

If I load these data using `$readmemh`, then `FEEDC0B7` is stored at address `8000_0000` (which is correct), but `EAD08093` is stored at `8000_0001` instead of `8000_0004`:

```
Disassembly of section .text.init:

80000000 <rvtest_entry_point>:
80000000:	feedc0b7          	lui	x1,0xfeedc
80000004:	ead08093          	addi	x1,x1,-339 # feedbead
```

So additional addressing lines must be generated if --verilog-data-width is used.

2) wording: endian-ness and endianness seem to be used interchangeably. Wikipedia uses `endianness`

3) I do not see any advantage of `--verilog-data-endianness={big,little}`. If the input was compiled for an architecture, I expect that the developer wants to use the output of objcopy for the same architecture (given that the chosen architecture works only with a single endianness).

Furthermore I understand that objcopy does not *want* to change the endianness. From the manual:

> Note---objcopy is not able to change the endianness of its input files.  If the input format has an endianness (some formats do not), objcopy can only copy the inputs into file formats that have the same endianness or which have no endianness (e.g., srec). (However, see the --reverse-bytes option.)
Comment 10 Nick Clifton 2022-11-08 11:40:32 UTC
(In reply to Gökçe Aydos from comment #9)
> I tested the patch (without the test). My remarks:

Thanks for testing it.
 
> 1) AFAIK memory addresses are byte addresses in gcc. If --verilog-data-width
> is greater than 1, then each word in the pattern file becomes something else
> than a byte. From the 2017 Systemverilog LRM:
> 
>> As the file is read, each number encountered is assigned to a successive word element of the memory.

In that sentence, what exactly is meant by "word" ?

> This means for example:
> ```
> $ riscv64-elf-objcopy my.elf -O verilog --verilog-data-width=4 my.hex
> --verilog-data-endianness=input
> $ head -2 my.hex
> @80000000
> FEEDC0B7 EAD08093 FF76E137 F5610113
> ```
> 
> If I load these data using `$readmemh`, then `FEEDC0B7` is stored at address
> `8000_0000` (which is correct), but `EAD08093` is stored at `8000_0001`
> instead of `8000_0004`:

But isn't this a problem with how you are using the my.hex file ?  If it has
been created with a data width of 4, then the expectation is that it will be
loaded using a mechanism that advances the address by 4 after each reading
each group of 4 bytes ?  So if you want to use `$readmemh` in the way you have
shown, then you *have* to use --verilog-data-width=1.  (Or just leave it
undefined, since a width of 1 is the default).

I admit that I am not an expert on the verilog file format, but that code in the
BFD library that creates the output has been there for a long time, and if widths
greater than 1 did not work, I would have expected to have seen bug reports about
it before now...


 
> 2) wording: endian-ness and endianness seem to be used interchangeably.
> Wikipedia uses `endianness`

Yeah - I was never sure of the correct spelling, but now that you have
pointed it out, I can correct the sources.


> 3) I do not see any advantage of `--verilog-data-endianness={big,little}`.
> If the input was compiled for an architecture, I expect that the developer
> wants to use the output of objcopy for the same architecture (given that the
> chosen architecture works only with a single endianness).

Yeah - I was unsure about them too.  But there are bi-endian architectures,
so maybe there is a need.  It does seem to be a stretch however.


> Furthermore I understand that objcopy does not *want* to change the
> endianness. From the manual:
> 
>> Note---objcopy is not able to change the endianness of its input files.  If the input format has an endianness (some formats do not), objcopy can only copy the inputs into file formats that have the same endianness or which have no endianness (e.g., srec). (However, see the --reverse-bytes option.)

That sentence is generally true, but I think that the verilog file format is a 
special case, like the srec format.  

But overall I think that I agree with you.  It would be better to simplify the
patch and remove the big/little endian options.  I will attach an updated patch.

Cheers
  Nick
Comment 11 Nick Clifton 2022-11-08 11:41:04 UTC
Created attachment 14438 [details]
Proposed patch

Updated patch
Comment 12 Gökçe Aydos 2022-11-08 22:10:52 UTC
(In reply to Nick Clifton from comment #10)

> >> As the file is read, each number encountered is assigned to a successive word element of the memory.
> 
> In that sentence, what exactly is meant by "word" ?

Simply an array of bits. In other words the memory can store data of arbitrary width at each address. You have the freedom.

> But isn't this a problem with how you are using the my.hex file ?  If it has
> been created with a data width of 4, then the expectation is that it will be
> loaded using a mechanism that advances the address by 4 after each reading
> each group of 4 bytes ?  So if you want to use `$readmemh` in the way you
> have
> shown, then you *have* to use --verilog-data-width=1.  (Or just leave it
> undefined, since a width of 1 is the default).

I thought that the objcopy output is for convenient data exchange between gcc and $readmemh. My hunch is: if the developer has to write a custom data importer function that can parse the '@address' lines, then they would rather write their own Verilog-objcopy with convenient features.

> I admit that I am not an expert on the verilog file format, but that code in
> the
> BFD library that creates the output has been there for a long time, and if
> widths
> greater than 1 did not work, I would have expected to have seen bug reports
> about
> it before now...

Yeah I thought about that too, when I was writing my remark :) Your point makes sense. I just started using the Verilog output, so take my comments with a grain of salt.
Comment 13 Olof Kindgren 2022-11-08 22:56:48 UTC
Great to see some activity on this! I haven't compiled and tested it myself yet, but I'm wondering if we got the addressing right. Nick, it's correct that we are dealing with word addresses, but are we taking that into consideration when we're calculating the base address?

I.e. compiling an asm program that starts with .org 0x100, will that cause the address to be set to @40 when using verilog-data-width=4 ?

Generally, I think you shouldn't trust VerilogDataWith part of the verilog output too much even if it has been there for a long time. I believe that practically no one has actually used this feature much because of this bug (and https://sourceware.org/bugzilla/show_bug.cgi?id=19921 before that) so it's likely not very well tested.

Liwei Ma, we can already change the base address like you asked for with --change-addresses (you can find an example here https://github.com/chipsalliance/Cores-SweRVolf/blob/master/sw/Makefile#L27)
Comment 14 Gökçe Aydos 2022-11-09 09:48:54 UTC
(In reply to Olof Kindgren from comment #13)
> ... we are dealing with word addresses, but are we taking that into
> consideration when we're calculating the base address?
> 
> I.e. compiling an asm program that starts with .org 0x100, will that cause
> the address to be set to @40 when using verilog-data-width=4 ?

Oh, I think that is the actual solution to the word addressing problem (compared to my previous comment). Just shift the @addresses n = log2(verilog-data-width) to the right, if they have n zeroes in the least significant bits. If not, then they should be padded with zeroes.

> I believe that practically no one has actually used this feature much because of this bug

I weakly remember that someone from the RISC-V community community hacked their own C program to convert the .elf to a desired Verilog input format.

Thanks for chiming in Olof!
Comment 15 Nick Clifton 2022-11-09 12:27:24 UTC
(In reply to Gökçe Aydos from comment #14)

>> I.e. compiling an asm program that starts with .org 0x100, will that cause
>> the address to be set to @40 when using verilog-data-width=4 ?
> 
> Oh, I think that is the actual solution to the word addressing problem
> (compared to my previous comment). Just shift the @addresses n =
> log2(verilog-data-width) to the right, if they have n zeroes in the least
> significant bits. If not, then they should be padded with zeroes.

OK, just to make sure that I understand this correctly...

  * If the start address is 0x100 and the data width is 1 then
    the address in the output file should be "@100".

  * If the start address is 0x111 and the data width is 1 then
    the address in the output file should be "@111".

  * If the start address is 0x100 and the data width is 4 then
    the address in the output file should be "@40".  (ie 0x100 / 4)

  * If the start address is 0x111 and the data width is 4 then
    the address in the output file should be "@44" and the first
    byte should be a padding byte of 00. 
   
Is that right ?
Comment 16 Olof Kindgren 2022-11-09 16:31:02 UTC
>> I believe that practically no one has actually used this feature much because of this bug
>
>I weakly remember that someone from the RISC-V community community hacked their own C program to convert the .elf to a desired Verilog input format.

I think there are about one million different ten-line utilities for converting bin files to verilog. I have probably at least one in each project I'm working on myself. Very much looking forward to not need those anymore. :)

>OK, just to make sure that I understand this correctly...
>
>  * If the start address is 0x100 and the data width is 1 then
>    the address in the output file should be "@100".
>
Correct!

>  * If the start address is 0x111 and the data width is 1 then
>    the address in the output file should be "@111".
>
Correct!

>  * If the start address is 0x100 and the data width is 4 then
>    the address in the output file should be "@40".  (ie 0x100 / 4)
>
Correct!

>  * If the start address is 0x111 and the data width is 4 then
>    the address in the output file should be "@44" and the first
>    byte should be a padding byte of 00. 
>
Correct!

>Is that right ?
4 out of 4! :)

Same applies to endings where we need to zero-pad. Now, there's a nasty corner case here. I'm not super familiar with sections and stuff, but imagine there's a section (segment?) ending at address 0x9 and then another one starting at address 0xb. With 4-byte words and padding the first section would be
@0
11223344 55667788 9900000

and the second one
@2
aabbccdd eeff....

meaning that aabbccdd will overwrite 99000000 (I think. Haven't read the verilog spec close enough tbh). Now, in practice I think this will happen very rarely, but I wonder if we should only allow sections to start at word boundraries. As a user I would prefer a limitation with a helpful error message rather than silently overwriting parts of my code. Probably also keeps down complexity of the code. From my earlier looks at the objcopy stuff, it's really built to work on byte-basis so we're already stretching it a bit.
Comment 17 Nick Clifton 2022-11-21 12:12:31 UTC
(Sorry for the delay in replying - I have been distracted by other work).

I think that we can avoid the alignment problem by insisting that for
widths greater than one, the section(s) being converted must already be 
aligned and padded to multiples of that width.

Hence, please could you try out the third iteration of the patch, which
should generate the correct address for widths > 1, and which will refuse
to convert unaligned sections.  I was not sure what to do about sections
which are not padded to the required alignment, so I have left the code
in its current form.  The code will just stop generating bytes when it
reaches the end of the section being converted and it will not generate
any padding bytes.
Comment 18 Nick Clifton 2022-11-21 12:13:03 UTC
Created attachment 14470 [details]
Proposed patch
Comment 19 Gökçe Aydos 2022-11-24 11:20:14 UTC
Thanks 🎈

The address generation for data-width>1 looks fine in general. I found the following issues:

1) Output is wrong if data-width>16. Test cases:

data-width=1:

```
@80000000
B7 C0 ED FE 93 80 D0 EA 37 E1 76 FF 13 01 61 F5
B7 71 BB 7F 93 81 B1 FA 37 B2 DD BF 13 02 52 7D
```

data-width=16:

```
@08000000
F5610113FF76E137EAD08093FEEDC0B7
7D520213BFDDB237FAB181937FBB71B7
```
16 bytes correspond to 4 right shifts, so `08000000` is ok.

data-width=32

```
@04000000
F5610113FF76E137EAD08093FEEDC0B7
7D520213BFDDB237FAB181937FBB71B7
```

Address correct, but not the data :( This is an edge case, but we should handle that. Probably the same behavior for data-widths greater than 32.


2) Error message not helpful

```
$ ... --verilog-data-width=3 
riscv64-elf-objcopy: my.memh: invalid operation
```

The error message could be more informative by including that something is wrong with `--verilog-data-width`.

---

I could not test unaligned data because I could not figure out how to generate and elf file with unaligned words using gcc and ld :( I fuzzed with `.align 2` in assembler and `. = ALIGN(2)` but the output was always 32 bit aligned. If someone has can send an example assembler and linker code, I can happily test that.
Comment 20 Nick Clifton 2022-11-28 12:00:08 UTC
(In reply to Gökçe Aydos from comment #19)
Hi Gökçe,
 
> 1) Output is wrong if data-width>16. 

Does the verilog memory file format actually support widths greater than 16 ?
(I have never used it, so maybe it does).  And if it does, is it just theoretical, or is it actually used in real life ?

 
> data-width=32
> 
> ```
> @04000000
> F5610113FF76E137EAD08093FEEDC0B7
> 7D520213BFDDB237FAB181937FBB71B7

Yes - for reasons I have not been able to track down the verilog data is
handled inside the BFD library in 16-byte packets.  Unless it is really
necessary to support larger widths, I would prefer to leave the code as 
it is and just add a warning for larger values.


 
> 2) Error message not helpful

> $ ... --verilog-data-width=3 
> riscv64-elf-objcopy: my.memh: invalid operation

Fair enough - that should be a simple fix.

Cheers
  Nick
Comment 21 Olof Kindgren 2022-11-28 12:18:13 UTC
Thanks for keeping working on this. I'm afraid I haven't had any time to test it myself but it looks like we're on the right track. Yes, the data can be as wide as it needs to be. There's probably some upper limit in the LRM but for practical reasons I don't expect to hit that. I do see however that there could be a use for e.g. 256-bit (data width = 32) wide memories as internal buses can be that large and even 1024 bits and beyond on occasion.

I don't think that's a common case though, so I would personally be happy to print a warning for widths > 16 and if someone is enough motivated to fix that, they can do so separately. I'm reasonably sure that the vast majority will use 4 och 8 byte widths.
Comment 22 Gökçe Aydos 2022-11-30 07:50:06 UTC
I browsed LRM section 21.4 about `$readmemh`, but could not find any info about maximum width (for a memory line that is read by `$readmemh`). I think the range is limited by the maximum width of a vector (`logic`, `bit` etc). Section 6.9.1 states:

> Implementations may set a limit on the maximum length of a vector,
> but the limit shall be at least 65536 (2^16) bits.

I tried to read 1024 bits with Verilator and Vivado. Both tries succeeded. Practically I know that 1024 bit memories exist, e.g., [HBM](https://en.wikipedia.org/wiki/High_Bandwidth_Memory#HBM3). Nevertheless I concur what Olof said: people will typically use 32 to 128 bits and when people will start using objdump, then someone will fix the max 16 byte width.

FWIW here are the files I used for testing:

a.sv:
```
module a;
    
logic[1023:0] mem [int];
logic clk = 0;

initial begin
    $readmemh("a.memh", mem);
    foreach (mem[i])
        $display("mem[%0d] = %0d", i, mem[i]);
    $finish;
end

always #1 clk ^= 1;

endmodule
```

a.memh
```
01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde01234567890abcde
```
Comment 23 Nick Clifton 2022-11-30 11:25:33 UTC
Created attachment 14475 [details]
Proposed patch

Right, here is version 4 of the patch.  This is the same as version 3, except that now if you attempt to set the --verilog-data-width option to any value other than 1,2,4,8 or 16 you will get an informative error message and the objcopy will fail.
Comment 24 Sourceware Commits 2022-12-01 13:10:17 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=6ef35c04dffe685ece08212201c4c032baf8aa86

commit 6ef35c04dffe685ece08212201c4c032baf8aa86
Author: Nick Clifton <nickc@redhat.com>
Date:   Thu Dec 1 13:09:26 2022 +0000

    Fix verilog output when the width is > 1.
    
            PR 25202
    bfd     * bfd.c (VerilogDataEndianness): New variable.
            (verilog_write_record): Use VerilogDataEndianness, if set, to
            choose the endianness of the output.
            (verilog_write_section): Adjust the address by the data width.
    
    binutils* objcopy.c (copy_object): Set VerilogDataEndianness to the
            endianness of the input file.
            (copy_main): Verifiy the value set by the --verilog-data-width
            option.
            * testsuite/binutils-all/objcopy.exp: Add tests of the new behaviour.
            * testsuite/binutils-all/verilog-I4.hex: New file.
Comment 25 Alan Modra 2024-02-27 22:25:08 UTC
Fixed, presumably.