Bug 27217 - aarch64 as Internal error in md_apply_fix at ....../gas/config/tc-aarch64.c:8330.
Summary: aarch64 as Internal error in md_apply_fix at ....../gas/config/tc-aarch64.c:8...
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gas (show other bugs)
Version: 2.35
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-20 16:06 UTC by Joel Sherrill
Modified: 2024-01-10 12:19 UTC (History)
5 users (show)

See Also:
Host:
Target: aarch64-rtems aarch64-elf
Build:
Last reconfirmed: 2021-03-29 00:00:00


Attachments
Proposed patch (418 bytes, patch)
2021-03-26 15:56 UTC, Nick Clifton
Details | Diff
Proposed patch (595 bytes, patch)
2021-03-29 14:29 UTC, Nick Clifton
Details | Diff
Proposed patch (4.65 KB, patch)
2021-04-01 11:32 UTC, Nick Clifton
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Joel Sherrill 2021-01-20 16:06:33 UTC
The following C code compiles and assembles cleanly on the other 17 architectures supported by RTEMS. But when compiled at -O2, -O1, and -Os on aarch64, gcc produces assembly which gives an internal error. Examining the generated assembly on arm and aarch64, enabling optimization results in the .set statement being in the top part of the generated assembly versus the bottom part. This pattern of .set placement is consistent across at least arm and aarch64.

GCC's behavior related to the start and end of user assembly language comments also varies based on optimization level. This isn't critical but odd. This type of comment isn't generated for the arm. 

===========================
extern char bar[];
char * foo(void)
{
        return bar;
}
__asm__ (".set bar, 0");
===========================

$ /home/joel/rtems-work/tools/6/bin/aarch64-rtems6-gcc -c -O2 set.c
/tmp/ccyseDVm.s: Assembler messages:
/tmp/ccyseDVm.s: Internal error in md_apply_fix at ../../sourceware-mirror-binutils-gdb-8807d31/gas/config/tc-aarch64.c:8330.
Please report this bug.

This is the generated assembly so you do not need a full aarch64 toolchain.

===================================
	.arch armv8-a
	.file	"set.c"
	.text
	// Start of user assembly
	.set bar, 0
	// End of user assembly
	.align	2
	.p2align 4,,11
	.global	foo
	.type	foo, %function
foo:
	adrp	x0, bar
	add	x0, x0, :lo12:bar
	ret
	.size	foo, .-foo
	.ident	"GCC: (GNU) 10.2.1 20201030 (RTEMS 6, RSB 31dd1ab4424e59e48c60dfa587e805e908d13b02, Newlib 9517e5f)"

===================================

This is the assembly language at -O0 and it assembles successfully:

===================================


	.arch armv8-a
	.file	"set.c"
	.text
	.align	2
	.global	foo
	.type	foo, %function
foo:
	adrp	x0, bar
	add	x0, x0, :lo12:bar
	ret
	.size	foo, .-foo
	// Start of user assembly
	.set bar, 0
	.ident	"GCC: (GNU) 10.2.1 20201030 (RTEMS 6, RSB 31dd1ab4424e59e48c60dfa587e805e908d13b02, Newlib 9517e5f)"

===================================

I've confirmed this in the current master as well as an older version I had. Based on reports from other RTEMS developers, this has been broken a while. I personally confirmed it in these:

GNU assembler (GNU Binutils) 2.36.50.20210120
GNU assembler (GNU Binutils) 2.35.50.20201102
Comment 1 Alex Coplan 2021-03-12 14:47:15 UTC
Dup.

*** This bug has been marked as a duplicate of bug 26395 ***
Comment 2 Nick Clifton 2021-03-26 15:56:36 UTC
Created attachment 13329 [details]
Proposed patch

Hi Joel,

  Please could you try out this patch and let me know if it works for you ?

Cheers
  Nick
Comment 3 Joel Sherrill 2021-03-26 19:43:35 UTC
(In reply to Nick Clifton from comment #2)
> Created attachment 13329 [details]
> Proposed patch
> 
> Hi Joel,
> 
>   Please could you try out this patch and let me know if it works for you ?
> 
> Cheers
>   Nick

It changes the failure to one I've never see before. :)

/home/opticron/rtems-development/tools/lib/gcc/aarch64-rtems6/10.2.1/../../../../aarch64-rtems6/bin/ld: testsuites/sptests/spconfig01/init.c.568.o: in function `test_stack_config':
/home/opticron/rtems-development/rtems/build/aarch64/xilinx_zynqmp_lp64_qemu/../../../testsuites/sptests/spconfig01/init.c:95: undefined reference to `no symbol'
collect2: error: ld returned 1 exit status

This PR is marked as resolved/duplicate but is that right?
Comment 4 Nick Clifton 2021-03-29 14:29:21 UTC
(In reply to Joel Sherrill from comment #3)
Hi Joel,

> It changes the failure to one I've never see before. :)
> 
> /home/opticron/rtems-development/tools/lib/gcc/aarch64-rtems6/10.2.1/../../..
> /../aarch64-rtems6/bin/ld: testsuites/sptests/spconfig01/init.c.568.o: in
> function `test_stack_config':
> /home/opticron/rtems-development/rtems/build/aarch64/xilinx_zynqmp_lp64_qemu/
> ../../../testsuites/sptests/spconfig01/init.c:95: undefined reference to `no
> symbol'

Ah- that should not happen.   Your test case shows it too, although I only ever assembled it, I did not try to link it.  The reloc is for the ADRP instruction's reference to the zero'ed  "bar" symbol.

I am not sure of the best thing to do here.  I am going to upload a patch that stops the relocation from being generated, but that might not be the right thing to do.  Since the symbol has been, effectively, deleted, maybe generating a linker error is correct.

> This PR is marked as resolved/duplicate but is that right?

Nope - that was a snafu.  I am reopening the PR.

Cheers
  Nick
Comment 5 Nick Clifton 2021-03-29 14:29:41 UTC
Created attachment 13335 [details]
Proposed patch
Comment 6 Kinsey Moore 2021-03-29 22:03:00 UTC
I've tested the new patch and, while everything now compiles and links, there seems to be some weird behavior with global symbols. This bug was previously affecting two tests in RTEMS under AArch64 and both tests now exhibit global symbols not yielding their assigned values. The remainder of the tests still pass appropriately.
Comment 7 Nick Clifton 2021-03-30 13:55:57 UTC
(In reply to Kinsey Moore from comment #6)
> I've tested the new patch and, while everything now compiles and links,
> there seems to be some weird behavior with global symbols.

Yeah - this is the kind of thing that I was worried about.

> This bug was
> previously affecting two tests in RTEMS under AArch64 and both tests now
> exhibit global symbols not yielding their assigned values. The remainder of
> the tests still pass appropriately.

Am I correct in thinking that these assigned values are non-zero ?

One effect of removing the relocations entirely is that the effected 
symbol's value is always treated as 0...

Do you have a test case that I can look at to see if I can improve the
patch ?
Comment 8 Kinsey Moore 2021-03-30 14:14:21 UTC
(In reply to Nick Clifton from comment #7)
> (In reply to Kinsey Moore from comment #6)
> > I've tested the new patch and, while everything now compiles and links,
> > there seems to be some weird behavior with global symbols.
> 
> Yeah - this is the kind of thing that I was worried about.
> 
> > This bug was
> > previously affecting two tests in RTEMS under AArch64 and both tests now
> > exhibit global symbols not yielding their assigned values. The remainder of
> > the tests still pass appropriately.
> 
> Am I correct in thinking that these assigned values are non-zero ?

You are correct, the assigned values are non-zero in both cases.

> One effect of removing the relocations entirely is that the effected 
> symbol's value is always treated as 0...
> 
> Do you have a test case that I can look at to see if I can improve the
> patch ?

I'll see if I can come up with a reduced test case. Do you prefer an executable output that runs on QEMU or just something that compiles to an object file to pick apart?
Comment 9 Nick Clifton 2021-03-30 15:53:17 UTC
(In reply to Kinsey Moore from comment #8)
 
> I'll see if I can come up with a reduced test case. Do you prefer an
> executable output that runs on QEMU or just something that compiles to an
> object file to pick apart?

Preferably an assembler source file.  Then I can check that the object file either contains the necessary relocations, or else that the correct values have been stored in the constant part of the relevant instructions.
Comment 10 Joel Sherrill 2021-03-30 16:21:16 UTC
Nick. Go back to the first comment when I posted this. I put the C code along with assembly from two different optimization levels.
Comment 11 Nick Clifton 2021-03-30 16:30:05 UTC
(In reply to Joel Sherrill from comment #10)
> Nick. Go back to the first comment when I posted this. I put the C code
> along with assembly from two different optimization levels.

Ah - and change the ".set bar,0" to something else, like ".set bar,1".  Right ?
Comment 12 Kinsey Moore 2021-03-30 17:44:46 UTC
(In reply to Nick Clifton from comment #11)
> Ah - and change the ".set bar,0" to something else, like ".set bar,1". 
> Right ?

From the two tests, one was 0x2800 and one was 0xabc, but I think 1 there should suffice just as well.
Comment 13 Nick Clifton 2021-04-01 11:32:15 UTC
Created attachment 13341 [details]
Proposed patch

Hi Guys,

  OK, please try out this third patch.  (It should be used on its own - it supplants the other two).

  This patch tries to stop the assembler's expression evaluator from converting the references to the "bar" symbol into references to a constant value.  Instead the symbol is preserved intact in the object file, and the relocations are resolved by the linker as normal.

  I *think* that this is the right way to resolve the problem, since looking at the generated assembler code, and C source, it appears that the intention is provide a fixed value for the "bar" symbol, but not to eliminate it completely.

Cheers
  Nick
Comment 14 Kinsey Moore 2021-04-01 14:33:23 UTC
I had to tweak the patch a little to compile (no bool) and apply (probably because I'm working off of a slightly older revision), but everything seems to be working as expected with this latest patch. Thanks for diving into this!
Comment 15 Sourceware Commits 2021-04-06 12:29:17 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=eac4eb8ecb2626ef7711d8f6bee9e870ae435604

commit eac4eb8ecb2626ef7711d8f6bee9e870ae435604
Author: Nick Clifton <nickc@redhat.com>
Date:   Tue Apr 6 13:27:50 2021 +0100

    Fix a problem assembling AArch64 sources when a relocation is generated against a symbol that has a defined value.
    
            PR 27217
            * config/tc-aarch64.c (my_get_expression): Rename to
            aarch64_get_expression.  Add a fifth argument to enable deferring
            of expression resolution.
            (parse_typed_reg): Update calls to my_get_expression.
            (parse_vector_reg_list): Likewise.
            (parse_immediate_expression): Likewise.
            (parse_big_immediate): Likewise.
            (parse_shift): Likewise.
            (parse_shifter_operand_imm): Likewise.
            (parse_operands): Likewise.
            (parse_shifter_operand_reloc): Update calls to my_get_expression
            and call aarch64_force_reloc to determine the value of the new
            fifth argument.
            (parse_address_main): Likewise.
            (parse_half): Likewise.
            (parse_adrp): Likewise.
            (aarch64_force_reloc): New function.  Contains code extracted from...
            (aarch64_force_relocation): ... here.
            * testsuite/gas/aarch64/pr27217.s: New test case.
            * testsuite/gas/aarch64/pr27217.d: New test driver.
Comment 16 Nick Clifton 2021-04-06 12:30:28 UTC
OK, patch applied.
Comment 17 Sourceware Commits 2021-04-07 08:53:28 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=b12389f219facfb4aa29b97fdcbc2664a5b0732a

commit b12389f219facfb4aa29b97fdcbc2664a5b0732a
Author: Alan Modra <amodra@gmail.com>
Date:   Wed Apr 7 18:12:38 2021 +0930

    Fix pr27217 testcase failure
    
    aarch64_be-linux-gnu_ilp32  +FAIL: PR27212
    
            PR 27217
            * testsuite/gas/aarch64/pr27217.d: Correct name.  Accept ilp32 relocs.
Comment 18 Sourceware Commits 2022-05-18 15:56:01 UTC
The master branch has been updated by Jan Beulich <jbeulich@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=04dfe7aa52171d110db813bce67c0eea5f4b18cd

commit 04dfe7aa52171d110db813bce67c0eea5f4b18cd
Author: Jan Beulich <jbeulich@suse.com>
Date:   Wed May 18 17:55:55 2022 +0200

    Arm64: follow-on to PR gas/27217 fix
    
    PR gas/27217
    
    Prior to trying to address PR gas/28888 I noticed anomalies in how
    certain insns would / wouldn't be affected in similar ways.
    
    Commit eac4eb8ecb26 ("Fix a problem assembling AArch64 sources when a
    relocation is generated against a symbol that has a defined value") had
    two copy-and-paste mistakes, passing the wrong type to
    aarch64_force_reloc().
    
    It further failed to add placeholder relocation types to that function's
    block of case labels leading to a return of 1. While not of interest for
    aarch64_force_relocation() (these placeholders are resolved right in
    parse_operands()), calls to aarch64_force_reloc() happen before that
    resolution would take place.
Comment 19 Sourceware Commits 2022-05-19 10:46:57 UTC
The master branch has been updated by Jan Beulich <jbeulich@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=15b7af6c874610d802b64e1778202b7653d5fa08

commit 15b7af6c874610d802b64e1778202b7653d5fa08
Author: Jan Beulich <jbeulich@suse.com>
Date:   Thu May 19 12:46:21 2022 +0200

    Arm64: force emission of ILP32-dependent relocs
    
    Like the placeholder types added in 04dfe7aa5217 ("Arm64: follow-on to
    PR gas/27217 fix"), these are also placeholders which are subsequently
    resolved (albeit later, hence this being a separate issue). As for the
    resolved types 1 is returned, these pseudo-relocs should also have 1
    returned to force retaining of the [eventual] relocations. This is also
    spelled out individually for each of them in md_apply_fix().
Comment 20 Sourceware Commits 2022-07-29 07:27:35 UTC
The master branch has been updated by Jan Beulich <jbeulich@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=c1723a8118f0d02b01a061095f48e75264b2ca4f

commit c1723a8118f0d02b01a061095f48e75264b2ca4f
Author: Jan Beulich <jbeulich@suse.com>
Date:   Fri Jul 29 09:26:47 2022 +0200

    Arm64: re-work PR gas/27217 fix
    
    The original approach has resulted in anomalies when . is involved in an
    operand of one of the affected insns. We cannot leave . unresolved, or
    else it'll be resolved at the end of assembly, then pointing to the
    address of a section rather than at the insn of interest. Undo part of
    the original change and instead check whether a relocation cannot be
    omitted in md_apply_fix().
    
    By resolving the expressions again, equates (see the adjustment of the
    respective testcase) will now be evaluated, and hence relocations
    against absolute addresses be emitted. This ought to be okay as long as
    the equates aren't global (and hence can't be overridden). If a need
    for such arises, quite likely the only way to address this would be to
    invent yet another expression evaluation mode, leaving everything
    _except_ . un-evaluated.
    
    There's a further anomaly in how transitive equates are handled. In
    
            .set x, 0x12345678
            .eqv bar, x
    foo:
            adrp    x0, x
            add     x0, x0, :lo12:x
    
            adrp    x0, bar
            add     x0, x0, :lo12:bar
    
    the first two relocations are now against *ABS*:0x12345678 (as said
    above), whereas the latter two relocations would be against x. (Before
    the change here, the first two relocations are against x and the latter
    two against bar.) But this is an issue seen elsewhere as well, and would
    likely require adjustments in the target-independent parts of the
    assembler instead of trying to hack around this for every target.
Comment 21 Kinsey Moore 2023-03-21 02:55:48 UTC
This has broken again in 2.40, presumably due to the rework that occurred.
Comment 22 Kinsey Moore 2023-03-21 02:58:20 UTC
To be clear, the "no symbol" error has returned.
Comment 23 Kinsey Moore 2023-03-21 14:41:56 UTC
I have verified that commit c1723a8118f0d02b01a061095f48e75264b2ca4f is the cause of the regression.
Comment 24 Jan Beulich 2023-03-22 08:44:41 UTC
(In reply to Kinsey Moore from comment #22)
> To be clear, the "no symbol" error has returned.

Since the testcase that was originally added doesn't cover this case (which is why it wasn't noticed), can you please supply an example code fragment where the bad behavior is observed?
Comment 25 Kinsey Moore 2023-03-22 12:21:16 UTC
The original test case should show it provided that you also attempt to link it as per Nick's comment: https://sourceware.org/bugzilla/show_bug.cgi?id=27217#c4
Comment 26 Jan Beulich 2023-03-22 14:35:37 UTC
(In reply to Kinsey Moore from comment #25)
> The original test case should show it provided that you also attempt to link
> it as per Nick's comment:
> https://sourceware.org/bugzilla/show_bug.cgi?id=27217#c4

Oh, I'm sorry for not paying attention. I can indeed observe the issue there.

Quoting from the description of r_info in the ELF spec: "If the index is STN_UNDEF, the undefined symbol index, the relocation uses 0 as the ``symbol value''." Which makes me think we're dealing with a linker issue here, as this is precisely the situation we're in. I'll see to find time to go hunt, but I'm far less familiar with ld than with gas. (If others agree this is a separate issue, I guess this would better be handled in a fresh bug report.)
Comment 27 Jan Beulich 2023-03-22 14:45:11 UTC
Another question is: Can't we suppress emitting of relocations when the value is absolute? (Of course really the relocation in the testcase should reference "bar", but as we've seen arranging for that by simply avoiding to evaluate expressions produces other fallout. In the corresponding email conversation I did suggest yet another expression evaluation mode as a possible route. But that would look more like another hack than a solution, to me at least.)
Comment 28 Jan Beulich 2023-03-23 15:25:38 UTC
(In reply to Jan Beulich from comment #26)
> Quoting from the description of r_info in the ELF spec: "If the index is
> STN_UNDEF, the undefined symbol index, the relocation uses 0 as the ``symbol
> value''." Which makes me think we're dealing with a linker issue here, as
> this is precisely the situation we're in. I'll see to find time to go hunt,
> but I'm far less familiar with ld than with gas. (If others agree this is a
> separate issue, I guess this would better be handled in a fresh bug report.)

See https://sourceware.org/pipermail/binutils/2023-March/126759.html
Comment 29 Kinsey Moore 2023-03-23 15:34:59 UTC
Ah, great! Thanks so much for looking into this.
Comment 30 Pete Moore 2023-12-22 08:59:37 UTC
I have a simple test case which I think is consistent with this bug:

```
$ cat adrp.s 
.global _start
_start: adrp x0, 0x8000
```

After assembling and linking:

```
$ aarch64-none-elf-as -o adrp.o adrp.s 
$ aarch64-none-elf-ld -Ttext=0x0 -o adrp.elf adrp.o
aarch64-none-elf-ld: adrp.o: in function `_start':
(.text+0x0): undefined reference to `no symbol'
```

Note, the following trick works (if e.g. _start is known to be 0):

```
$ cat adrp.s 
.global _start
_start: adrp x0, 0x8000 + _start
```

which assembles as desired:

```
$ aarch64-none-elf-objdump -d adrp.elf

adrp.elf:     file format elf64-littleaarch64


Disassembly of section .text:

0000000000000000 <_start>:
   0:	90000040 	adrp	x0, 8000 <_start+0x8000>
```

However, this only works if _start is fixed and known. Being able to specify an absolute (rather than relative) value for the immediate of the adrp instruction is useful for e.g. referencing peripherals at fixed locations which are known to be in the +/- 4GB range of the current address. My particular use case is referencing Raspberry Pi peripheral base addresses.

Let me know if I should raise a separate bug, or if the above test case is is consistent with the current bug. Many thanks!

```
$ aarch64-none-elf-ld --version
GNU ld (GNU Binutils) 2.39
```
Comment 31 Nick Clifton 2024-01-10 12:19:14 UTC
The bug is fixed in the 2.41 release.