Bug 31343

Summary: MIPS: correct behavior of branch to an imm?
Product: binutils
Reporter: YunQiang Su <syq>
Component: gas
Assignee: Not yet assigned to anyone <unassigned>
Status: NEW
Severity: normal
CC: macro, nickc
Priority: P2
Version: unspecified
Target Milestone: ---
Host:
Target:
Build:
Last reconfirmed:

Description YunQiang Su 2024-02-06 08:58:30 UTC
Code such as:
      b   (0)
generates the following binary:

00000000 <.text>:
   0: 1000ffff b 0x0
         0: R_MIPS_PC16 *ABS*
   4: 00000000 nop

This goes wrong at run time: the branch typically ends up jumping to
an address such as 0xABCD0000.
https://github.com/llvm/llvm-project/issues/67951

Should we just emit an error for assembly code like this, or
emit the binary without relocations?

How should we treat the immediate: perhaps as a number of bytes?
Comment 1 Maciej W. Rozycki 2024-02-06 15:32:20 UTC
The handling of R_MIPS_PC16 is correctly specified in the MIPS psABI
to overflow at link time where applicable, so if the linker finds the
value calculated for the symbol referred (whether it's absolute or not
does not matter) not to fit in the field relocated, it is supposed to
report it as with other such issues, which BFD correctly does (taking
a less trivial example):

$ cat b.s
	.text
	.globl	foo
	.ent	foo
foo:
	b	0x1234
	.end	foo
$ as -32 -o b.o b.s
$ ld -Ttext=0x80000000 -efoo -melf32btsmip -o b b.o
b.o: in function `foo':
(.text+0x0): relocation truncated to fit: R_MIPS_PC16 against `*UND*'
$ 

Use -Werror, as any sane project should do, to catch such issues.
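
The link-time range check described above can be sketched as follows.
This is a minimal illustration, not BFD's actual code. It assumes the GNU
convention that the 16-bit field holds a word displacement, so the byte
value S + A - P has to fit in 18 signed bits, and it borrows the addend
0x1230 for the branch to 0x1234 from the n32 object dump in this comment.
The helper name pc16_fits is hypothetical.

```python
def pc16_fits(symbol, addend, place):
    """Return True if S + A - P fits an R_MIPS_PC16 field (hypothetical helper)."""
    value = (symbol + addend - place) & 0xFFFFFFFF  # 32-bit wraparound
    if value & 0x80000000:                          # reinterpret as signed
        value -= 1 << 32
    # 16-bit field shifted left by 2: 18 signed bits of byte displacement
    return -(1 << 17) <= value < (1 << 17)

print(pc16_fits(0, 0x1230, 0x80000000))  # text at 0x80000000 -> False
print(pc16_fits(0, 0x1230, 0x00000000))  # text at 0          -> True
```

With the text segment at 0x80000000 the displacement to an absolute
address near zero is far outside the range of about +/-128 KiB, which
matches the truncation error reported by the linker.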

There is an issue however with absolute symbol processing (and possibly
with any addend processing) with REL relocations here:

$ as -32 -o b-32.o b.s
$ ld -Ttext=0 -efoo -melf32btsmip -o b-32 b-32.o
$ as -n32 -o b-n32.o b.s
$ ld -Ttext=0 -efoo -melf32btsmipn32 -o b-n32 b-n32.o
$ objdump -dr b-32.o b-n32.o

b-32.o:     file format elf32-tradbigmips


Disassembly of section .text:

00000000 <foo>:
   0:	10000919 	b	2468 <foo+0x2468>
			0: R_MIPS_PC16	*ABS*
   4:	00000000 	nop
	...

b-n32.o:     file format elf32-ntradbigmips


Disassembly of section .text:

00000000 <foo>:
   0:	1000048d 	b	1238 <foo+0x1238>
			0: R_MIPS_PC16	*ABS*+0x1230
   4:	00000000 	nop
	...
$ objdump -d b-32 b-n32

b-32:     file format elf32-tradbigmips


Disassembly of section .text:

00000000 <foo>:
   0:	10000919 	b	2468 <foo+0x2468>
   4:	00000000 	nop
	...

b-n32:     file format elf32-ntradbigmips


Disassembly of section .text:

00000000 <foo>:
   0:	1000048c 	b	1234 <foo+0x1234>
   4:	00000000 	nop
	...
$ 

Notice how the absolute value is not correctly referred to in the
o32/REL variant, due to how it has been incorrectly encoded for the
in-place case.
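
To make the difference concrete, the branch words from the dumps can be
decoded by hand. The sketch below only assumes the standard MIPS branch
encoding (a signed 16-bit word offset relative to the address of the
delay slot); the helper name branch_target is hypothetical.

```python
def branch_target(word, pc):
    """Byte address a MIPS `b` located at `pc` jumps to (hypothetical helper)."""
    offset = word & 0xFFFF          # low 16 bits hold the word offset
    if offset & 0x8000:             # sign-extend to a Python int
        offset -= 0x10000
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF

for word in (0x10000919, 0x1000048D, 0x1000048C, 0x1000FFFF):
    print(f"{word:08x} -> {branch_target(word, 0):#x}")
# 10000919 -> 0x2468
# 1000048d -> 0x1238
# 1000048c -> 0x1234
# 1000ffff -> 0x0
```

The linked n32 word 1000048c reaches the intended 0x1234, whereas the
o32 word 10000919 reaches 0x2468, which is exactly twice the intended
address. The last word is the one from the original report.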

Of course for PIC/PIE links this calculation is supposed to always fail
for absolute symbols, because there is no corresponding dynamic
relocation defined in the MIPS psABI to fulfil the purpose of the
calculation at load time.  However it does not happen right now, which is
a bug in LD.

$ ld -shared -efoo -melf32btsmip -o b.so b.o
$ readelf -r b.so
There are no relocations in this file.
$ 

In the absence of a dynamic relocation (which for text is a no-no anyway)
the resulting DSO will only correspond to the original source code if it
has been loaded such that `foo' has the run-time value of 0, which of
course cannot be guaranteed.  Therefore this link is expected to fail.
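
A small sketch of why the load address matters. It assumes the branch
field holds the word displacement 0x48c from the correctly linked n32
example, and the load addresses are made up for illustration; the helper
name runtime_target is hypothetical.

```python
FIELD = 0x048C  # word displacement taken from the linked n32 example

def runtime_target(foo_runtime):
    """Address the branch at `foo` reaches once loaded (hypothetical helper)."""
    # The branch is PC-relative, so the target moves with the load address.
    return foo_runtime + 4 + (FIELD << 2)

for base in (0x0, 0x10000, 0x77E00000):  # made-up run-time values of foo
    print(f"foo at {base:#x}: branch lands at {runtime_target(base):#x}")
# foo at 0x0: branch lands at 0x1234
# foo at 0x10000: branch lands at 0x11234
# foo at 0x77e00000: branch lands at 0x77e01234
```

Only the first case reaches the absolute address 0x1234 that the source
asked for; with no dynamic relocation to adjust the field, every other
load address silently retargets the branch.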

So it appears we have two issues in absolute symbol processing with
R_MIPS_PC16 relocations, neither of which affects your case though.

NB I find it odd for someone to actually use an absolute value with a
branch in their project, though technically it's of course defined if a
bit unusual a case in terms of the MIPS psABI.  The bugs observed only
confirm it's an odd corner case hardly anyone cares about.
Comment 2 YunQiang Su 2024-02-06 16:29:06 UTC
Thank you so much. I understand it now: It is used to branch to an absolute address with branch instructions.

So it should never be used for PIC/PIE code? (if we want to add a more dynamic relocation)

I think that we can emit an error if it is used for PIC code.

For LLVM, let's disable this feature for now: the behavior of LLVM differs from that of gas.
Comment 3 YunQiang Su 2024-02-06 16:33:52 UTC
(In reply to YunQiang Su from comment #2)
> Thank you so much. I understand it now: It is used to branch to an absolute
> address with branch instructions.
> 
> So it should never be used for PIC/PIE code? (if we want to add a more
> dynamic  relocation)
> 

Sorry, typo: if we do *not* want ...

> I think that we can emit an error if it is used for PIC code.
> 
> For LLVM, let's disable this feature for now: the behavior of LLVM
> differs from that of gas.
Comment 4 Maciej W. Rozycki 2024-02-06 19:00:28 UTC
Yes, it has to be a warning or error for PIC/PIE (whatever the BFD policy
is here; I don't remember offhand).