[PATCH 3/5] x86: widen @got{,pcrel} support to PUSH and APX IMUL

Jan Beulich jbeulich@suse.com
Thu Feb 6 12:08:39 GMT 2025


On 06.02.2025 03:28, H.J. Lu wrote:
> On Wed, Feb 5, 2025 at 7:37 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 04.02.2025 13:14, H.J. Lu wrote:
>>> On Tue, Feb 4, 2025 at 7:16 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 04.02.2025 12:12, H.J. Lu wrote:
>>>>> On Tue, Feb 4, 2025 at 7:02 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> Thinking of it: With TEST being special-cased in the logic involved, I'm
>>>>>> also curious to learn of a code sequence where TEST would sensibly be
>>>>>> used (and where CMP can't be used instead).
>>>>>
>>>>> Compiler may generate TEST with GOT.
>>>>
>>>> Can it? Why would it? (IOW: Again you didn't really address my request.)
>>>>
>>>
>>> [hjl@gnu-tgl-3 tmp]$ cat x.c
>>> extern int foo __attribute__ ((weak));
>>>
>>> extern void bar (void);
>>>
>>> __attribute__ ((regparm(3)))
>>> void
>>> _start (long int p)
>>> {
>>>   if (((unsigned long) &foo) & p)
>>>     bar ();
>>> }
>>> [hjl@gnu-tgl-3 tmp]$ gcc -c -O2 -m32 x.c -mno-direct-extern-access
>>>
>>> [hjl@gnu-tgl-3 tmp]$ objdump -dwr x.o
>>>
>>> x.o:     file format elf32-i386
>>>
>>>
>>> Disassembly of section .text:
>>>
>>> 00000000 <_start>:
>>>    0: 85 05 00 00 00 00    test   %eax,0x0 2: R_386_GOT32X foo
>>>    6: 75 08                jne    10 <_start+0x10>
>>>    8: c3                   ret
>>>    9: 8d b4 26 00 00 00 00 lea    0x0(%esi,%eiz,1),%esi
>>>   10: e9 fc ff ff ff       jmp    11 <_start+0x11> 11: R_386_PC32 bar
>>> [hjl@gnu-tgl-3 tmp]$
>>>
>>> Which request?
>>
>> "... learn of a code sequence where TEST would sensibly be used" in my
>> earlier reply. So yes, you provided a contrived example now. I'm afraid
>> I can't assign meaning to "if (((unsigned long) &foo) & p)", though.
> 
> It can be used to check pointer alignment.

Hmm, yes, I didn't think of that. Yet then TESTB (and perhaps even TESTW)
would also want to be supported.
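
For illustration only, the alignment check in question is the usual
mask-with-alignment-minus-one pattern. The sketch below is not taken from
the patch or from any compiler output; is_aligned is a hypothetical helper
name, and it assumes the alignment is a power of two. With a small mask
(say 3 or 7) only the low byte of the address matters, which is why a
byte-wide TEST might plausibly show up as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): returns true when ptr is a
   multiple of align.  Assumes align is a non-zero power of two, so that
   align - 1 is a mask of the low address bits. */
static bool is_aligned(const void *ptr, size_t align)
{
    return ((uintptr_t)ptr & (align - 1)) == 0;
}
```

Calling is_aligned(&foo, 4) on an extern symbol is the kind of source that
could end up as a TEST against the symbol's GOT slot.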

>> Instead, slightly adjusting your example, I can see PUSH being used by
>> the compiler in exactly the way we can (but so far didn't) optimize.
>> That's ordinary argument passing, which I'm inclined to call "sensible".
> 
> Please don't add a segment register.

Actually I have to re-raise the why question: We use CS: in some of the
multi-byte NOPs we emit, both for 32- and for 64-bit.

Plus, as said before, I view it as pretty undesirable to split a single
insn into two (NOP of appropriate size followed by shrunk insn).

Alternatively for 32-bit's PUSH we could use an address size override,
albeit this doesn't look very desirable either. For 64-bit we could use
dummy REX prefixes, unless we're dealing with a REX2 encoding. For REX2
forms of PUSH we could either choose to not use REX2, or again use an
address override. An address size override could also be used for
immediate-to-register MOV.
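
To make the byte accounting behind these options concrete, here is a rough
sketch. It is not the code from the patch. It assumes the relaxation turns
the 6-byte memory form of PUSH (FF /6 with a disp32, i.e. bytes FF 35) into
the 5-byte immediate form (68 imm32), leaving exactly one byte to fill with
a prefix. relax_push and enum pad_kind are hypothetical names; only the
opcode and prefix byte values are architectural.

```c
#include <stddef.h>
#include <stdint.h>

/* Candidate one-byte padding prefixes discussed above (hypothetical enum). */
enum pad_kind {
    PAD_CS        = 0x2e, /* CS: segment override */
    PAD_ADDR_SIZE = 0x67, /* address size override */
    PAD_REX       = 0x40  /* dummy REX, meaningful in 64-bit mode only */
};

/* Rewrite "push disp32-memory" (FF 35 xx xx xx xx) in place as
   "<prefix> push $imm32" (pp 68 ii ii ii ii), keeping the length at 6.
   Returns 0 on success, -1 if the bytes are not the expected form
   (in which case insn is left untouched). */
static int relax_push(uint8_t insn[6], uint32_t value, enum pad_kind pad)
{
    if (insn[0] != 0xff || insn[1] != 0x35)
        return -1;

    insn[0] = (uint8_t)pad; /* padding prefix takes the freed byte */
    insn[1] = 0x68;         /* PUSH imm32 opcode */
    for (size_t i = 0; i < 4; i++)
        insn[2 + i] = (uint8_t)(value >> (8 * i)); /* little-endian imm32 */
    return 0;
}
```

The point of the sketch is merely that the insn stays a single insn of
unchanged length whichever prefix is picked; which prefix is acceptable is
the question under discussion.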

Also: If anything, these prefixes could only gain hint-like meaning
anyway (like two of them once did for conditional jumps), so there would
not be any functional issue. As for their use in multi-byte NOPs, we can
deal with eventual performance issues once we know how bad things are.

Jan

