Bug 24546

Summary:	x86-64 far jump/call encoding issues
Product:	binutils	Reporter:	Andrew Cooper <andrew.cooper3>
Component:	gas	Assignee:	Not yet assigned to anyone <unassigned>
Status:	RESOLVED FIXED
Severity:	normal	CC:	hjl.tools, jbeulich
Priority:	P2
Version:	2.32
Target Milestone:	2.35
Host:		Target:
Build:		Last reconfirmed:	2019-05-14 00:00:00

Description Andrew Cooper 2019-05-11 14:46:52 UTC

I have some problems when trying to encode the 64bit forms of far call/jump.  Like all other far operations in 64bit, lcall/ljmp default to a 32bit operand, and require a rex64 to promote the instruction to having a 64bit operand (specifically, it changes these instructions between having a 6 byte operand and a 10 byte operand).
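
To make the 6 byte versus 10 byte operand sizes concrete, here is a minimal Python sketch of the two in-memory far pointer layouts (offset at the low address, followed by the 16-bit segment selector). The helper name pack_far_pointer and the selector/offset values are made up for illustration; none of this is binutils code.

```python
import struct

# Little-endian, no padding: offset first, then the 16-bit segment selector.
FAR_PTR_M16_32 = struct.Struct("<IH")  # 32-bit offset + selector
FAR_PTR_M16_64 = struct.Struct("<QH")  # 64-bit offset + selector


def pack_far_pointer(selector: int, offset: int, rex_w: bool) -> bytes:
    """Build the memory operand an indirect lcall/ljmp would read.

    rex_w=True models the promoted (REX.W) form with a 64-bit offset.
    """
    layout = FAR_PTR_M16_64 if rex_w else FAR_PTR_M16_32
    return layout.pack(offset, selector)


if __name__ == "__main__":
    # Hypothetical selector and offset, chosen only for illustration.
    print(len(pack_far_pointer(0x10, 0x1000, rex_w=False)))  # 6
    print(len(pack_far_pointer(0x10, 0x1000, rex_w=True)))   # 10
```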

$ cat far-jmps.S
        .code64
code64:
        lcalll *(%rsp)
        rex64 lcall *(%rsp)

        ljmpl *(%rsp)
        rex64 ljmp *(%rsp)

        lretl
        lretq

This version of the file assembles correctly:

$ /local/bin/bin-2.32/bin/as far-jmps.S -o far-jmps.o
$

However, when the rex64 prefix is replaced by a q suffix (lcallq, ljmpq), assembly fails with:

$ /local/bin/bin-2.32/bin/as far-jmps.S -o far-jmps.o
far-jmps.S: Assembler messages:
far-jmps.S:4: Error: invalid instruction suffix for `lcall'
far-jmps.S:7: Error: invalid instruction suffix for `ljmp'

Furthermore (or possibly relatedly), objdump doesn't disassemble the instruction in an expected manner:

$ /local/bin/bin-2.32/bin/objdump -d far-jmps.o

far-jmps.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <code64>:
   0:    ff 1c 24                 lcall  *(%rsp)
   3:    48 ff 1c 24              rex.W lcall *(%rsp)
   7:    ff 2c 24                 ljmp   *(%rsp)
   a:    48 ff 2c 24              rex.W ljmp *(%rsp)
   e:    cb                       lret  
   f:    48 cb                    lretq

The rex.W prefix printed is an accurate representation of the encoding, but an l or q suffix would be the consistent way of rendering the instructions.

Observe that throughout all of this, lret and lretq do assemble and disassemble in the expected manner (as do iretl and iretq which I omitted from the example.)
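
For reference, the byte sequences in the objdump listing can be picked apart by hand. The following is a small illustrative Python sketch, not binutils code: decode_far_branch is a hypothetical helper that only understands an optional REX prefix followed by FF /3 (far call), FF /5 (far jump) or CB (far return), and it ignores every other prefix and opcode.

```python
def decode_far_branch(code: bytes):
    """Return (mnemonic, rex_w) for the far-branch encodings in the listing.

    Illustration only; this is not a general x86 decoder.
    """
    pos = 0
    rex_w = False
    # REX prefixes occupy 0x40-0x4f; bit 3 of the prefix is REX.W.
    if 0x40 <= code[pos] <= 0x4F:
        rex_w = bool(code[pos] & 0x08)
        pos += 1

    opcode = code[pos]
    if opcode == 0xCB:
        return ("lret", rex_w)
    if opcode == 0xFF:
        # The ModRM.reg field (bits 5..3) selects the FF-group operation.
        reg = (code[pos + 1] >> 3) & 0x7
        names = {3: "lcall", 5: "ljmp"}
        if reg in names:
            return (names[reg], rex_w)
    raise ValueError("not one of the far-branch encodings handled here")


if __name__ == "__main__":
    for hexbytes in ("ff1c24", "48ff1c24", "ff2c24", "48ff2c24", "cb", "48cb"):
        name, rex_w = decode_far_branch(bytes.fromhex(hexbytes))
        # Suffix rendering as suggested in the report: q with REX.W, else l.
        print(hexbytes, name + ("q" if rex_w else "l"))
```

The suffix printed in the loop is just the rendering requested above; it is not what objdump 2.32 emits.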

Comment 1 H.J. Lu 2019-05-14 18:53:16 UTC

Can far call/jmp take m16:m64 on AMD processors?

Comment 2 Andrew Cooper 2019-05-14 19:03:20 UTC

Yes.  That works on all 64bit capable processors.

The encoding which isn't supported in 64bit mode is the ptr:16:{16,32,64} encoding, which stores the segment and offset as immediate data operands in the instruction.

~Andrew

Comment 3 H.J. Lu 2019-05-14 19:22:20 UTC

(In reply to Andrew Cooper from comment #2)
> Yes.  That works on all 64bit capable processors.
> 
> The encoding which isn't supported in 64bit mode is the ptr:16:{16,32,64}
> encoding, which stores the segment and offset as immediate data operands in
> the instruction.
> 
> ~Andrew

AMD64 manual says:

CALL FAR pntr16:16  9A cd  Far call direct, with the target specified by a far pointer contained in the instruction. (Invalid in 64-bit mode.)
CALL FAR pntr16:32  9A cp  Far call direct, with the target specified by a far pointer contained in the instruction. (Invalid in 64-bit mode.)
CALL FAR mem16:16   FF /3  Far call indirect, with the target specified by a far pointer in memory.
CALL FAR mem16:32   FF /3  Far call indirect, with the target specified by a far pointer in memory.

Comment 4 Andrew Cooper 2019-05-15 18:09:12 UTC

(In reply to H.J. Lu from comment #3)
> AMD64 manual says:
> 
> CALL FAR pntr16:16 9A cd Far call direct, with the target specified by a far
> pointer
> contained in the instruction. (Invalid in 64-bit mode.)
> CALL FAR pntr16:32 9A cp Far call direct, with the target specified by a far
> pointer
> contained in the instruction. (Invalid in 64-bit mode.)
> CALL FAR mem16:16 FF /3 Far call indirect, with the target specified by a
> far pointer
> in memory.
> CALL FAR mem16:32 FF /3 Far call indirect, with the target specified by a
> far pointer
> in memory.

After some experimentation, it turns out that it really is only Intel processors which implement the mem16:64 form.  AMD processors ignore the REX prefix and use mem16:32 form, even when REX-encoded.
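
As a rough model of the difference just described, the sketch below computes how many bytes an indirect far call/jump would read from memory in 64-bit mode. It encodes only the observation reported in this comment (Intel honours REX.W, AMD ignores it), assumes no operand-size (0x66) prefix, and has not been checked against hardware; far_pointer_size and the vendor strings are hypothetical names.

```python
def far_pointer_size(vendor: str, rex_w: bool) -> int:
    """Bytes read by an indirect far call/jump in 64-bit mode.

    Models only the behaviour reported in this comment; an assumption,
    not a statement from either vendor's manual.
    """
    if vendor not in ("intel", "amd"):
        raise ValueError("unknown vendor: " + vendor)
    # Only Intel is reported to promote the offset to 64 bits with REX.W.
    offset_bytes = 8 if (vendor == "intel" and rex_w) else 4
    return offset_bytes + 2  # plus the 16-bit selector


if __name__ == "__main__":
    for vendor in ("intel", "amd"):
        for rex_w in (False, True):
            print(vendor, rex_w, far_pointer_size(vendor, rex_w))
```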

Comment 5 H.J. Lu 2019-05-15 19:03:58 UTC

(In reply to Andrew Cooper from comment #4)
> 
> After some experimentation, it turns out that it really is only Intel
> processors which implement the mem16:64 form.  AMD processors ignore the REX
> prefix and use mem16:32 form, even when REX-encoded.

That is the reason for the current behavior.

Comment 6 Jan Beulich 2019-05-16 11:20:46 UTC

(In reply to H.J. Lu from comment #5)
> That is the reason for the current behavior.

But you've introduced the Intel64 and AMD64 attributes, which could be used here as well to distinguish the behavior. Also for LFS, LGS, and LSS then.

Similarly conditional branches (including LOOP etc) would want handling to match that of near CALL/JMP afaict.

Comment 7 H.J. Lu 2019-05-16 15:27:39 UTC

(In reply to Jan Beulich from comment #6)
> (In reply to H.J. Lu from comment #5)
> > That is the reason for the current behavior.
> 
> But you've introduced the Intel64 and AMD64 attributes, which could be used
> here as well to distinguish the behavior. Also for LFS, LGS, and LSS then.
> 
> Similarly conditional branches (including LOOP etc) would want handling to
> match that of near CALL/JMP afaict.

Care to make a patch?

Comment 8 Jan Beulich 2019-05-16 15:52:08 UTC

(In reply to H.J. Lu from comment #7)
> Care to make a patch?

Well, I've added it to my list of things to look into, but there are various other things higher up that list, so it's not clear at all when I'd find time.

Plus Andrew's and my own observations on the actual behavior appear to disagree, making us suspect for the moment that there might even be model specific behavior (beyond the vendor differences) here. I wouldn't want to pin down one variant in binutils when there potentially are others as well.

Comment 9 Jan Beulich 2020-01-09 11:00:46 UTC

The proposed added (AT&T mode) behavior is to allow lcall and ljmp to also have q suffixes in Intel64 mode, paralleling how other insns (including branching ones) work. Similarly the disassembler would then print q suffixes in Intel64 mode, rather than rex.W.

A patch for this was already posted as part of a larger series:
https://sourceware.org/ml/binutils/2019-12/msg00355.html

Comment 10 Jan Beulich 2020-02-12 15:26:27 UTC

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=5990e377e5a339bce715fabfc3e45b24b459a7af

I don't see a mechanism in the web interface though how to change a bug's status, so I'm leaving it at "NEW".

Comment 11 H.J. Lu 2020-02-12 15:46:36 UTC

Fixed for 2.35.