The following three-line test case assembles incorrectly:

    .intel_syntax noprefix
    mov ax, AN_EQU
    .equ AN_EQU, 5

Compiled with "gcc -c test.S"
Disassembled with "objdump -M intel -d test.o":

    0:  66 8b 04 25 05 00 00    mov ax,WORD PTR ds:0x5
    7:  00

For some reason, this assembled as a memory dereference of ds:0x5. Somehow, gas knew enough to substitute the value 5, but didn't know to treat it as an immediate.

If I move the equate before the instruction, it assembles correctly:

    .intel_syntax noprefix
    .equ AN_EQU, 5
    mov ax, AN_EQU

    0:  66 b8 05 00             mov ax,0x5

If I substitute a literal 5 in the instruction, it assembles correctly:

    .intel_syntax noprefix
    mov ax, 5

    0:  66 b8 05 00             mov ax,0x5

And if I use AT&T syntax, it assembles correctly:

    .att_syntax
    mov $AN_EQU, %ax
    .equ AN_EQU, 5

    0:  66 b8 05 00             mov ax,0x5
I originally observed this bug in 16-bit assembly with .code16, but as shown in this report I can also reproduce it in 64-bit assembly.
When the assembler sees

    mov ax, AN_EQU

AN_EQU is undefined, so the assembler treats it as a symbol. Later, AN_EQU is resolved to 5. For

    mov $AN_EQU, %ax

the assembler knows the operand is an immediate because of the `$'. There is not much the assembler can do here.
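To make the explanation above concrete, here is a toy Python model of a one-pass assembler that picks the operand form when it parses the instruction and only patches in the value later. It is an illustrative sketch, not gas's actual code: the function names (`classify_operand`, `assemble`), the tuple-based statement format, and the "immediate"/"memory" labels are all assumptions made for this example.

```python
def is_literal(text):
    """Return True if text parses as an integer literal such as 5 or 0x5."""
    try:
        int(text, 0)
        return True
    except ValueError:
        return False


def classify_operand(operand, symbols, syntax):
    """Pick the operand form using only what is known at parse time."""
    if syntax == "att":
        # AT&T: a leading '$' marks an immediate, defined or not.
        return "immediate" if operand.startswith("$") else "memory"
    # Intel: no marker, so the decision rests on what is known right now.
    if is_literal(operand):
        return "immediate"
    if operand in symbols:
        return "immediate"  # equate that has already been seen
    return "memory"  # unknown name: assumed to be an address


def assemble(statements, syntax="intel"):
    """Hypothetical one-pass assembler over ('equ', name, value) and
    ('mov', source) tuples. Returns (form, value) for each mov."""
    symbols = {}
    emitted = []
    for stmt in statements:
        if stmt[0] == "equ":
            _, name, value = stmt
            symbols[name] = value
        elif stmt[0] == "mov":
            _, source = stmt
            form = classify_operand(source, symbols, syntax)
            emitted.append((form, source.lstrip("$")))
    # Fixup step: values are filled in, but the form is never revisited.
    return [
        (form, symbols[name] if name in symbols else int(name, 0))
        for form, name in emitted
    ]


if __name__ == "__main__":
    print(assemble([("mov", "AN_EQU"), ("equ", "AN_EQU", 5)]))
    print(assemble([("equ", "AN_EQU", 5), ("mov", "AN_EQU")]))
    print(assemble([("mov", "$AN_EQU"), ("equ", "AN_EQU", 5)], syntax="att"))
```

In this model the forward reference in Intel syntax comes out as a memory operand with the value 5 patched in, while the other orderings and the AT&T form come out as immediates, which mirrors the four disassemblies in the report.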
If the assembler knows enough to go back and substitute the value afterward, it knows that it emitted a data reference when it should have emitted an immediate. It could at least spit out an error at that point rather than assembling incorrect code; that would have saved me an hour of debugging and disassembly. Also, the assembler seems to support forward references in various other places; why not for equates?