This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: x86 issues


Hi Jan,

Thanks for your patches. I will check them out.

I am planning to clean up the x86 assembler myself. The Linux x86
assembler supports both Intel and AT&T syntax. One difference is:

   * In AT&T syntax the size of memory operands is determined from the
     last character of the instruction mnemonic.  Mnemonic suffixes of
     `b', `w', `l' and `q' specify byte (8-bit), word (16-bit), long
     (32-bit) and quadruple word (64-bit) memory references.  Intel
     syntax accomplishes this by prefixing memory operands (_not_ the
     instruction mnemonics) with `byte ptr', `word ptr', `dword ptr'
     and `qword ptr'.  Thus, Intel `mov al, byte ptr FOO' is `movb FOO,
     %al' in AT&T syntax.
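
The two rules above can be sketched in C as follows. This is only an
illustration, not the assembler's actual code: att_suffix_bits and
intel_ptr_bits are hypothetical names, and the sketch ignores that a
real assembler must first check whether the whole mnemonic is itself an
instruction name (for example `call', which ends in `l').

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: operand size in bits implied by the last
   character of an AT&T mnemonic, or 0 if it is not b/w/l/q.  */
static int
att_suffix_bits (const char *mnemonic)
{
  size_t len;

  assert (mnemonic != NULL);
  len = strlen (mnemonic);
  if (len == 0)
    return 0;
  switch (mnemonic[len - 1])
    {
    case 'b': return 8;
    case 'w': return 16;
    case 'l': return 32;
    case 'q': return 64;
    default:  return 0;
    }
}

/* Hypothetical helper: operand size in bits implied by an Intel-style
   `... ptr' prefix on a memory operand, or 0 if there is no prefix.  */
static int
intel_ptr_bits (const char *operand)
{
  static const struct { const char *prefix; int bits; } table[] = {
    { "byte ptr ", 8 },
    { "word ptr ", 16 },
    { "dword ptr ", 32 },
    { "qword ptr ", 64 },
  };
  size_t i;

  assert (operand != NULL);
  for (i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strncmp (operand, table[i].prefix, strlen (table[i].prefix)) == 0)
      return table[i].bits;
  return 0;
}
```

With these helpers, att_suffix_bits ("movb") and
intel_ptr_bits ("byte ptr FOO") both yield 8, matching the
`movb FOO, %al' / `mov al, byte ptr FOO' pair.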

The Linux assembler was originally written for AT&T syntax, and Intel
syntax support was added much later. There are some issues:

1.	The assembler did not previously store operand size information
for each operand. I have changed the assembler infrastructure to add
operand size support. In Intel mode, we now have all operand sizes. In
AT&T mode, the memory operand has no size information.
2.	The assembler uses the mnemonic suffix for both memory operand size
and instruction encoding. Intel syntax uses the mnemonic suffix for
encoding even when it isn't needed for operand size.
3.	The assembler marks an instruction with No_bSuf, No_wSuf, No_lSuf,
No_sSuf, No_qSuf and No_ldSuf. These flags also serve two purposes: one
for valid register/memory operand sizes and the other for encoding.

We can clean up the assembler as follows:

1.	Don't use the mnemonic suffix for both memory operand size and
instruction encoding.
2.	Replace No_bSuf, No_wSuf, No_lSuf, No_sSuf, No_qSuf and No_ldSuf in
the instruction template with the sizes allowed, like B, W, L, Q, S, LD.
These should only be used in AT&T syntax to match memory operand size.
3.	Add new fields to the instruction template for encoding if needed.
4.	Break one template into two or more if needed.
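
As a rough sketch of items 2 and 3, the template could carry a positive
mask of allowed sizes next to a separate encoding field. Every name
below (size_mask, insn_template, enc_flags, size_allowed) is made up
for illustration and is not the existing i386 template layout.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes a template accepts (item 2).  Illustrative names only.  */
enum size_mask
{
  SZ_B  = 1 << 0,
  SZ_W  = 1 << 1,
  SZ_L  = 1 << 2,
  SZ_Q  = 1 << 3,
  SZ_S  = 1 << 4,
  SZ_LD = 1 << 5
};

/* Hypothetical template: size matching and encoding are kept apart.  */
struct insn_template
{
  const char *name;
  unsigned int sizes;      /* allowed memory operand sizes, AT&T only */
  unsigned int enc_flags;  /* encoding-only information (item 3) */
};

/* Return nonzero if the template accepts a memory operand of size SZ.
   Only meant to be consulted in AT&T mode.  */
static int
size_allowed (const struct insn_template *t, enum size_mask sz)
{
  assert (t != NULL);
  return (t->sizes & (unsigned int) sz) != 0;
}
```

For example, a template declared as { "example", SZ_W | SZ_L | SZ_Q, 0 }
would accept an L-sized memory operand and reject a B-sized one, without
that decision touching enc_flags.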

Do you have any comments?

Thanks.


H.J.
On Feb 13, 2008 6:25 AM, Jan Beulich <jbeulich@novell.com> wrote:
> As a follow-up to the changes just submitted/committed, here are two
> more things I think aren't working right at present:
>
> - There's no way to specify whether a floating point unit is available. I
>   think that pseudo-ops .arch .{8087,287,387} should be added along
>   with respective constants, so that instructions can be qualified
>   accordingly and the floating point registers (it's really just st which
>   ought to be affected) aren't visible without any of these enabled.
>
> - Currently, set_cpu_arch() infers 64-bit support from the current
>   .codeXX setting. This seems rather odd, since this makes the following
>   sequence behave rather paradoxically:
>         .code64
>         .arch i8086
>         .code64
>   (in that it fails on the second .code64). Instead, 64-bit mode should
>   be a sub-feature like MMX/SSE etc. are, and perhaps entering (for
>   example) 8086 should implicitly switch to 16-bit mode (or otherwise,
>   the CPU change itself should fail).
>
> While I can certainly prepare patches to fix both, I'd first like to
> understand whether the described intended behavior is acceptable.
>
> Jan
>
>

