demand_empty_rest_of_line and ignore_rest_of_line

Tue Apr 27 03:13:00 GMT 2004

Alan Modra <amodra@bigpond.net.au> writes:
...
> 	foo: addr16 mov %eax,0
>
> where the "opcode" part of the instruction itself has whitespace.
> Without various hacks, do_scrub_chars would turn the above into
>
> 	foo: addr16 mov%eax,0
>
> Other architectures have similar assembly constructs.  So I think you
> underestimate the complexity in a general parser design.  Just
> separating an assembly line into label, opcode, operands isn't so easy!

Eh, that doesn't look so bad to me, if the set of opcodes and the set
of prefixes are made available to the generic parser.  The basic
grammar is something like

line: label
    | label? WS (prefix WS)* opcode WS operands
    | WS? directive operands

operands: operand
        | operands WS? ',' WS? operand

where the sets of prefixes, opcodes, and directives are known well in
advance - this might be a good application for perfect hashing.
Operand parsing might have to stay machine-specific, but I hope not;
the commonalities are huge.  And then there are additional little bits
of goo that some architectures let in, like IA64 and its punctuation -
I would handle this by treating those as directives, which is why the
above grammar doesn't separate the '.' from the directive.
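
To make the grammar concrete, here is a rough sketch of a line
splitter driven by those tables.  It is Python rather than anything
that would go into gas, and everything in it is an assumption for
illustration: the names PREFIXES, OPCODES, DIRECTIVES and parse_line
are made up, the table contents are a tiny sample, and plain sets
stand in for a perfect hash.  It is also a little looser than the
grammar (it accepts an opcode with no operands and a label in front of
a directive), and it splits operands on every comma, which is exactly
the part that would need machine-specific help.

```python
import re

# Hypothetical tables; a real assembler would get these from the target
# backend, possibly as perfect hashes.  Plain sets stand in for them here.
PREFIXES = frozenset({"addr16", "data16", "lock", "rep"})
OPCODES = frozenset({"mov", "add", "nop"})
DIRECTIVES = frozenset({".byte", ".long", ".globl"})

# A label is an identifier immediately followed by ':' at the start of a line.
LABEL_RE = re.compile(r"\s*([A-Za-z_.$][\w.$]*):")


def parse_line(line):
    """Split one line into label, prefixes, opcode or directive, operands."""
    result = {"label": None, "prefixes": [], "op": None, "operands": []}
    rest = line

    m = LABEL_RE.match(rest)
    if m:
        result["label"] = m.group(1)
        rest = rest[m.end():]

    rest = rest.strip()
    if not rest:
        return result  # blank line, or a label on its own

    while rest:
        parts = rest.split(None, 1)  # first word, then everything after it
        word = parts[0]
        tail = parts[1] if len(parts) > 1 else ""
        if word in PREFIXES:
            result["prefixes"].append(word)
            rest = tail
            continue
        if word in OPCODES or word in DIRECTIVES:
            result["op"] = word
            # Naive split: commas inside strings or parentheses are not handled.
            if tail.strip():
                result["operands"] = [o.strip() for o in tail.split(",")]
            return result
        raise ValueError(f"unknown opcode or directive: {word!r}")

    raise ValueError("prefix with no opcode after it")


print(parse_line("\tfoo: addr16 mov %eax,0"))
```

Because the splitter knows "addr16" is a prefix, the whitespace
between it and "mov" is treated as a token boundary rather than being
scrubbed away, so the example line comes out as label "foo", prefix
"addr16", opcode "mov", and operands "%eax" and "0".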

Macro handling (I mean .macro, not the built-in pseudo-opcodes that
some architectures have) is awkward because you can override opcodes
with macros.  Not sure what to do about that.  

zw