Bug 10777 - Macro processing altercation
Summary: Macro processing altercation
Status: RESOLVED WONTFIX
Alias: None
Product: binutils
Classification: Unclassified
Component: gas
Version: 2.19
Importance: P2 enhancement
Target Milestone: ---
Assignee: unassigned
 
Reported: 2009-10-14 13:04 UTC by Konrad Schwarz
Modified: 2010-01-12 09:52 UTC



Description Konrad Schwarz 2009-10-14 13:04:30 UTC
The state of macro processing in gas is rotten.  I say this after spending an
exorbitant amount of time getting it to work on problems that other assemblers,
such as those from ARM, Tasking, and Fujitsu, handle without difficulty.  (This
is not helped by the fact that GAS's listing facility is flaky, but that is a
different matter.)

For example, consider the task of using a computed symbol name, such as setting
the symbols .Lvector_used_0, .Lvector_used_1, ..., .Lvector_used_255 to zero.
The solution I have come up with after considerable experimentation is:

        /* concat required because in gas, .altmacro %
        expansion only works at the start of a word */
        .macro  concat  s1, s2, s3, s4, s5
                .noaltmacro
                \s1\s2\s3\s4\s5
        .endm

        .Li = 256
        .rept   .Li - 0
                .Li = .Li - 1
                .altmacro
                concat  .Lvector_used_, %.Li, < = 0>
        .endr

``Immediate'' expansion of a symbol is possible only in `.altmacro' mode, using
the `%' operator.  On the other hand, `.altmacro' mode uses `<' to delimit
strings, which conflicts with its usual meaning of ``shift left''.  So it is not
a good idea to switch to `.altmacro' mode globally; instead it should be done
only where necessary.  Hence the call to `.altmacro' in the loop and the call to
`.noaltmacro' in the concat macro.

Even in `.altmacro' mode, `%' expansion is only recognized at the beginning of a
word.  Thus, the expression .Lvector_used_%.Li (which corresponds to
vector_used [i] in a C-like language) must be written as separate words.  Hence
the `concat' macro, which allows a separate round of evaluation.

The call to `concat' represents the line
    .Lvector_used_\(.Li) = 0
if `\(_expr_)' meant to evaluate `_expr_' immediately.  In fact, I would like to
recommend this as a way forward.  If `\()' were extended in this way, use of GAS
would be much more straightforward, as this example demonstrates.  The
parentheses clearly delimit the expression, so it can be embedded into the
middle of a word.
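
To make the proposal concrete, here is a minimal Python sketch of how `\(expr)'
might behave.  It is purely illustrative: gas does not implement this operator,
and the names `expand' and `symbols' are made up for the example.  The sketch
handles only decimal integers, symbol names, and the operators +, - and *, with
no nested parentheses.  The template includes the trailing underscore so that
the results match the symbol names .Lvector_used_0 through .Lvector_used_255.

```python
import ast
import operator
import re

# Hypothetical model of the proposed `\(expr)' operator.
# This is NOT existing gas behavior; it only illustrates the request.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _eval(node):
    """Evaluate a tiny integer-only arithmetic syntax tree."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported expression")


def expand(line, symbols):
    r"""Replace each \(expr) in line with the current integer value of expr."""
    def substitute(match):
        # Swap symbol names for their current values, then do the arithmetic.
        expr = re.sub(r"\.?[A-Za-z_][\w.]*",
                      lambda m: str(symbols[m.group(0)]), match.group(1))
        return str(_eval(ast.parse(expr.strip(), mode="eval")))
    return re.sub(r"\\\(([^()]*)\)", substitute, line)


# Mirror the .rept loop above: count .Li down from 256 and expand each time.
symbols = {".Li": 256}
lines = []
for _ in range(256):
    symbols[".Li"] -= 1
    lines.append(expand(r".Lvector_used_\(.Li) = 0", symbols))

print(lines[0])   # .Lvector_used_255 = 0
print(lines[-1])  # .Lvector_used_0 = 0
```

Because the parentheses mark where the expression ends, the substitution can
sit in the middle of a word, which is what the `concat' workaround has to
emulate with separate macro arguments.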

Separately, the way macro arguments are parsed should be clarified.  The example
in (as.info)Macro suggests that arguments can be enclosed in double quotes, and
that the quotes are stripped before expansion, but this is not stated anywhere.

I realize that implementing `\(expr)' is not entirely trivial, so I would like
to request built-in support for M4 as a preprocessor, just like the Unix System
V assembler.  In particular, the System V assembler has a `-m' option which
invokes m4 on its source and assembles the result.  This option is important for
easy integration with `make' default rules: e.g., in GNU make, adding `-Wa,-m'
to ASFLAGS would enable using just the default rule catalog to compile such
sources; otherwise, a separate rule needs to be created and a naming scheme
(such as a suffix) be reserved for these files.

The benefit of M4 is that it is well documented, extremely capable, and offers a
clear separation between the preprocessing and assembly phases of processing,
similar to C.  It is also possible to view the output of preprocessing for
debugging, whereas various errors cause GAS to omit the listing file or to write
it out incompletely.  In such a case, correcting parse errors is extremely
tedious -- I had to resort to adding deliberate syntax errors to
zero in on the locations of actual errors.

In contrast, the use of link-time symbols as preprocessing variables in GAS
requires them to be marked as temporary (prefixed with .L in ELF files), which
is detrimental to legibility, and the GAS 1-pass approach makes it hard to
figure out what value will actually be used.

Note that the IBM mainframe assembler, for example, similarly differentiates
between variables used by the macro processor and symbols.

The only drawback I can see of such a ``phased'' approach is that symbol values
cannot direct preprocessing.  As the IBM example shows, this is a viable
trade-off.
Comment 1 Alan Modra 2010-01-12 03:14:26 UTC
You can easily add m4 processing to gas yourself by writing a tiny wrapper
script.  I see no need to add yet another feature to gas.
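
A wrapper of the kind suggested here might look like the following Python
sketch.  It is an assumption-laden illustration, not a tested tool: the function
names are hypothetical, it handles a single source file, and it covers only
direct invocation of as, not invocation through the gcc driver.  It relies on
GNU as reading its input from standard input when no source file is named.

```python
import subprocess


def build_commands(source, output, as_flags=()):
    """Return the (m4, as) command lines for one source file."""
    m4_cmd = ["m4", source]
    # No input file is named, so as reads the expanded text from stdin.
    as_cmd = ["as", *as_flags, "-o", output]
    return m4_cmd, as_cmd


def assemble_with_m4(source, output, as_flags=()):
    """Expand source with m4, pipe the result to as, return as's exit status."""
    m4_cmd, as_cmd = build_commands(source, output, as_flags)
    # check=True raises CalledProcessError if m4 itself fails.
    expanded = subprocess.run(m4_cmd, check=True, stdout=subprocess.PIPE).stdout
    return subprocess.run(as_cmd, input=expanded).returncode
```

Calling assemble_with_m4("vectors.s", "vectors.o") would then run m4 on
vectors.s and assemble the expanded text into vectors.o; both file names are
placeholders.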
Comment 2 Konrad Schwarz 2010-01-12 09:52:18 UTC
(In reply to comment #1)
> You can easily add m4 processing to gas yourself by writing a tiny wrapper
> script.  I see no need to add yet another feature to gas.

Well, like I wrote:

> This option is important for
> easy integration with `make' default rules: e.g., in GNU make,
> adding `-Wa,-m'
> to ASFLAGS would enable using just the default rule catalog to compile such
> sources; otherwise, a separate rule needs to be created and a naming scheme
> (such as a suffix) be reserved for these files.

A further reason is interaction with gcc: If as(1) is invoked via GCC (GNU
make's default rules assume this), then the wrapper script is no longer trivial.
In particular, .S files should still be operated on by the C pre-processor
first, then by as(1) (including m4), and .s files emitted by the compiler should
have access to m4 macro expansion; e.g., for GCC inline assembler.

The interface between the GCC driver and as(1) is configured via the GCC spec
file and the spec file would need to be adapted on a per-site basis.

This is decidedly more complex than an additional flag in ASFLAGS, owing to the
daunting syntax of spec files and the problem of automating such a change to the
spec file during an installation process.