backslashes in quoted symbol names

Nick Clifton nickc@redhat.com
Mon Sep 13 10:14:15 GMT 2021


Hi Jan,

   Sorry for taking so long to reply to your email :-(

> I've been slowly making progress with this (without limiting the
> diagnostic to PE); I'm now at a point where only some odd testsuite
> fallout is left (for the extension to the existing test and one other
> one), which I'd rather only spend time on looking into if the general
> approach taken is deemed acceptable. The present draft patch is
> below; two prereq patches (which I think have merit in their own
> right) are attached.

I have applied the patches - they definitely make sense.

> The main question really is that of get_symbol_name() and
> read_symbol_name() acting quite differently when it comes to quoted
> symbols. At least in case read_symbol_name() was to represent the
> "canonical" model, I don't feel it to be in scope for me to address
> this more fundamental issue, yet I could see this to be viewed as the
> only sensible way out of the mess. (I think it wouldn't be overly
> much effort to re-implement read_symbol_name() to be backed by
> get_symbol_name(), so if that was the route to go, I might at least
> make an attempt - so long as the present very limited handling of
> escaped characters would be sufficient, which would mean the
> elf/syms.s testcase would have to change.)
> 
> As to testsuite fallout:
> 1) s_{nios2,pru}_set() use get_symbol_name() while s_set() uses
>     read_symbol_name(). I wonder whether I wouldn't better leave the
>     target specific functions alone and switch .set in the elf/syms.s
>     testcase to .equ, .eqv, or .equiv (presumably then also allowing
>     the #notarget: to be dropped from there).

Actually that is a very good idea.

> 2) {powerpc,rs6000}-ibm-aix*, tic30-coff, and z80-coff apparently
>     have not yet understood (by me) parsing issues. For ppc these are
>     on two of the .globl being added to all/quoted-sym-names.s, yet I
>     can't spot the target overriding the generic processing of the
>     directive, so I'm puzzled. I didn't look at the others in any
>     detail.

No worries.  I am not too concerned by AIX peculiarities, and as long as
basic symbol name parsing carries on working, I doubt that AIX users will
ever care.

> gas: rework handling of backslashes in quoted symbol names

The patch looks good to me.  If you are happy with it, please
go ahead and apply it.  I would much rather that we make the change
now than have the patch languish any longer.


> Strange effects can result from the present handling, e.g.:
> 
> .if 1
> "backslash\\":
> .endif
> 
> yields first (correctly) "missing closing `"'" but then also "invalid
> character '\' in mnemonic" and further "end of file inside conditional".
> Symbol names ending in \ are in principle not expressible with that
> scheme.

Meh - once you have a syntax error, further slightly odd error messages
are not uncommon, and certainly nothing to prevent the patch from being
applied.


> I'm actually wondering whether the storing of 0 is really necessary when
> we did _not_ find a symbol name.

Yeah - I would have to look at the code again, but I would guess that it
is something to do with generating error messages.


> But perhaps a more fundamental question is whether altering the input
> buffer is okay; 

It is done in other places, so there is definitely a precedent.
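
To make the in-place idea concrete, here is a minimal sketch in C.  It is
not the gas code: unquote_in_place() is a hypothetical helper, and the
escape set (only \\ and \" are recognised) is an assumption made for
illustration.  The name is compacted over the input text itself, so no
allocation is needed, and a name ending in a backslash becomes expressible.

```c
#include <stddef.h>

/* Hypothetical helper, not the gas implementation.
   On entry *pp points at the opening '"'.  On success the unescaped
   name is written over the input, starting just after the opening
   quote, and NUL-terminated; *pp is advanced past the closing quote
   and the start of the name is returned.  Returns NULL if there is no
   opening quote or no closing quote before end of line; in the latter
   case the buffer may already have been partially compacted.  */
char *
unquote_in_place (char **pp)
{
  char *src = *pp;
  char *dst, *name;

  if (*src != '"')
    return NULL;
  name = dst = ++src;

  while (*src != '\0' && *src != '\n')
    {
      if (*src == '"')
        {
          /* Closing quote: terminate the name inside the buffer.  */
          *dst = '\0';
          *pp = src + 1;
          return name;
        }
      /* Assumed escape set: only \\ and \" drop the backslash.  Any
         other backslash is copied through unchanged.  */
      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
        src++;
      *dst++ = *src++;
    }

  return NULL;
}
```

With these rules the input "backslash\\": yields the name backslash\ and
leaves the pointer on the colon, while "backslash\": is reported as having
no closing quote.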


> After all besides
> get_symbol_name() there also is read_symbol_name(), a comment of which
> says that it allocates a buffer because of escape character handling. I
> have no idea why there are two functions for apparently the same purpose
> in the first place,

It is bound to be a historical thing, rather than a planned design.
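
For comparison with rewriting the input buffer, here is a minimal sketch of
the allocating approach that the comment on read_symbol_name() alludes to.
Again this is not the gas code: unquote_to_copy() is a hypothetical name,
and the escape set (only \\ and \" are recognised) is an assumption.  The
input is left untouched and the caller owns the returned buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the gas implementation.
   IN points at the opening '"'.  Returns a freshly allocated,
   NUL-terminated copy of the unescaped name, or NULL on a missing
   quote or allocation failure.  If ENDP is non-NULL it receives a
   pointer just past the closing quote.  IN is never modified.  */
char *
unquote_to_copy (const char *in, const char **endp)
{
  const char *src;
  char *out, *dst;

  if (*in != '"')
    return NULL;

  /* The unescaped name is never longer than the text after the
     opening quote, so strlen (in) bytes leave room for the NUL.  */
  out = malloc (strlen (in));
  if (out == NULL)
    return NULL;
  dst = out;

  for (src = in + 1; *src != '\0' && *src != '\n'; )
    {
      if (*src == '"')
        {
          *dst = '\0';
          if (endp != NULL)
            *endp = src + 1;
          return out;
        }
      /* Assumed escape set: only \\ and \" drop the backslash.  */
      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
        src++;
      *dst++ = *src++;
    }

  /* No closing quote before end of line.  */
  free (out);
  return NULL;
}
```

The cost is one allocation per name; the benefit is that the original line
stays intact, which can matter if it is later echoed in a diagnostic or a
listing.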

Cheers
   Nick


