Bug 22870 - slow aarch64 assembler for source with lots of .loc directives
Summary: slow aarch64 assembler for source with lots of .loc directives
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Alexandre Oliva
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-20 20:39 UTC by aldyh
Modified: 2019-06-16 10:03 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
aarch64 assembler testcase (3.35 MB, application/gzip)
2018-02-20 20:39 UTC, aldyh
Details

Note You need to log in before you can comment on or make changes to this bug.
Description aldyh 2018-02-20 20:39:30 UTC
Created attachment 10840 [details]
aarch64 assembler testcase

Hi.

After the following patch in GCC land, providing finer grained .loc directives for inline functions, assembler time is significantly higher:

https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00481.html

tor:~/bld/aarch64-combined/gcc$ time ./as poly-big.s
real    0m20.108s
user    0m19.847s
sys     0m0.203s
tor:~/bld/aarch64-combined/gcc$ time ./as poly-small.s
real    0m0.622s
user    0m0.574s
sys     0m0.046s
tor:~/bld/aarch64-combined/gcc$ grep \.loc poly-big.s|wc
 103773  784669 4036367
tor:~/bld/aarch64-combined/gcc$ grep \.loc poly-small.s|wc
  30423  210225 1096181

I am reporting this, just in case the binutils team would like this addressed.

This was originally reported as PR84408 in GCC:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84408
Comment 1 Alan Modra 2018-02-21 00:05:21 UTC
gas is generating enormous chains of symbols.

(gdb) set args ../pr22870-big.s
(gdb) r
Starting program: /home/alan/build/gas/aarch64-linux/gas/as-new ../pr22870-big.s 
  C-c C-c
Program received signal SIGINT, Interrupt.
0x0000000000423277 in S_GET_SEGMENT (s=<optimised out>) at /home/alan/src/binutils-gdb/gas/symbols.c:2221
2221	  return s->bsym->section;
(gdb) up
#1  resolve_symbol_value (symp=0xb93b70, symp@entry=0xb93ab0) at /home/alan/src/binutils-gdb/gas/symbols.c:1177
1177		  seg_left = S_GET_SEGMENT (add_symbol);
(gdb) p *add_symbol
$1 = {sy_flags = {sy_local_symbol = 1, sy_written = 0, sy_resolved = 0, sy_resolving = 0, sy_used_in_reloc = 0, sy_used = 0, sy_volatile = 0, sy_forward_ref = 0, sy_mri_common = 0, sy_weakrefr = 0, sy_weakrefd = 0}, bsym = 0x7df660, sy_value = {X_add_symbol = 0xb938a0, X_op_symbol = 0xb93970, X_add_number = 0, X_op = O_illegal, X_unsigned = 0, X_extrabit = 0, X_md = 0}, sy_next = 0x1304c, sy_previous = 0x0, sy_frag = 0x0, sy_obj = {local = 12140608, size = 0x0, versioned_name = 0x0}, sy_tc = 0}
(gdb) p *((struct local_symbol *)add_symbol)
$2 = {lsy_flags = {sy_local_symbol = 1, sy_written = 0, sy_resolved = 0, sy_resolving = 0, sy_used_in_reloc = 0, sy_used = 0, sy_volatile = 0, sy_forward_ref = 0, sy_mri_common = 0, sy_weakrefr = 0, sy_weakrefd = 0}, lsy_section = 0x7df660, lsy_name = 0xb938a0 ".LVU4166", u = {lsy_frag = 0xb93970, lsy_sym = 0xb93970}, lsy_value = 0}
(gdb) up
#2  0x0000000000422d80 in resolve_symbol_value (symp=0xb96020, symp@entry=0xb95f60) at /home/alan/src/binutils-gdb/gas/symbols.c:1176
1176		  left = resolve_symbol_value (add_symbol);
(gdb) p *add_symbol
$3 = {sy_flags = {sy_local_symbol = 1, sy_written = 0, sy_resolved = 0, sy_resolving = 0, sy_used_in_reloc = 0, sy_used = 0, sy_volatile = 0, sy_forward_ref = 0, sy_mri_common = 0, sy_weakrefr = 0, sy_weakrefd = 0}, bsym = 0x7df660, sy_value = {X_add_symbol = 0xb93aa0, X_op_symbol = 0xb93b70, X_add_number = 0, X_op = O_illegal, X_unsigned = 0, X_extrabit = 0, X_md = 0}, sy_next = 0x1304c, sy_previous = 0x0, sy_frag = 0x0, sy_obj = {local = 12140800, size = 0x0, versioned_name = 0x0}, sy_tc = 0}
(gdb) p *((struct local_symbol *)add_symbol)
$4 = {lsy_flags = {sy_local_symbol = 1, sy_written = 0, sy_resolved = 0, sy_resolving = 0, sy_used_in_reloc = 0, sy_used = 0, sy_volatile = 0, sy_forward_ref = 0, sy_mri_common = 0, sy_weakrefr = 0, sy_weakrefd = 0}, lsy_section = 0x7df660, lsy_name = 0xb93aa0 ".LVU4167", u = {lsy_frag = 0xb93b70, lsy_sym = 0xb93b70}, lsy_value = 0}
(gdb) up 99999
#4663 main (argc=<optimised out>, argv=<optimised out>) at /home/alan/src/binutils-gdb/gas/as.c:1283
1283	  perform_an_assembly_pass (argc, argv);
(gdb) down
#4662 0x0000000000403fb8 in perform_an_assembly_pass (argv=0x7c8e98, argc=<optimised out>) at /home/alan/src/binutils-gdb/gas/as.c:1161
1161		  read_a_source_file (*argv);
(gdb) 
#4661 0x000000000041ec4b in read_a_source_file (name=<optimised out>) at /home/alan/src/binutils-gdb/gas/read.c:1148
1148			      (*pop->poc_handler) (pop->poc_val);
(gdb) 
#4660 0x000000000041c32f in cons_worker (nbytes=8, rva=0) at /home/alan/src/binutils-gdb/gas/read.c:4081
4081		  ret = TC_PARSE_CONS_EXPRESSION (&exp, (unsigned int) nbytes);
(gdb) 
#4659 0x000000000040f49d in expr (rankarg=rankarg@entry=0, resultP=resultP@entry=0x7fffffffda80, mode=mode@entry=expr_normal) at /home/alan/src/binutils-gdb/gas/expr.c:1807
1807	  retval = operand (resultP, mode);
(gdb) 
#4658 0x000000000040f21f in operand (expressionP=expressionP@entry=0x7fffffffda80, mode=mode@entry=expr_normal) at /home/alan/src/binutils-gdb/gas/expr.c:1357
1357		      expressionP->X_add_number = S_GET_VALUE (symbolP);
(gdb) p *symbolP
$5 = {sy_flags = {sy_local_symbol = 0, sy_written = 0, sy_resolved = 0, sy_resolving = 1, sy_used_in_reloc = 0, sy_used = 1, sy_volatile = 0, sy_forward_ref = 0, sy_mri_common = 0, sy_weakrefr = 0, sy_weakrefd = 0}, bsym = 0xf7fc70, sy_value = {X_add_symbol = 0xf81fd0, X_op_symbol = 0x0, X_add_number = 1, X_op = O_symbol, X_unsigned = 1, X_extrabit = 0, X_md = 0}, sy_next = 0xf827c0, sy_previous = 0xf82290, sy_frag = 0x7c6f60 <zero_address_frag>, sy_obj = {local = 0, size = 0x0, versioned_name = 0x0}, sy_tc = 0}
(gdb) p *symbolP->bsym
$6 = {the_bfd = 0x7de110, name = 0xf82300 ".LVU8808", value = 0, flags = 0, section = 0x7a0590 <_bfd_std_section+560>, udata = {p = 0x0, i = 0}}
(gdb) down
#4657 0x000000000042386c in S_GET_VALUE (s=<optimised out>, s@entry=0xf82310) at /home/alan/src/binutils-gdb/gas/symbols.c:1964
1964	      valueT val = resolve_symbol_value (s);
(gdb) 
#4656 0x0000000000422d80 in resolve_symbol_value (symp=symp@entry=0xf82310) at /home/alan/src/binutils-gdb/gas/symbols.c:1176
1176		  left = resolve_symbol_value (add_symbol);
(gdb) p *add_symbol
$7 = {sy_flags = {sy_local_symbol = 1, sy_written = 0, sy_resolved = 0, sy_resolving = 0, sy_used_in_reloc = 0, sy_used = 0, sy_volatile = 0, sy_forward_ref = 0, sy_mri_common = 0, sy_weakrefr = 0, sy_weakrefd = 0}, bsym = 0x7df660, sy_value = {X_add_symbol = 0xf81fc0, X_op_symbol = 0xf82090, X_add_number = 0, X_op = O_illegal, X_unsigned = 0, X_extrabit = 0, X_md = 0}, sy_next = 0x1304c, sy_previous = 0x0, sy_frag = 0x0, sy_obj = {local = 16251728, size = 0x0, versioned_name = 0x0}, sy_tc = 0}
(gdb) p *((struct local_symbol *)add_symbol)
$8 = {lsy_flags = {sy_local_symbol = 1, sy_written = 0, sy_resolved = 0, sy_resolving = 0, sy_used_in_reloc = 0, sy_used = 0, sy_volatile = 0, sy_forward_ref = 0, sy_mri_common = 0, sy_weakrefr = 0, sy_weakrefd = 0}, lsy_section = 0x7df660, lsy_name = 0xf81fc0 ".LVU8807", u = {lsy_frag = 0xf82090, lsy_sym = 0xf82090}, lsy_value = 0}
(gdb) 

So .LVU8808 is defined from .LVU8807, which is defined from ... all the way down to at least .LVU4166 (where the symbol resolution was interrupted) and I'll bet down to .LVU1.  What's more, this symbol resolution is happening before finalize_syms is set so the value calculated for .LVU8808 won't be saved and thus won't cut short the recursion when gas wants the value for .LVU8809.
Comment 2 Alexandre Oliva 2018-02-21 00:30:54 UTC
I suspect the source of the problem is the lack of view reset asserts.
Without that, the assembler gets an uninterrupted chain 
of symbolic views, in which each view is computed based on the view 
before it and the offset between the addresses in which they were 
issued.  Depending on the order in which the assembler attempts to 
resolve each view to a constant, it could get expensive.  It was never 
meant for this kind of use: it was expected that most views would be 
zero-asserts, so this wouldn't arise.

The issuance of view asserts was disabled in GCC, triggering this
undesirable behavior in the assembler, but I'm not sure I'd consider it
a bug in gas.  GCC can and should avoid these long chains, issuing view
reset asserts after insns known to generate code.  It has code to do so,
but it's disabled because a few targets had wrong lengths and need
auditing and possibly overriding in a target hook.
Comment 3 Alan Modra 2018-02-21 00:54:54 UTC
> GCC can and should avoid these long chains, issuing view
> reset asserts after insns known to generate code.

That suggest a solution.  The assembler certainly knowns when code has been generated, so can it "reset the view"?  dwarf2_emit_insn comes to mind as a likely place to implement this.
Comment 4 Alexandre Oliva 2018-02-22 18:16:38 UTC
IIRC it already does that; it only emits symbolic expressions when, at the time it sees the .loc directive, it can't yet determine whether . > .LVL(#-1) (IOW, have we advanced PC since the previous view in the same subsection?).  If we can do more to resolve these early, we will certainly cut short the expression bloat and dep chain.

Now, even if we can't determine that earlier, we could still do better.  We multiply the negation of the above by (.LVU# + 1) (IOW, the successor to the previous view).  If the multiplication could take a shortcut if one of the operands is found to be zero (it often will be, in this particular case) and resolve the result to zero (not necessarily for good, since relaxing might turn an align to nothing), we'd get much shorter evaluations overall.

If optimizing multiplication is no good, maybe we could introduce a ternary operator for internal use, for use in this case?


All that said, GCC will soon cut the chains short at function entry points, where it will force a view reset.  This should shrink the chain lengths significantly, but I guess we can still use any of the suggested above to improve it so we only hit the O(2^n) case with small n.
Comment 5 Alexandre Oliva 2018-02-22 18:17:51 UTC
Mine, just awaiting some feedback on the proposed courses of action.
Comment 6 Alexandre Oliva 2019-06-16 10:03:16 UTC
Uhh...  Do you still observe this slow down?  We've fixed other instances of this at various times.  Please reopen if it's still observable.