Bug 18427 - GNU as slow on hppa architecture
Summary: GNU as slow on hppa architecture
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gas (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-05-18 19:15 UTC by Helge Deller
Modified: 2015-06-11 23:06 UTC (History)
3 users (show)

See Also:
Host:
Target: hppa
Build:
Last reconfirmed:


Attachments
preprocessed assembler testcase (2.29 MB, application/x-gunzip)
2015-05-18 19:15 UTC, Helge Deller
Details
initial version of a proposed patch (415 bytes, patch)
2015-05-18 19:17 UTC, Helge Deller
Details | Diff
Extended patch (1.99 KB, patch)
2015-05-29 16:20 UTC, Nick Clifton
Details | Diff
last-label.d.txt (953 bytes, text/plain)
2015-06-03 19:02 UTC, dave.anglin
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Helge Deller 2015-05-18 19:15:17 UTC
Created attachment 8323 [details]
preprocessed assembler testcase

I noticed that the GNU assembler is sometimes incredible slow with assembly of C++ files on hppa. 

Because of that I did a gprof analysis (using the cross-assembler for hppa running on x86_64) while running gas on a somewhat bigger assembler file (preprocessed from the debian vtk package).
 
Here is the result, and it's pretty clear:

Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 81.76     86.38    86.38   821598     0.11     0.11  pa_undefine_label
  3.34     89.91     3.53   194747     0.02     0.02  pa_define_label
  1.66     91.66     1.75                             read_a_source_file
  1.13     92.85     1.19  2442950     0.00     0.00  frag_offset_fixed_p
  0.73     93.62     0.77      660     1.17     1.17  do_scrub_chars
  0.63     94.29     0.67                             symbol_print_statistics
  0.55     94.87     0.58  1414839     0.00     0.00  operand
  0.46     95.36     0.49  3866615     0.00     0.00  frag_now_fix
...
  0.00    105.65     0.00        1     0.00    76.43  resolve_local_symbol_values
  0.00    105.65     0.00        1     0.00     0.00  s_app_file
  0.00    105.65     0.00        1     0.00     0.01  s_space
  0.00    105.65     0.00        1     0.00     0.00  s_weakref
  0.00    105.65     0.00        1     0.00     0.00  symbol_insert

So, the pa_undefine_label() function seems problematic.
I talked to Dave Anglin about that issue, and he came up with the attached patch. This patch seems to work and speeds up gas a lot, but it will most likely break HP-UX (which uses multiple segments).

Maybe the experts here have some other idea?
A gzipped-testcase which expands to a 21MB assembler-file is attached.
Comment 1 Helge Deller 2015-05-18 19:17:01 UTC
Created attachment 8324 [details]
initial version of a proposed patch
Comment 2 dave.anglin 2015-05-18 20:33:34 UTC
On 2015-05-18, at 3:15 PM, deller at gmx dot de wrote:

> So, the pa_undefine_label() function seems problematic.
> I talked to Dave Anglin about that issue, and he came up with the attached
> patch. This patch seems to work and speeds up gas a lot, but it will most
> likely break HP-UX (which uses multiple segments).

Actually, ELF targets such as Linux have multiple segments.  Assembly time
blows up when -ffunction-sections and -fdata-sections are used.  The lists for
each segment are poorly managed, so both the number of lists and their length
grows.

All this overhead is to find the previous label emitted in the current space/segment.

In practice, GCC never plays with spaces/segments between emitting the label and
its associated instruction.  So, the previous label is what's needed.  The tracking is
for hand crafted assembly code.

The patch doesn't affect 32-bit HP-UX SOM.  The patch might affect 64-bit HP-UX.
I haven't tried it on 64-bit HP-UX.

--
John David Anglin	dave.anglin@bell.net
Comment 3 Nick Clifton 2015-05-29 16:20:16 UTC
Created attachment 8336 [details]
Extended patch

Hi Guys,

  I dislike the idea of disabling code that might actually be needed in some circumstances.  So how about this alternative patch ?  It extends David's patch by adding a command line option: --enable-multi-seg-support.  If this option is used the old, slow, searching code is used.  Otherwise the default action is to use the new, faster search code.

  What do you think, will this work ?

Cheers
  Nick
Comment 4 dave.anglin 2015-05-29 18:36:00 UTC
On 2015-05-29 12:20 PM, nickc at redhat dot com wrote:
>   I dislike the idea of disabling code that might actually be needed in some
> circumstances.  So how about this alternative patch ?  It extends David's patch
> by adding a command line option: --enable-multi-seg-support.  If this option is
> used the old, slow, searching code is used.  Otherwise the default action is to
> use the new, faster search code.
Hi Nick,

Thanks for helping with this change.

I'm not sure about the option.  Alpha has somewhat similar code without 
an option.

As I understand it, this code is to handle a number of instructions 
which require a preceding
label.  It usually emitted on the previous line.  It seems to me that 
changing segments between
the label and its related instruction should be an error.  I don't think 
this feature is needed
even for the HP-UX SOM target.

I will check whether GCC needs this for SOM.

Dave
Comment 5 Nick Clifton 2015-06-01 08:01:21 UTC
Hi Dave,

> Thanks for helping with this change.
>
> I'm not sure about the option.  Alpha has somewhat similar code without
> an option.
>
> As I understand it, this code is to handle a number of instructions
> which require a preceding
> label.  It usually emitted on the previous line.  It seems to me that
> changing segments between
> the label and its related instruction should be an error.  I don't think
> this feature is needed
> even for the HP-UX SOM target.
>
> I will check whether GCC needs this for SOM.

OK - thanks.  If you can confirm that support for this behaviour is no 
longer needed then I will be happy to go with the simpler patch.  (Or 
maybe the simpler one + an error message if the preceding label cannot 
be found).  I just did not want to break anybody's builds because of a 
speed optimization,

Cheers
   Nick
Comment 6 dave.anglin 2015-06-03 19:02:24 UTC
Created attachment 8341 [details]
last-label.d.txt

Hi Nick,

On 2015-06-01 4:01 AM, nickc at redhat dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=18427
>
> --- Comment #5 from Nick Clifton <nickc at redhat dot com> ---
> Hi Dave,
>
>> Thanks for helping with this change.
>>
>> I'm not sure about the option.  Alpha has somewhat similar code without
>> an option.
>>
>> As I understand it, this code is to handle a number of instructions
>> which require a preceding
>> label.  It usually emitted on the previous line.  It seems to me that
>> changing segments between
>> the label and its related instruction should be an error.  I don't think
>> this feature is needed
>> even for the HP-UX SOM target.
>>
>> I will check whether GCC needs this for SOM.
> OK - thanks.  If you can confirm that support for this behaviour is no
> longer needed then I will be happy to go with the simpler patch.  (Or
> maybe the simpler one + an error message if the preceding label cannot
> be found).  I just did not want to break anybody's builds because of a
> speed optimization,
Attached is a change to implement the above handling for the "last" label.

The gas testsuite for the hpux SOM target is clean.  There are four 
fails with 64-bit
hpux ELF target.  None of these appear related to the change. Haven't 
tested linux yet.

The 64-bit gas fails are:
FAIL: gas/all/none
FAIL: Multibyte symbol names
FAIL: weak and common directives
FAIL: common and weak directives

The latter two are caused by the behavior of the .comm directive. These 
two would pass
with foobar label before the .comm.

We have for gas/all/none:
# ../as-new -o dump.o none.s
# /opt/gnu64/bin/objdump -r -w dump.o

dump.o:     file format elf64-hppa

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE
0000000000000000 R_PARISC_PCREL64  *ABS*

We have for "Multibyte symbol names:
# ../as-new -o dump.o syms.s
# /opt/gnu64/bin/readelf  -S -s -p .strtab dump.o
There are 8 section headers, starting at offset 0x120:

Section Headers:
   [Nr] Name              Type             Address           Offset
        Size              EntSize          Flags  Link  Info  Align
   [ 0]                   NULL             0000000000000000  00000000
        0000000000000000  0000000000000000           0     0     0
   [ 1] .text             PROGBITS         0000000000000000  00000040
        0000000000000000  0000000000000000  AX       0     0     1
   [ 2] .data             PROGBITS         0000000000000000  00000040
        0000000000000000  0000000000000000  WA       0     0     1
   [ 3] .bss              NOBITS           0000000000000000  00000040
        0000000000000000  0000000000000000  WA       0     0     1
   [ 4] sec▒▒tion         PROGBITS         0000000000000000  00000040
        0000000000000009  0000000000000000           0     0     1
   [ 5] .shstrtab         STRTAB           0000000000000000  000000ea
        0000000000000036  0000000000000000           0     0     1
   [ 6] .symtab           SYMTAB           0000000000000000  00000050
        0000000000000090  0000000000000018           7     6     8
   [ 7] .strtab           STRTAB           0000000000000000  000000e0
        000000000000000a  0000000000000000           0     0     1
Key to Flags:
   W (write), A (alloc), X (execute), M (merge), S (strings)
   I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
   O (extra OS processing required) o (OS specific), p (processor specific)

Symbol table '.symtab' contains 6 entries:
    Num:    Value          Size Type    Bind   Vis      Ndx Name
      0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
      1: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
      2: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
      3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
      4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
      5: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    4 sy▒▒mbol

String dump of section '.strtab':
   [tx]  sy▒▒mbol

Dave
Comment 7 Nick Clifton 2015-06-08 10:19:45 UTC
Hi Dave,

  Your patch is approved - please apply it when you have the time.

Cheers
  Nick
Comment 8 dave.anglin 2015-06-08 15:33:04 UTC
On 6/8/2015 6:19 AM, nickc at redhat dot com wrote:
>    Your patch is approved - please apply it when you have the time.
Are there guidelines for git commits somewhere?  I'll give this a try 
tonight.

Regards,
Dave
Comment 9 Nick Clifton 2015-06-08 15:48:09 UTC
Hi Dave,

> Are there guidelines for git commits somewhere?

Surprisingly: no. :-(

Basically the guidelines are as follows:

   * Test your patch before committing it.
   * Include a ChangeLog entry/entries.
   * Include a testcase if possible/appropriate.
   * In the commit message, start with a single sentence that describes 
the change, then leave a blank line, then include the changelog entries.

That's about it.

Cheers
   Nick