Bug 27666

Summary: ar doesn't create indices on Solaris/sparcv9
Product: binutils Reporter: Rainer Orth <ro>
Component: binutilsAssignee: Nick Clifton <nickc>
Status: ASSIGNED ---    
Severity: normal CC: amodra, brobecker, nickc
Priority: P2    
Version: 2.37   
Target Milestone: 2.37   
Host: Target: sparcv9-sun-solaris2.*
Build: Last reconfirmed:
Attachments: Proposed patch

Description Rainer Orth 2021-03-30 12:35:21 UTC
On binutils master, quite a number of tests currently fail on Solaris/sparcv9.

E.g.

FAIL: ar thin archive

and several more where the Archive index: section is missing from the nm --print-armap output.

Likewise, several ld tests FAIL like this:

./ld-new: tmpdir/libpr25910.a: error adding symbols: archive has no index; run ranlib to add one
failed with: <./ld-new: tmpdir/libpr25910.a: error adding symbols: archive has no index; run ranlib to add one>, no expected output
FAIL: --export-dynamic-symbol foo archive

And indeed, even with an explicit ar rs no index is created.  This is not very
easy to diagnose, it seems: I needed to use file to note the common issue of
missing archive index.

Looking back, this has worked in binutils 2.35, but was already broken in 2.36.

Using side-by-side debugging with (working) Solaris/amd64 and (broken)
Solaris/sparcv9, I found that in _bfd_write_archive_contents hasobjects 
remains 0 after the bfd_check_format call.

In format.c (bfd_check_format_matches), the second call to

BFD_SEND_FMT (abfd, _bfd_check_format, (abfd))

on l.343 matches only once on amd64 (for x86_64_elf64_sol2_vec), while it
matches twice on sparcv9 (both for sparc_elf64_sol2_vec and sparc_elf64_vec).

That sort of rang a bell and I found that

https://sourceware.org/pipermail/binutils/2021-January/114778.html

commit 3677b7296185e6abfe8327c00c460712151ade15
Author: Alan Modra <amodra@gmail.com>
Date:   Wed Jan 6 12:01:10 2021 +1030

    sparc-sun-solaris2 and sparc64-sun-solaris2 config

had added the non-_sol2 vectors on Solaris/SPARC.  Reverting the config.bfd
part of the patch fixed the archive index issue.

I believe that I had omitted the non-_sol2 vectors for a reason back in 2010:

http://sourceware.org/ml/binutils/2010-10/msg00348.html

There were similar ambiguities IIRC.

On top of that, I believe the failing tests are better fixed differently:

* You've already taken the first step in making the dump files both forms of
  the target names, with and without the _sol2 suffix.

* The next step, I believe, would be to pass -m..._sol2 on Solaris (both sparc
  and x86).

* The ugliest part is that due to differences in the Solaris ABI, the dumps
  differ from the Linux ones.  Unfortunately, there's no way to represent
  conditional sections in the dumps and keeping separate copies almost guarantees
  that they diverge in the future.
Comment 1 Rainer Orth 2021-05-11 12:41:24 UTC
When I tried a sparcv9-sun-solaris2.11 build of gdb master in
preparation for the upcoming GDB 11 release, I found that the impact of
this bug is way greater than just a couple of binutils testsuite
failures: it causes many hundreds of gdb testsuite failures due to the
same ambiguity:

(gdb) break increment
Breakpoint 1 at 0x100020368: file /vol/src/gnu/gdb/hg/master/dist/gdb/testsuite/gdb.ada/O2_float_param/callee.adb, line 19.
(gdb) run
Starting program: /vol/obj/gnu/gdb/gdb/11.4-sparcv9-dist/gdb/testsuite/outputs/gdb.ada/O2_float_param/foo
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
Error while mapping shared library sections:
`/lib/64/libc.so.1': not in executable format: file format is ambiguous
Error while mapping shared library sections:
`/lib/sparcv9/ld.so.1': not in executable format: file format is ambiguous

As with binutils, reverting just the bfd/config.bfd part of the patch
returns gdb testresults back to normal.

Unless there's a considerably better (and sufficiently safe)
alternative, this is what I believe should be done.
Comment 2 Nick Clifton 2021-05-17 19:30:01 UTC
Created attachment 13455 [details]
Proposed patch

Hi Rainer,

  Would you mind trying out the uploaded patch ?

  I am working on a theory that the real bug is that _bfd_write_archive_contents() is not creating the symbol index.  As you noted the call to bfd_check_format() fails, because there are two potential matches.  But I do not think that this is grounds for the archiver to decide that there are no objects in the library.

  Even with this patch applied, most of the linker tests that are failing continue to do so, but I have not yet investigated why.  I just wanted to show you my ida, and see if you thought that it might be worth persuing.

Cheers
  Nick
Comment 3 Rainer Orth 2021-06-01 09:48:43 UTC
> --- Comment #2 from Nick Clifton <nickc at redhat dot com> ---
> Created attachment 13455 [details]
>   --> https://sourceware.org/bugzilla/attachment.cgi?id=13455&action=edit
> Proposed patch

Hi Nick,

sorry for the long delay: I've been on vacation for some time.

>   Would you mind trying out the uploaded patch ?
>
>   I am working on a theory that the real bug is that
> _bfd_write_archive_contents() is not creating the symbol index.  As you noted
> the call to bfd_check_format() fails, because there are two potential matches. 
> But I do not think that this is grounds for the archiver to decide that there
> are no objects in the library.
>
>   Even with this patch applied, most of the linker tests that are failing
> continue to do so, but I have not yet investigated why.  I just wanted to show
> you my ida, and see if you thought that it might be worth persuing.

As a first step, I've tested the patch on both amd64-pc-solaris2.11 (no
changes in test results) and sparcv9-sun-solaris2.11.  The situation is
mixed here:

@@ -45 +44,0 @@
-FAIL: --export-dynamic-symbol foo archive
@@ -48 +46,0 @@
-FAIL: --export-dynamic-symbol-list foo archive
@@ -55 +53,2 @@
-FAIL: ld link shared library
+FAIL: ld export symbols from archive

spawn [open ...]^M
0000000000100250 A _DYNAMIC
0000000000100330 A _GLOBAL_OFFSET_TABLE_
0000000000100400 A _PROCEDURE_LINKAGE_TABLE_
                 U exclude_sym
0000000000100338 D include_sym

+FAIL: ld don't exclude symbols from archive - --exclude-libs foo:bar

spawn [open ...]^M
0000000000100250 A _DYNAMIC
0000000000100330 A _GLOBAL_OFFSET_TABLE_
0000000000100400 A _PROCEDURE_LINKAGE_TABLE_
                 U exclude_sym
0000000000100338 D include_sym

@@ -109 +107,0 @@
-FAIL: ld-misc/defsym1

In both failing cases, exclude_common is missing.  The symbol is present
in exclude2.o, but missing from exclude.so.
Comment 4 Sourceware Commits 2021-06-09 10:11:59 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f75bcf7e574c511f0ea742e05e5ae6e19bf29a92

commit f75bcf7e574c511f0ea742e05e5ae6e19bf29a92
Author: Nick Clifton <nickc@redhat.com>
Date:   Wed Jun 9 11:10:16 2021 +0100

    Fix the creation of archives for Sparc Solaris2 targets by eliminating the support for generic SPARC ELF files.
    
            PR 27666
    bfd     * config.bfd: Do not add the sparc_elf32_vec or sparc_elf64_vec
            vectors to Sparc Solaris2 targets.
    
    ld      * testsuite/ld-sparc/sparc.exp: Do not run the sparctests or
            sparc64tests for Solaris2 targets.
Comment 5 Joel Brobecker 2021-06-13 14:18:13 UTC
(Thanks Nick for the patch)

Hi Rainer,

If I read Nick's patch correctly, I think things should be back to normal again in terms of behavior, thus no longer blocking the GDB 11 release. Would you be able to double-check that it's the case?

Thank you!
Comment 6 Rainer Orth 2021-06-17 07:38:57 UTC
> --- Comment #5 from Joel Brobecker <brobecker at gnat dot com> ---
> (Thanks Nick for the patch)
>
> Hi Rainer,
>
> If I read Nick's patch correctly, I think things should be back to normal again
> in terms of behavior, thus no longer blocking the GDB 11 release. Would you be
> able to double-check that it's the case?

Hi Joel,

that's indeed the case: thinks are now as they were before.
Comment 7 Rainer Orth 2021-06-17 07:52:44 UTC
> --- Comment #5 from Joel Brobecker <brobecker at gnat dot com> ---
> (Thanks Nick for the patch)

Indeed, thanks a lot.

However, I think we can do better than disabling the ld sparc*tests on
Solaris.  Even if we omit support for the elf32_sparc and and
elf64_sparc emulations, it should be possible to handle that with a
snippet like the following in sparc.exp (tested a very long time ago,
not sure if it still works as is):

# Solaris/SPARC ld uses elf??-sparc_sol2 emulations.
if { [istarget "*-*-solaris*"] } {
    set options_regsub(ld) {-m(\\S+) -m\\1_sol2}
    verbose "solaris: options_regsub(ld)"
}

Alan has also done some of the work by making the dump files accept both
elf32-sparc and elf32-sparc-sol2.  The ugliest part (or one which really
requires a general solution) is that the Solaris ABI couple of
additional symbols into the output files, like

/var/gcc/binutils/sparcv9/obj/binutils/ld/../binutils/readelf -WSsrl tmpdir/libtlssunpic32.so
regexp_diff match failure
regexp "^.* TLS +GLOBAL +DEFAULT +7 sg8$"
line   "     4: 00012188     0 OBJECT  LOCAL  DEFAULT   11 _END_"
regexp_diff match failure
regexp "^.* TLS +GLOBAL +DEFAULT +7 sg3$"
line   "     5: 00001000     0 OBJECT  LOCAL  DEFAULT    6 _START_"
[...]

One possible solution to that is to have separate copies for Solaris (as
done in gas in a few cases for x86), but that quickly becomes
unmaintainable with master dump and Solaris copy beginning to drift
apart.

What I think would be a general solution is having support for
conditionals in the dump files, something like

  #cond <cond>: GLOB|PROC

  set condition cond to true if GLOB matches or PROC returns true, false
  otherwise, e.g.

  #cond SOLARIS: *-*-solaris2* 

  run_dump_test calls to regexp_diff for comparisons

  #if:<cond>	<rest of line>

  e.g.

  #if:SOLARIS	<rest of line>

  need more:

  #if:SOLARIS
  #else

  for individual lines, maybe also for blocks?

  then we need

  #if:SOLARIS
  <lines>
  #else
  <lines>
  #endif

  no nesting for simplicity's sake

Just a general idea because this issue seems to come up on other
targets, too.  I haven't even started thinking about implemented this
and probably won't for quite some time.

	Rainer