Bug 29107 - gas testsuite parallel jobs fail (gprofng?)
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gprofng
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Vladimir Mezentsev
Reported: 2022-05-01 00:13 UTC by Toolybird
Modified: 2024-01-19 02:19 UTC (History)

Last reconfirmed: 2022-10-10 00:00:00

Description Toolybird 2022-05-01 00:13:51 UTC
** Please reassign this to gprofng if you believe the problem lies there **

I'm building binutils trunk on x86_64 and seeing testsuite failures in gas, but only in the following circumstances:

 - gprofng is enabled
 - --enable-shared
 - the whole binutils testsuite is run with parallel make jobs

Inside the log file the errors look like this:

exited abnormally with 1, output:/build/binutils/src/binutils-build/binutils/.libs/lt-objdump: error while loading shared libraries: /build/binutils/src/binutils-build/opcodes/.libs/libopcodes-2.38.50.20220430.so: file too short

/build/binutils/src/binutils-build/gas/.libs/lt-as-new: error while loading shared libraries: /build/binutils/src/binutils-build/opcodes/.libs/libopcodes-2.38.50.20220430.so: invalid ELF header

Bisection (obviously) points to the introduction of gprofng.

Any thoughts? Thanks.


Steps to reproduce:

1. ../binutils-gdb/configure --prefix=/usr --with-system-zlib --enable-shared --disable-{gdb,gdbserver,libbacktrace,libdecnumber,readline,sim}

2. make -j16 -O tooldir=/usr V=1

3. make -k -j16 -O check
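
After step 3, one quick way to tell whether a run hit this problem is to search the testsuite logs for the two loader messages quoted above. This is only a sketch: it assumes GNU grep and that the logs are the `*.log` files somewhere under the build directory, and `count_loader_errors` is a made-up helper name.

```shell
# Hypothetical helper: count log lines that contain either loader error
# from the description. Assumes GNU grep (-r and --include).
count_loader_errors() {
    build_dir=${1:-.}
    grep -rhE --include='*.log' \
        'error while loading shared libraries: .*(file too short|invalid ELF header)' \
        "$build_dir" | wc -l
}
```

For example, `count_loader_errors /build/binutils/src/binutils-build` prints the number of matching lines; anything other than 0 suggests the run was affected.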
Comment 1 Nick Clifton 2022-05-18 14:52:50 UTC
(In reply to Toolybird from comment #0)
 
> Any thoughts? Thanks.
 
Don't run the testsuites in parallel ?  :-)  

OK, so not helpful.  Instead some questions - if you run the builds in parallel, but then the checks sequentially, does everything work ?  How about the other way around (sequential builds, parallel tests) ?
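
The two experiments could be run roughly like this. It is a sketch that reuses the make invocations from comment #0 and only varies the job counts; `run_experiment` is a made-up name, and each experiment is assumed to start from a freshly configured build directory.

```shell
# Run the build and the checks with independent job counts.
# MAKE can be overridden (e.g. MAKE="echo make") to preview the commands.
run_experiment() {
    build_jobs=$1
    check_jobs=$2
    ${MAKE:-make} -j"$build_jobs" -O tooldir=/usr V=1 &&
        ${MAKE:-make} -k -j"$check_jobs" -O check
}

# Parallel build, sequential checks:   run_experiment 16 1
# Sequential build, parallel checks:   run_experiment 1 16
```

If only the second variant fails, the problem is in the check phase rather than in the build itself.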

The thing that struck me about the error message was that it looks like the opcodes library is in the process of being built whilst the tests are running.  So maybe - just a theory - the gprofng testsuite is triggering a *re-build* of the opcodes library for some reason.  (Because of some strange makefile dependency maybe ?)

Another wild guess - maybe adding the .NOTPARALLEL: pseudo-target to the gprofng Makefile.am would help ?  Just before the check-DEJAGNU: target ?

That is about all that comes to mind at the moment, however.  Sorry.
Comment 2 Xi Ruoyao 2022-10-03 09:27:55 UTC
(In reply to Nick Clifton from comment #1)
> (In reply to Toolybird from comment #0)
>  
> > Any thoughts? Thanks.
>  
> Don't run the testsuites in parallel ?  :-)  
> 
> OK, so not helpful.  Instead some questions - if you run the builds in
> parallel, but then the checks sequentially, does everything work.  How about
> the other way around (sequential builds, parallel tests) ?
> 
> The thing that struck me about the error message was that it looks like the
> opcodes library is in the process of being built whilst the tests are
> running.  So maybe - just a theory - the gprofng testsuite is triggering a
> *re-build* of the opcodes library for some reason.  (Because of some strange
> makefile dependency maybe ?)

The reason is that the gprofng tests install libopcodes.so into a temporary install tree, and libtool (a stupid thing trying to be *too* clever, I must add) relinks a shared object when it is installed.
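
The failure mode can be shown in miniature. The sketch below is a simplification: it assumes the relink leaves an incomplete file at the library's path for some time (simulated here by truncating it), which is the window in which a test started by another job fails to load it. The file name is made up and the file is not a real ELF object.

```shell
workdir=$(mktemp -d)
lib="$workdir/libdemo.so"      # stand-in for libopcodes-*.so

printf 'complete library contents\n' > "$lib"
size_before=$(wc -c < "$lib")

# Start of the simulated relink: the output is rewritten from scratch.
: > "$lib"
size_during=$(wc -c < "$lib")  # what a test starting right now would see

# End of the simulated relink: the full contents are back.
printf 'complete library contents\n' > "$lib"
size_after=$(wc -c < "$lib")

echo "before=$size_before during=$size_during after=$size_after"
rm -rf "$workdir"
```

The size seen in the middle is zero, which corresponds to the "file too short" case; a partially written file would give "invalid ELF header" instead.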

> Another wild guess - maybe adding the .NOTPARALLEL: pseudo-target to the
> gprofng Makefile.am would help ?  Just before the check-DEJAGNU: target ?

I'll try.
Comment 3 Xi Ruoyao 2022-10-03 09:51:15 UTC
(In reply to Xi Ruoyao from comment #2)
> (In reply to Nick Clifton from comment #1)

> > Another wild guess - maybe adding the .NOTPARALLEL: pseudo-target to the
> > gprofng Makefile.am would help ?  Just before the check-DEJAGNU: target ?
> 
> I'll try.

Hmm, it does not work for me.  AFAIK it's a race between check-gas and check-gprofng, so changing something in gprofng subdirectory won't help...
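
If that is right, one possible stopgap (a sketch, not something tested in this report) is to keep check-gprofng out of the parallel run and start it only after the other suites have finished. `check-gas` and `check-gprofng` are the targets named above; `check-binutils` and `check-ld` are assumed to exist next to them at the top level, and `check_with_gprofng_last` is a made-up name.

```shell
# Run the suites that load libopcodes from the build tree first, in parallel,
# then run the gprofng suite on its own.
# MAKE can be overridden (e.g. MAKE="echo make") to preview the commands.
check_with_gprofng_last() {
    jobs=${1:-16}
    status=0
    ${MAKE:-make} -k -j"$jobs" -O check-gas check-binutils check-ld || status=$?
    ${MAKE:-make} -k check-gprofng || status=$?
    return "$status"
}
```

The relink still happens during check-gprofng, but nothing else is trying to load the library at that point.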
Comment 4 Nick Clifton 2022-10-03 14:14:11 UTC
(In reply to Xi Ruoyao from comment #3)
 
> Hmm, it does not work for me.  AFAIK it's a race between check-gas and
> check-gprofng, so changing something in gprofng subdirectory won't help...

Agreed.  

This appears to be an artifact of gprofng's testing environment.  In
particular, gprofng/testsuite/config/default.exp contains:

  # Make a temporary install dir to run gprofng from, and point at it
  remote_exec host "sh -c \"rm -rf tmpdir; mkdir -p tmpdir; $MAKE -C .. install-gprofng program_transform_name= DESTDIR=`pwd`/tmpdir/root\""

Which is probably the trigger.  I am reassigning this PR to gprofng in 
the hopes that one of the maintainers will know why the installation is
necessary and may be able to come up with a workaround.
Comment 5 Vladimir Mezentsev 2022-10-06 05:16:36 UTC
This is a side-effect of our fake installation for gprofng testing:
>>  # Make a temporary install dir to run gprofng from, and point at it
>>  remote_exec host "sh -c \"rm -rf tmpdir; mkdir -p tmpdir; $MAKE -C .. install-gprofng 
>>  program_transform_name= DESTDIR=`pwd`/tmpdir/root\""

 I will fix this and will not run '$MAKE install-gprofng'.
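
Running the tests without an installation generally means pointing the dynamic loader at the libraries in the build tree. The following is only a sketch of that general idea, not the change itself: the directory names passed in are assumptions, and `build_tree_lib_path` is a made-up helper.

```shell
# Print a colon-separated list of <build_root>/<dir>/.libs for each <dir>,
# suitable for LD_LIBRARY_PATH. Only directories that exist are included.
build_tree_lib_path() {
    build_root=$1
    shift
    path=
    for dir in "$@"; do
        libs="$build_root/$dir/.libs"
        [ -d "$libs" ] || continue
        path=${path:+$path:}$libs
    done
    printf '%s\n' "$path"
}
```

A test driver could then export something like `LD_LIBRARY_PATH=$(build_tree_lib_path "$PWD" bfd opcodes)` before starting the tool, so nothing needs to be installed (and therefore relinked) first.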


But why is libopcodes.la rebuilt when I run `make install` ?

To reproduce:
% ../binutils-gdb.git/configure --prefix=`pwd`/INSTALL --with-system-zlib --enable-shared --disable-{gdb,gdbserver,libbacktrace,libdecnumber,readline,sim}
% make -j 100
% make install
...
make[4]: Nothing to be done for 'install-exec-am'.
 /bin/mkdir -p '/data2/vmezents/bld/INSTALL/include'
 /bin/install -c -m 644 ../../binutils-gdb.git/opcodes/../include/dis-asm.h '/data2/vmezents/bld/INSTALL/include'
 /bin/mkdir -p '/data2/vmezents/bld/INSTALL/lib'
 /bin/sh ./libtool   --mode=install /bin/install -c   libopcodes.la '/data2/vmezents/bld/INSTALL/lib'
libtool: install: warning: relinking `libopcodes.la'
                  ^^^^^^^^^^^^^^^^^^
                  | Why ? There were no changes.


libtool: install: (cd /data2/vmezents/bld/opcodes; /bin/sh /data2/vmezents/bld/opcodes/libtool  --silent --tag CC --mode=relink gcc 
  -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wshadow 
  -Wstack-usage=262144 -Werror -g -O2 -release 2.39.50.20221006 
  -o libopcodes.la 
  -rpath /data2/vmezents/bld/INSTALL/lib dis-buf.lo disassemble.lo 
  dis-init.lo i386-dis.lo i386-opc.lo ../bfd/libbfd.la 
  -L/data2/vmezents/bld/opcodes/../libiberty/pic -liberty 
  -Wl,-lc,--as-needed,-lm,--no-as-needed )
....

We see libopcodes.la was rebuilt.
Comment 6 Xi Ruoyao 2022-10-06 09:07:39 UTC
(In reply to Vladimir Mezentsev from comment #5)
> This is a side-effect of our fake installation for gprofng testing:
> >>  # Make a temporary install dir to run gprofng from, and point at it
> >>  remote_exec host "sh -c \"rm -rf tmpdir; mkdir -p tmpdir; $MAKE -C .. install-gprofng 
> >>  program_transform_name= DESTDIR=`pwd`/tmpdir/root\""
> 
>  I will fix this and will not run '$MAKE install-gprofng'.
> 
> 
> But why is libopcodes.la rebuilt when I run `make install` ?

See section 4.4 "Install mode" of libtool info page.  I always think libtool is a stupid thing attempting to be *too* clever.
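
As far as I understand it, libtool records a relink_command in the .la file when a library was linked against libtool libraries that are not installed yet (here ../bfd/libbfd.la), and install mode reruns that command so the installed copy does not refer to the build tree. A minimal check for this, assuming the usual .la layout with a `relink_command="..."` line (`will_relink_on_install` is a made-up name):

```shell
# Succeeds if the given libtool archive records a non-empty relink_command,
# i.e. libtool intends to relink the library when it is installed.
will_relink_on_install() {
    grep -q '^relink_command="[^"]' "$1"
}
```

For example, `will_relink_on_install opcodes/libopcodes.la && echo "will be relinked"` run from the top of the build directory.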
Comment 7 Sourceware Commits 2022-10-11 07:18:09 UTC
The master branch has been updated by Vladimir Mezentsev <vmezents@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a665c4d5c6e1d23d31ac434949b9243025496aef

commit a665c4d5c6e1d23d31ac434949b9243025496aef
Author: Vladimir Mezentsev <vladimir.mezentsev@oracle.com>
Date:   Mon Oct 10 12:57:19 2022 -0700

    gprofng: run tests without installation
    
    gprofng/ChangeLog
    2022-10-10  Vladimir Mezentsev  <vladimir.mezentsev@oracle.com>
    
            PR gprofng/29107
            * testsuite/config/default.exp: Set up environment to run gprofng tests
            without installation.
            * testsuite/lib/Makefile.skel: Likewise.
            * testsuite/lib/display-lib.exp: Likewise.
Comment 8 Vladimir Mezentsev 2022-10-11 23:07:04 UTC
Update status as resolved/fixed.