Bug 29596 - Parallel "make check-sim" is broken (unexpectedly passes without actually testing)
Summary: Parallel "make check-sim" is broken (unexpectedly passes without actually tes...
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: sim (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-09-22 04:44 UTC by Tsukasa OI
Modified: 2022-10-24 06:32 UTC (History)
2 users (show)

See Also:
Host:
Target: * (confirmed on riscv64-unknown-elf and aarch64-unknown-elf)
Build:
Last reconfirmed:


Attachments
testrun.1.log: AArch64 "make check-sim" log before the patch (708 bytes, text/plain)
2022-09-22 04:44 UTC, Tsukasa OI
Details
testrun.2.log: AArch64 "make check-sim" log after the patch (but breaking a testcase to "successfully" fail) (1.79 KB, text/plain)
2022-09-22 04:46 UTC, Tsukasa OI
Details
testrun.3.log: AArch64 "make check-sim" log after the patch (with original testcases to pass correctly) (1.73 KB, text/plain)
2022-09-22 04:47 UTC, Tsukasa OI
Details
site.exp: Auto-generated simulator testsuite configuration (763 bytes, text/plain)
2022-09-22 04:48 UTC, Tsukasa OI
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tsukasa OI 2022-09-22 04:44:46 UTC
Created attachment 14344 [details]
testrun.1.log: AArch64 "make check-sim" log before the patch

While I'm testing whether PR29595 is fixed, I found another bug.  "make check-sim" doesn't work because it cannot detect a suitable assembler/compiler.

This is what I'm repeatedly pinging but there's no response.  At least, it's worth it to track here.

Patch:
<https://sourceware.org/pipermail/gdb-patches/2022-August/191564.html>
<https://sourceware.org/pipermail/gdb-patches/2022-September/191843.html> (ping 2)
<https://sourceware.org/pipermail/gdb-patches/2022-September/191997.html> (ping 3)

I first reproduced this on riscv64-unknown-elf but talking with aarch64-unknown-elf will be better since AArch64 has decent number of tests (in contrast to 1 from RISC-V).


[How to reproduce]

1.  Configure Binutils with aarch64-unknown-elf and build it
    /src/binutils/configure --target=aarch64-unknown-elf && make
2.  Run `make check-sim' and confirmed that the simulator tests "pass"
3.  Intentionally try to fail the test by modifying
    `sim/testsuite/aarch64/pass.s'
    (replace the last line from "pass" to "fail")
4.  Run `make check-sim' and "confirmed" that the simulator tests "pass"
    (it should fail!)


[Analysis]

... Yes, something is going wrong.
After the test, you can see the test log in `sim/testsuite/aarch64/allinsn/testrun.log' (example: attached testrun.1.log).
We clearly find that the test runner didn't find recently built assembler (gas/as-new).
That assembler (and the linker) is supposed to be used because the simulator itself (sim/Makefile) sets its configuration (see attached site.exp, generated by sim/Makefile).

At last, we find `sim/testsuite/lib/sim-defs.exp'.
In the `sim_init_toolchain' function, it extracts {AS,LD,CC}_FOR_TARGET_AARCH64 and sets proper {AS,LD,CC}_FOR_TARGET.
At least, it is supposed to do so.

However, this block doesn't work because the `arch' variable returned by the `sim_arch' function is "./aarch64".  That is supposed to be "aarch64".


[Fix: Cause and Patch Details]

`sim_arch' function is simple.  Until the dirname part of the "arch" is not
"." (current directory), it trims the filename part.

For instance, if subdir is "A/B/C/D", this function returns "A".
However, if subdir is "./A/B/C/D", this function returns "./A", not "A".
In fact, actual subdir value here is "./aarch64" and we will get "./aarch64" for [sim_arch].
As a result, it fails to extract proper assembler/linker/compiler for given target.

To deal with it, I added another "file tail" function call before returning.

After this patch, `sim/testsuite/aarch64/allinsn/testrun.log' will look like attached testrun.2.log.
We can confirm that the assembler is detected and "expectedly" fail (because we intentionally broke a testcase).

Of course, fixing a testcase will make a success (see attached testrun.3.log).


[My status]

I completed copyright assignment of GDB to FSF in August 2022 and I'm now even a "write after approval" committer.  If a maintainer gives me an approval, I can commit it in a day.
Comment 1 Tsukasa OI 2022-09-22 04:46:40 UTC
Created attachment 14345 [details]
testrun.2.log: AArch64 "make check-sim" log after the patch (but breaking a testcase to "successfully" fail)
Comment 2 Tsukasa OI 2022-09-22 04:47:23 UTC
Created attachment 14346 [details]
testrun.3.log: AArch64 "make check-sim" log after the patch (with original testcases to pass correctly)
Comment 3 Tsukasa OI 2022-09-22 04:48:00 UTC
Created attachment 14347 [details]
site.exp: Auto-generated simulator testsuite configuration
Comment 4 Tsukasa OI 2022-10-12 05:35:34 UTC
Result of "runtest --version" (on Ubuntu 22.04.1 LTS; x86_64) is:
> DejaGnu version 1.6.2
> Expect version  5.45.4
> Tcl version     8.6
Comment 5 Tsukasa OI 2022-10-16 13:00:02 UTC
I found possible cause of non-reproduction as reported by Andrew Burgess.
While a single-threaded run with:

    make -C sim check-DEJAGNU-single

resulted in the failure (correct behavior), a multi-threaded run with:

    make -C sim check-DEJAGNU-parallel

resulted in the unexpected success (as I reproduced).
Possible reason for this is because the rule "check-DEJAGNU-parallel" uses `find .' and runs runtest with pathes containing "./".

If you don't reproduce with `make check-sim', try forcing parallel test with `make -C sim check-DEJAGNU-parallel'.
Comment 6 Mike Frysinger 2022-10-23 20:22:13 UTC
there's a thread on the gdb mailing lists. there's no need to open a bug.  doing so won't get the issue more attention because the same people watch bugailla was watch the mailing list.  please keep the conversation in one place so we don't fork threads/discussions/investigations.
Comment 7 Tsukasa OI 2022-10-24 06:32:50 UTC
Commit 86ef36f655d13cd39ff573de079b35a142f8cf42 is confirmed working. Closing as resolved/fixed.