Bug 31575 - [gdb/external, linaro CI] progressions and regressions reported on unrelated commits
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: external
Version: HEAD
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
 
Reported: 2024-03-29 08:01 UTC by Tom de Vries
Modified: 2024-03-29 10:43 UTC

Description Tom de Vries 2024-03-29 08:01:05 UTC
I submitted a patch series with 3 test-case fixes here ( https://sourceware.org/pipermail/gdb-patches/2024-March/207598.html ).

Then I got the following message from the linaro CI:
...
Dear contributor, our automatic CI has detected problems related to your patch(es).  Please find some details below.  If you have any questions, please follow up on linaro-toolchain@lists.linaro.org mailing list, Libera's #linaro-tcwg channel, or ping your favourite Linaro toolchain developer on the usual project channel.

We appreciate that it might be difficult to find the necessary logs or reproduce the issue locally. If you can't get what you need from our CI within minutes, let us know and we will be happy to help.

In gdb_check master-arm after:

  | 2 patches in gdb
  | Patchwork URL: https://patchwork.sourceware.org/patch/87702
  | 0ba222c3fdd Fix missing return type in gdb.linespec/break-asm-file.c
  | fd739928eeb Add missing include in gdb.base/ctf-ptype.c
  | ... applied on top of baseline commit:
  | b58829cdeff x86/SSE2AVX: move checking

FAIL: 8 regressions: 16 progressions

regressions.sum:
		=== gdb tests ===

Running gdb:gdb.ada/verylong.exp ...
FAIL: gdb.ada/verylong.exp: print (x / 4) * 2
FAIL: gdb.ada/verylong.exp: print +x
FAIL: gdb.ada/verylong.exp: print -x
FAIL: gdb.ada/verylong.exp: print x
FAIL: gdb.ada/verylong.exp: print x - 99 + 1
FAIL: gdb.ada/verylong.exp: print x / 2
FAIL: gdb.ada/verylong.exp: print x = 170141183460469231731687303715884105727
... and 4 more entries

progressions.sum:
		=== gdb tests ===

Running gdb:gdb.ada/convvar_comp.exp ...
FAIL: gdb.ada/convvar_comp.exp: print $item.started
FAIL: gdb.ada/convvar_comp.exp: set variable $item := item
FAIL: gdb.ada/convvar_comp.exp: print item.started

Running gdb:gdb.ada/enum_idx_packed.exp ...
FAIL: gdb.ada/enum_idx_packed.exp: scenario=minimal: print multi'first
FAIL: gdb.ada/enum_idx_packed.exp: scenario=minimal: print multi_multi
... and 18 more entries

You can find the failure logs in *.log.1.xz files in
 - https://ci.linaro.org/job/tcwg_gdb_check--master-arm-precommit/2021/artifact/artifacts/artifacts.precommit/00-sumfiles/
The full lists of regressions and progressions as well as configure and make commands are in
 - https://ci.linaro.org/job/tcwg_gdb_check--master-arm-precommit/2021/artifact/artifacts/artifacts.precommit/notify/
The list of [ignored] baseline and flaky failures are in
 - https://ci.linaro.org/job/tcwg_gdb_check--master-arm-precommit/2021/artifact/artifacts/artifacts.precommit/sumfiles/xfails.xfail

The configuration of this build is:
CI config tcwg_gdb_check master-arm

-----------------8<--------------------------8<--------------------------8<--------------------------
The information below can be used to reproduce a debug environment:

Current build   : https://ci.linaro.org/job/tcwg_gdb_check--master-arm-precommit/2021/artifact/artifacts
Reference build : https://ci.linaro.org/job/tcwg_gdb_check--master-arm-build/983/artifact/artifacts

Warning: we do not enable maintainer-mode nor automatically update
generated files, which may lead to failures if the patch modifies the
master files.
...

Clearly the test-cases for which regressions and progressions are claimed have nothing to do with the submitted patches, which touch two unrelated .c files in the test suite.

I managed to reproduce the FAILs in gdb.ada/verylong.exp, and filed PR31574.
Comment 1 Tom de Vries 2024-03-29 08:08:43 UTC
Is there something that can be done about these false positives?
Comment 2 Maxim Kuvyrkov 2024-03-29 08:13:25 UTC
Hi Tom,

Yes, we have noticed the problem this morning.  Working on it.
Comment 3 Tom de Vries 2024-03-29 08:14:11 UTC
(In reply to Maxim Kuvyrkov from comment #2)
> Hi Tom,
> 
> Yes, we have noticed the problem this morning.  Working on it.

Thanks :)
Comment 4 Maxim Kuvyrkov 2024-03-29 08:29:49 UTC
This was caused by a transition side-effect.

We recently fixed the installation of the system gnat compiler, which enabled more GDB tests to run.  Unfortunately, the baseline run didn't have the fix, so the baseline results didn't include the additional tests (and additional FAILs!):
UNSUPPORTED: gdb.ada/verylong.exp: require failed: gnatmake_version_at_least 11

The pre-commit builds picked up the fix, enabled additional testsuites, and reported additional FAILs as regressions.

We'll monitor our CI today, and investigate further if we see more failures.
Comment 5 Tom de Vries 2024-03-29 10:43:54 UTC
(In reply to Maxim Kuvyrkov from comment #4)
> This was caused by a transition side-effect.
> 
> We have recently fixed install of system gnat compiler, which enabled more
> GDB tests to run.  Unfortunately, the baseline run didn't have the fix, so
> baseline results didn't have the additional tests (and additional FAILs!):
> UNSUPPORTED: gdb.ada/verylong.exp: require failed: gnatmake_version_at_least
> 11
> 
> The pre-commit builds picked up the fix, enabled additional testsuites, and
> reported additional FAILs as regressions.
> 
> We'll monitor our CI today, and investigate further if we see more failures.

Sounds like a mechanism is required to ensure that only comparable results are compared.
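
A minimal sketch of what such a mechanism could look like, assuming the comparison works on DejaGnu-style .sum result lines like the ones quoted above. This is not how the Linaro CI is implemented; the function names parse_sum and compare and the "not comparable" bucket are hypothetical. The idea is that a FAIL only counts as a regression if its .exp file actually ran in the baseline; FAILs from a file that was UNSUPPORTED (or absent) in the baseline are reported separately.

```python
import re

# Matches DejaGnu .sum result lines such as
#   "FAIL: gdb.ada/verylong.exp: print x"
RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL|UNSUPPORTED|UNTESTED|UNRESOLVED)"
    r": (\S+\.exp)(?:: (.*))?$"
)


def parse_sum(text):
    """Map (exp_file, test_name) -> status for every result line in a .sum file."""
    results = {}
    for line in text.splitlines():
        m = RESULT_RE.match(line)
        if m:
            status, exp_file, name = m.groups()
            results[(exp_file, name or "")] = status
    return results


def compare(baseline, current):
    """Split current FAILs into regressions and not-comparable results.

    Hypothetical policy: a .exp file is comparable only if the baseline
    has at least one result for it other than UNSUPPORTED/UNTESTED.
    """
    ran_in_baseline = {
        exp
        for (exp, _), status in baseline.items()
        if status not in ("UNSUPPORTED", "UNTESTED")
    }
    regressions, not_comparable = [], []
    for key, status in sorted(current.items()):
        if status != "FAIL":
            continue
        if key[0] not in ran_in_baseline:
            # The file did not run in the baseline, so there is nothing
            # meaningful to compare against.
            not_comparable.append(key)
        elif baseline.get(key) != "FAIL":
            regressions.append(key)
    return regressions, not_comparable
```

With a baseline that contains only the UNSUPPORTED line for gdb.ada/verylong.exp, the FAILs in that file would land in the not-comparable list instead of being reported as regressions against the patch under test.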