r-dejagnu-testcase-classify: add own version-insensitive vocabulary class
... essential, as saved Vocab objects cannot be reused across torchtext versions
... and might as well make it somewhat flexible for future tokens in future
training sessions
... and experimenting with a different tokenizing regexp, preserving larger classes of words intact
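A minimal sketch of what a version-insensitive vocabulary could look like, assuming plain-Python storage serialized as JSON. The class name SimpleVocab, the "<unk>" token and the tokenizing regexp are all hypothetical, not the ones used by r-dejagnu-testcase-classify.

```python
import json
import re

# Hypothetical tokenizer regexp: keeps dotted/slashed/dashed words intact.
TOKEN_RE = re.compile(r"[A-Za-z0-9_./+-]+")


def tokenize(text):
    return TOKEN_RE.findall(text)


class SimpleVocab:
    """Token <-> index map stored as plain JSON, independent of torchtext."""

    def __init__(self, tokens=(), unk="<unk>"):
        self.unk = unk
        self.itos = [unk]          # index -> token; slot 0 is the unknown token
        self.stoi = {unk: 0}       # token -> index
        for tok in tokens:
            self.add(tok)

    def add(self, token):
        # New tokens from later training sessions simply get the next index.
        if token not in self.stoi:
            self.stoi[token] = len(self.itos)
            self.itos.append(token)
        return self.stoi[token]

    def __getitem__(self, token):
        return self.stoi.get(token, 0)

    def dumps(self):
        return json.dumps({"unk": self.unk, "itos": self.itos})

    @classmethod
    def loads(cls, text):
        data = json.loads(text)
        return cls(data["itos"][1:], unk=data["unk"])
```

Because the saved form is only a JSON list, reloading it does not depend on any pickled library object.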
When the testruns underlying clusters are removed via aging or whatnot,
the clusters can stick around. (The doubly-linked record chain can simply
skip them, but no FK etc. constraint automatically erases them.) With
--update mode, clean up any that have lost all their members. This
also reduces the work spent on cluster entropy and similar computations.
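A rough sketch of that cleanup, assuming a hypothetical two-table schema (cluster, cluster_member); the real table and column names in the bunsen database may well differ.

```python
import sqlite3


def drop_empty_clusters(db):
    """Delete clusters that no longer have any member testruns.

    Table and column names here are illustrative assumptions.
    """
    cur = db.execute(
        "DELETE FROM cluster "
        "WHERE NOT EXISTS (SELECT 1 FROM cluster_member m "
        "                  WHERE m.cluster = cluster.id)")
    db.commit()
    return cur.rowcount  # number of clusters removed


def demo():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE cluster (id INTEGER PRIMARY KEY)")
    db.execute("CREATE TABLE cluster_member (cluster INTEGER, testrun INTEGER)")
    db.executemany("INSERT INTO cluster (id) VALUES (?)", [(1,), (2,), (3,)])
    # Cluster 2 has lost all of its members.
    db.executemany("INSERT INTO cluster_member VALUES (?, ?)", [(1, 10), (3, 30)])
    removed = drop_empty_clusters(db)
    remaining = [r[0] for r in db.execute("SELECT id FROM cluster ORDER BY id")]
    return removed, remaining
```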
Frank Ch. Eigler [Fri, 25 Nov 2022 15:00:52 +0000 (10:00 -0500)]
pipeline: drop the analyze
Individual analysis passes run optimize ops as they need to already.
The "analyze" op here has been observed to suck up cpu & I/O, despite
a pragma analysis_limit setting.
Frank Ch. Eigler [Thu, 24 Nov 2022 20:50:55 +0000 (15:50 -0500)]
g-testrun-clusterfinder: rework
Rework so this analysis pass preserves database records as much as
possible, modifying them in place rather than deleting & reinserting.
This dramatically reduces I/O during the routine update operation.
Frank Ch. Eigler [Mon, 14 Nov 2022 16:23:04 +0000 (11:23 -0500)]
pipeline: sort new_testruns
Operate on new testruns in young-to-old order, so that if a pipeline
run is partial (such as when running under timeout control), new content
will dominate.
Also: tweak database size report for KiB/MiB/GiB flexibility.
Also: tweak auto_vacuum setting sequence for effectiveness
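Roughly what the ordering and the size report amount to, as a sketch only; the function names and the (name, timestamp) tuple layout are assumptions for illustration.

```python
def newest_first(testruns):
    """Order (name, timestamp) pairs young-to-old."""
    return sorted(testruns, key=lambda tr: tr[1], reverse=True)


def format_size(nbytes):
    """Render a byte count in B, KiB, MiB or GiB, whichever fits best."""
    units = ["B", "KiB", "MiB", "GiB"]
    value = float(nbytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at GiB, the largest unit handled here.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024
```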
Frank Ch. Eigler [Thu, 10 Nov 2022 22:09:20 +0000 (17:09 -0500)]
i-testrun-indexer: move pragma auto_vacuum here
... because in the pipeline process where it previously was, it was
mistimed. This pragma apparently needs to be set in a sqlite client
process that inserts the first rows into a table.
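For illustration, a small sketch of the ordering that seems to matter: the SQLite documentation says auto_vacuum can only be switched on from "none" while the database is new (no tables yet) or via a later VACUUM. The table name below is made up.

```python
import os
import sqlite3
import tempfile


def create_with_auto_vacuum(path):
    """Create a fresh database with incremental auto_vacuum enabled."""
    db = sqlite3.connect(path)
    # Set the pragma before the first table exists; afterwards only a
    # full VACUUM can move a database from "none" to an auto_vacuum mode.
    db.execute("PRAGMA auto_vacuum = INCREMENTAL")
    db.execute("CREATE TABLE testrun (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO testrun (name) VALUES ('example')")
    db.commit()
    mode = db.execute("PRAGMA auto_vacuum").fetchone()[0]
    db.close()
    return mode  # 0 = none, 1 = full, 2 = incremental


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        return create_with_auto_vacuum(os.path.join(tmpdir, "example.sqlite"))
```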
Keith Seitz [Tue, 8 Nov 2022 20:34:41 +0000 (12:34 -0800)]
Fix several bugs in r-dejagnu-diff-logs
This patch fixes several bugs that I encountered while doing some usability
testing:
1. In diff_subtest_lines, the output filter used by the "diff" jinja template,
we run over the list of lines, adding newlines if necessary. If the log
file contains an empty line (simply '\n'), the following error would
result:
File "/usr/local/bin/r-dejagnu-diff-logs", line 56, in <listcomp>
lines2 = [t + '\n' if t[-1] != '\n' else t for t in subtest['commits'][1]['lines']]
~^^^^
IndexError: string index out of range
Since get_log_lines ALWAYS strips newlines, we can safely ALWAYS
add them in diff_subtest_lines.
2. If the user does not specify a subtest or specifies an invalid one,
the script would silently exit. Since all DejaGNU subtests are named
'testfile.exp: subtest', if we don't find a colon, we have malformed
input. Issue an error in this instance.
3. If the user specifies a subtest which is not found in either the
before (or as I call it, unpatched) or the after (patched) run,
issue an error informing them that the subtest does not exist.
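Fixes 1 and 2 can be sketched as follows. This is not the script's actual code; the helper names are hypothetical, and the first one relies on the statement above that get_log_lines always strips newlines.

```python
def with_newlines(lines):
    """Append a newline to every line.

    Unlike indexing t[-1], this is safe for empty strings. It assumes
    the input lines never carry a trailing newline already.
    """
    return [t + "\n" for t in lines]


def split_subtest(name):
    """Split 'testfile.exp: subtest' into its parts, rejecting malformed input."""
    expfile, sep, subtest = name.partition(":")
    if not sep:
        # No colon at all: this cannot be a DejaGNU subtest name.
        raise ValueError(f"malformed subtest name (no colon): {name!r}")
    return expfile.strip(), subtest.strip()
```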
Serhei Makarov [Tue, 18 Oct 2022 20:45:41 +0000 (16:45 -0400)]
R-show-testcases visual improvements round2
- use symmetric diffs for colour differentiation
- use papaya whip to identify fixed subtests
- highlight newly failing subtests in bold to avoid headache
XXX uglitude of the code is greatly increased, hence round3
will factor out some convenience functions. Since 'grid entry'
is an intelligible item in many conceivable reports, these
might even make it into the any-day-now library.
- FIX oops: first/last swapped
- FIX oops: count only fails
- stabilize sort order for configurations
- colour differentiation for changed/unchanged fail counts
- cluster linkage from the testrun metadata viewer now passes the
destination cluster via keyvalue type parameters instead of
enumerating (potentially very many) commitishes
- added a backup limiting facility to hide cluster-relative diff/regress
links that are likely to fail anyway
- also in the name of user not accidentally getting too much data,
dropped the "all testruns" link from top /testruns/ listing
- added a --has-keyvalue to r-diff-testruns, to identify cluster
to compare to given commitishes
- generalized --has-keyvalue[-like,-glob] to --has-keyvalue K V
[--has-keyvalue-op OPERATOR] for r-find-testruns, with = being
preferable for cluster direct navigation links from the web frontend,
so metacharacter quoting doesn't interfere (as it could with %/* in
metadata value); ditched the previous -like and -glob options, which is
a cli break; could bring them back if desirable
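A sketch of how such a K V option with a separate operator might be parsed. The operator list, the "=" default, and the key/value column names are assumptions, not taken from r-find-testruns.

```python
import argparse

# Hypothetical operator whitelist mapping CLI spelling to SQL operator.
OPERATORS = {"=": "=", "like": "LIKE", "glob": "GLOB"}


def build_parser():
    p = argparse.ArgumentParser()
    p.add_argument("--has-keyvalue", nargs=2, metavar=("K", "V"))
    p.add_argument("--has-keyvalue-op", choices=sorted(OPERATORS), default="=")
    return p


def keyvalue_clause(args):
    """Return (sql_fragment, params).

    The operator comes from the whitelist; key and value are bound as
    parameters, so % or * in a value is only special under LIKE or GLOB.
    """
    key, value = args.has_keyvalue
    op = OPERATORS[args.has_keyvalue_op]
    return f"key = ? AND value {op} ?", (key, value)
```

With the "=" operator a value such as '50%_done*' matches only itself, which is the property wanted for links generated by the web frontend.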
Based on experience working with the results for the previous SystemTap
release, the previous ui behaviour where you click anywhere in the
details view to hide the subtests was very suboptimal.
For example, when I wanted to pastebin / diff the contents of a cell,
I found myself trying and failing to copy/paste detail text,
then giving up and hunting for the text in the raw HTML. Not good.
Solved by introducing an explicit 'hide' link and making the table cell
itself only respond to clicks when the details are hidden.
This work enables follow-up patches to add other buttons/links to the
details view (e.g. diff, more).
Unbelievable number of ways to get this seemingly basic DOM edit wrong,
I know because I tried most of them :p
Serhei Makarov [Mon, 29 Aug 2022 21:13:11 +0000 (17:13 -0400)]
r-httpd-browse.in: switch the form to 'glob' instead of 'like'
TODO: A flipflop in the form could be used to set glob/like
options but that would be too complicated. Instead, the 'like'
querystrings are kept for compatibility purposes but not reflected in
the form.
Additional code is needed to make the 'earliest' and 'latest' columns
dynamic (with the earliest/latest tested version in each row)
rather than merely showing the first and last versions in the sequence
that may be empty.
Frank Ch. Eigler [Mon, 22 Aug 2022 22:31:41 +0000 (18:31 -0400)]
r-diff-testruns: add --regressions mode
This limits output to only pass->fail type transitions among the given
testruns. (autoconf config.log entries are not identified as
regressions at all.) Update r-httpd-browse to expose this function at
the testruns list (using maybe cryptic delta / nabla symbolism) and at
the testrun-diffs view as "switch to ...." alternation. Yeah it's a
bit ugly & inconsistent.
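The filtering idea, sketched under the simplifying assumption that each testrun is a dict of subtest name to outcome and that only literal PASS -> FAIL counts; the real script may treat more outcome types as pass-like or fail-like.

```python
def regressions(before, after):
    """Return subtests that were PASS in `before` and are FAIL in `after`."""
    return sorted(name for name, outcome in after.items()
                  if outcome == "FAIL" and before.get(name) == "PASS")


def demo():
    before = {"a.exp: one": "PASS", "a.exp: two": "FAIL", "a.exp: three": "PASS"}
    after = {"a.exp: one": "FAIL", "a.exp: two": "FAIL", "a.exp: three": "PASS",
             "a.exp: four": "FAIL"}  # new subtest, not a pass->fail transition
    return regressions(before, after)
```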
Keith Seitz [Thu, 18 Aug 2022 19:32:21 +0000 (12:32 -0700)]
r-dejagnu-summary: use argparse action 'store_true'
This script currently uses argparse.BooleanOptionalAction, but this
action was only recently introduced in Python 3.9. Older systems such as RHEL8
(with Python 3.6.8) throw an AttributeError with this.
Solve this by returning to 'store_true' which works everywhere.
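For reference, the portable form looks roughly like this; the --verbose flag is a made-up example, not one of the script's options.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # 'store_true' is available in every argparse release, whereas
    # argparse.BooleanOptionalAction needs Python 3.9 or newer.
    parser.add_argument("--verbose", action="store_true",
                        help="hypothetical flag for illustration")
    return parser
```

The trade-off is that 'store_true' does not generate a matching --no-verbose form.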
Keith Seitz [Mon, 15 Aug 2022 16:25:24 +0000 (09:25 -0700)]
r-dejagnu-diff-logs: fix patched-unpatched typo
There is a little typo in get_commits() which could (or will) cause the
user some confusion. The unpatched and patched commits are reversed
because of a typo!
Serhei Makarov [Tue, 9 Aug 2022 16:49:50 +0000 (12:49 -0400)]
WIP: benchmarking grid construction
With judicious use of the slice_testcases feature I can build
release-4.6..now in about 1:45 on my laptop. Previously I used
a beefy server with a big memory, so this could be interpreted
as an improvement.
Serhei Makarov [Mon, 8 Aug 2022 16:04:22 +0000 (12:04 -0400)]
WIP: bugfixes, add 'slicing' performance tradeoff
Since results are generated in alphabetical order of testcase,
the code doesn't strictly need to load *all* the testcase data
into memory at once. This was a performance issue with the
prototype branch version of this script.
This slows down report generation but allows it to be done on smaller
systems with memory constraints. Such as my laptop.
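The tradeoff can be sketched like so, assuming fixed-size slices over the sorted testcase names; iter_slices and load_slice are hypothetical names, and the actual slicing criterion may be different.

```python
def iter_slices(sorted_names, slice_size):
    """Yield consecutive slices of an alphabetically sorted testcase list."""
    for start in range(0, len(sorted_names), slice_size):
        yield sorted_names[start:start + slice_size]


def build_report(sorted_names, load_slice, slice_size=2):
    """Emit rows slice by slice so only one slice's data is in memory at a time.

    load_slice is a caller-supplied function returning {name: data} for
    just the names it is given.
    """
    rows = []
    for names in iter_slices(sorted_names, slice_size):
        data = load_slice(names)          # one (slower) load per slice
        rows.extend((name, data[name]) for name in names)
        del data                          # let the slice be reclaimed
    return rows
```

Smaller slices mean more load passes but a lower peak memory footprint.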
r-httpd-browse: soil from pure source executability, for a good cause
- r-httpd-browse is now autoconf'd, to give it an autoconf-time git-version info footer
->> now a "make" is required to get a runnable script, sorry :(
- added a baby css endpoint
- updated all tables to emit extra class="" attributes to make tables easier on the eyes
pipeline: add options to control engine selection/sequencing
[--i-engine ENGINE] and [--g-engine ENGINE] allow the user to control
which if any engines are run. This is useful in case an engine
evolves to extract more data (oh hai autoconflog), so we want to
--redo their runs only, but not run any other engines.
Serhei Makarov [Tue, 2 Aug 2022 18:36:17 +0000 (14:36 -0400)]
WIP early code: cache redundant Git lookups
Not caching the redundant git lookups for just a couple of refspecs
absolutely destroyed performance for building the version history.
Conversely, caching them un-destroys performance.
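One plausible way to get that effect, sketched with functools.lru_cache; the function names are invented here and the actual code may cache differently.

```python
import functools
import subprocess


def git_rev_parse(repo_dir, refspec):
    """Resolve a refspec to a commit hash by asking git (one process per call)."""
    out = subprocess.run(["git", "-C", repo_dir, "rev-parse", refspec],
                         check=True, stdout=subprocess.PIPE,
                         universal_newlines=True)
    return out.stdout.strip()


def make_cached_lookup(lookup):
    """Wrap an expensive lookup so each distinct argument tuple is computed once."""
    return functools.lru_cache(maxsize=None)(lookup)


# Repeated calls with the same (repo_dir, refspec) now skip the git process.
resolve_commit = make_cached_lookup(git_rev_parse)
```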
i-automake-parser: support sim/ relocation of test-suite.log
For the sim/ (sibling of gdb) project, there is a small automake
testsuite alongside the dejagnu one, but it moved the test-suite.log
trigger file somewhere farther than this engine expected. We now
accept trs/log files anywhere at or beneath the directory where the
test-suite.log file happens to be sitting.
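The relaxed search amounts to something like the following sketch; the function name is hypothetical and the engine's real matching rules may be stricter.

```python
import os


def find_trs_and_logs(test_suite_log):
    """List .trs/.log files at or beneath the directory holding test-suite.log."""
    top = os.path.dirname(os.path.abspath(test_suite_log))
    found = []
    for dirpath, _dirs, files in os.walk(top):
        for name in files:
            # Skip the trigger file itself; keep per-test .trs and .log files.
            if name.endswith((".trs", ".log")) and name != "test-suite.log":
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```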
Keith Seitz [Tue, 26 Jul 2022 16:04:07 +0000 (09:04 -0700)]
Add subtest log diff script
This patch introduces a new script, r-dejagnu-diff-logs, which will output
various styles of diffs for a given subtest between two commitishes.
I use this when doing regression testing analysis to quickly see how results
between test runs have changed. It is much faster and more convenient than
opening logfiles in a pager, for example.
Supported output templates are:
1) text
$ r-dejagnu-diff-logs unpatched patched 'gdb.base/included.exp: list integer'
commitish: unpatched: gdb.base/included.exp: list integer (FAIL)
< info source
< Current source file is ../../../src/gdb/testsuite/gdb.base/included.c
< Compilation directory is /home/keiths/work/gdb/branches/amd-v2-may-16/linux/gdb/testsuite
< Located in /home/keiths/work/gdb/branches/amd-v2-may-16/src/gdb/testsuite/gdb.base/included.c
< Contains 24 lines.
< Source language is c.
< Producer is clang version 14.0.0 (Fedora 14.0.0-1.fc36).
< Compiled with DWARF 5 debugging format.
< Includes preprocessor macro info.
< (gdb) list integer
< 18 #include "included.h"
< (gdb) FAIL: gdb.base/included.exp: list integer
commitish: patched: gdb.base/included.exp: list integer (PASS)
> info source
> Current source file is ../../../src/gdb/testsuite/gdb.base/included.c
> Compilation directory is /home/keiths/work/gdb/branches/amd-v2-may-16/linux/gdb/testsuite
> Located in /home/keiths/work/gdb/branches/amd-v2-may-16/src/gdb/testsuite/gdb.base/included.c
> Contains 24 lines.
> Source language is c.
> Producer is clang version 14.0.0 (Fedora 14.0.0-1.fc36).
> Compiled with DWARF 5 debugging format.
> Includes preprocessor macro info.
> (gdb) list integer
> 18 int integer;
> (gdb) PASS: gdb.base/included.exp: list integer
2) json
$ r-dejagnu-diff-logs --template json unpatched patched \
'gdb.base/included.exp: list integer' | jq
[
{
"commits": [
{
"commitish": "unpatched",
"lines": [
"info source",
"Current source file is ../../../src/gdb/testsuite/gdb.base/included.c",
"Compilation directory is /home/keiths/work/gdb/branches/amd-v2-may-16/linux/gdb/testsuite",
"Located in /home/keiths/work/gdb/branches/amd-v2-may-16/src/gdb/testsuite/gdb.base/included.c",
"Contains 24 lines.",
"Source language is c.",
"Producer is clang version 14.0.0 (Fedora 14.0.0-1.fc36).",
"Compiled with DWARF 5 debugging format.",
"Includes preprocessor macro info.",
"(gdb) list integer",
"18\t#include \"included.h\"",
"(gdb) FAIL: gdb.base/included.exp: list integer"
],
"outcome": "FAIL"
},
{
"commitish": "patched",
"lines": [
"info source",
"Current source file is ../../../src/gdb/testsuite/gdb.base/included.c",
"Compilation directory is /home/keiths/work/gdb/branches/amd-v2-may-16/linux/gdb/testsuite",
"Located in /home/keiths/work/gdb/branches/amd-v2-may-16/src/gdb/testsuite/gdb.base/included.c",
"Contains 24 lines.",
"Source language is c.",
"Producer is clang version 14.0.0 (Fedora 14.0.0-1.fc36).",
"Compiled with DWARF 5 debugging format.",
"Includes preprocessor macro info.",
"(gdb) list integer",
"18\tint integer;",
"(gdb) PASS: gdb.base/included.exp: list integer"
],
"outcome": "PASS"
}
],
"name": "gdb.base/included.exp: list integer"
}
]
3) diff
$ r-dejagnu-diff-logs --template diff unpatched patched \
'gdb.base/included.exp: list integer'
*** unpatched: gdb.base/included.exp: list integer
WIP early code: R-show-testcases matching +show-testcases from the old branch
For the time being, I'm mired in implementing the various version-range
selection options that came out of mcermak's use of +show_testcases.
Since the goal of this port is to ditch the current main branch
of Bunsen and promote fche/bunsenql to the new main branch prior
to Cauldron, this work isn't really skippable anyways.
The lack of library facilities is annoying but progress is ongoing.
Pushing 'pastebin' quality early code is also annoying but some
visibility is needed into this work by other people.
g-testrun-clusterfinder: admit defeat against sqlite foreign keys
The testrun_cluster rows form a doubly linked list via the next/prev columns.
During a mass delete, this means O(N^2) or worse processing time. Drop the
foreign key designations and do a more blunt mass delete at the beginning of
the --redo/--update operation. Preexisting bunsen databases should drop
the testrun_cluster table (one time) to migrate.