Serhei Makarov [Tue, 9 Aug 2022 16:49:50 +0000 (12:49 -0400)]
WIP: benchmarking grid construction
With judicious use of the slice_testcases feature I can build
release-4.6..now in about 1:45 on my laptop. Previously I used
a beefy server with lots of memory, so this could be interpreted
as an improvement.
Serhei Makarov [Mon, 8 Aug 2022 16:04:22 +0000 (12:04 -0400)]
WIP: bugfixes, add 'slicing' performance tradeoff
Since results are generated in alphabetical order of testcase,
the code doesn't strictly need to load *all* the testcase data
into memory at once. This was a performance issue with the
prototype branch version of this script.
This slows down report generation but allows it to be done on smaller
systems with memory constraints, such as my laptop.
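A minimal sketch of the slicing idea, not the script's actual code. The
names slice_testcases, build_report and load_results are hypothetical, and
the loader stands in for whatever query fetches results for a batch of
testcases. Because rows are emitted in alphabetical testcase order, only
one slice of data has to be held in memory at a time:

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def slice_testcases(names: Iterable[str], slice_size: int) -> Iterator[List[str]]:
    """Yield alphabetically ordered batches of at most slice_size names."""
    ordered = sorted(names)
    for start in range(0, len(ordered), slice_size):
        yield ordered[start:start + slice_size]


def build_report(
    names: Iterable[str],
    load_results: Callable[[List[str]], Dict[str, str]],  # hypothetical loader
    slice_size: int = 2,
) -> List[Tuple[str, str]]:
    """Emit one row per testcase while loading one slice of data at a time."""
    rows = []
    for batch in slice_testcases(names, slice_size):
        data = load_results(batch)  # only this slice is resident in memory
        for name in batch:
            rows.append((name, data[name]))
        del data  # drop the slice before loading the next one
    return rows
```

A smaller slice_size lowers peak memory but means more separate loads,
which is the slowdown mentioned above.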
Serhei Makarov [Tue, 2 Aug 2022 18:36:17 +0000 (14:36 -0400)]
WIP early code: cache redundant Git lookups
Not caching the redundant git lookups for just a couple of refspecs
absolutely destroyed performance for building the version history.
Conversely, caching them un-destroys performance.
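One way such a cache could look, as a sketch only. The function name
resolve_refspec is hypothetical and the commit's real code may cache
differently. The sketch memoizes "git rev-parse" so that each
(repository, refspec) pair costs a single subprocess call:

```python
import subprocess
from functools import lru_cache


@lru_cache(maxsize=None)
def resolve_refspec(repo_path: str, refspec: str) -> str:
    """Resolve a refspec to a commit hash, at most once per (repo, refspec)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "--verify", refspec],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Repeated calls with the same arguments are answered from the cache. The
cache would need clearing (resolve_refspec.cache_clear()) if a branch or
tag moves while the script is running.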
WIP early code: R-show-testcases matching +show-testcases from the old branch
For the time being, I'm mired in implementing the various version-range
selection options that came out of mcermak's use of +show_testcases.
Since the goal of this port is to ditch the current main branch
of Bunsen and promote fche/bunsenql to the new main branch prior
to Cauldron, this work isn't really skippable anyways.
The lack of library facilities is annoying but progress is ongoing.
Pushing 'pastebin' quality early code is also annoying but some
visibility is needed into this work by other people.
g-testrun-clusterfinder: admit defeat against sqlite foreign keys
The testrun_cluster rows form a doubly linked list via the next/prev columns.
During a mass delete, this means O(N^2) or worse processing time. Drop the
foreign key designations and do a more blunt mass delete at the beginning of
the --redo/--update operation. Preexisting bunsen databases should drop
the testrun_cluster table (one time) to migrate.
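A rough sketch of the "blunt" approach, under assumptions: only the table
name and the next/prev columns come from the message above, while the id
column and the reset_clusters helper are hypothetical. The link columns are
plain integers with no REFERENCES clause, so emptying the table is one
statement with no per-row constraint checks:

```python
import sqlite3


def reset_clusters(db: sqlite3.Connection) -> None:
    """Create the table if needed, then empty it in a single statement."""
    db.execute(
        """
        CREATE TABLE IF NOT EXISTS testrun_cluster (
            id     INTEGER PRIMARY KEY,  -- hypothetical key column
            "prev" INTEGER,              -- plain integer, no foreign key
            "next" INTEGER               -- plain integer, no foreign key
        )
        """
    )
    # Mass delete up front, before the clusters are rebuilt.
    db.execute("DELETE FROM testrun_cluster")
    db.commit()
```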
Keith Seitz [Thu, 30 Jun 2022 16:19:58 +0000 (12:19 -0400)]
bunsenql: Add r-dejagnu-summary
This patch adds a bunsenql equivalent of my previous "bunsen +summarize"
script. This can be used to output DejaGNU-like summaries for tests.
This script supports limiting tests based on glob expression (although
it does not support multiple glob expressions like +summarize) and
"verbose" output which will output all (sub)test results. It supports both
text and JSON output templates.
The script does not use any of the pipeline-created summaries. It computes
results directly from the data. This is useful to verify that results were
properly imported and could be used to compare against the DejaGNU
.sum file's summary section.
Frank Ch. Eigler [Tue, 28 Jun 2022 20:15:08 +0000 (16:15 -0400)]
g-dejagnu-cluster-entropy rework
Change logic so that freshness is tracked in a separate new table.
This allows full caching even if no cluster/expfile data would be
computed (due to no expfiles in a particular cluster, e.g.), and
happens to reduce storage requirements too from the previous v2.
The previous entropy payload schema is restored.
Frank Ch. Eigler [Thu, 23 Jun 2022 23:12:02 +0000 (19:12 -0400)]
pipeline: disable entropy calculation again
On rhel8's old sqlite 3.26, a pretty pessimal query evaluation
strategy is chosen to run the inner query, and it slows things down by
orders of magnitude compared to f35's sqlite 3.36. Disable again
for a while.
Frank Ch. Eigler [Thu, 23 Jun 2022 22:21:17 +0000 (18:21 -0400)]
omnibus g-engine changes
- add "--update" option to g-* engines to refresh rather than start over, if possible
- pipeline uses --update for them
- pipeline process exits with rc != 0 in case of engine/etc. errors
- g-dejagnu-cluster-entropy: API BREAK table schema change, add a cluster membership-hash
to output table, to know if elements need to be recomputed (if clusterfinder changed
members); saves 85% time for incremental work situations; this engine could hypothetically also
join the i-* series, being given only the new testrun hashes to focus on, except
that it consumes the clusterfinder's output
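The membership-hash idea can be illustrated roughly as follows. This is a
sketch under assumptions: the function names, the choice of SHA-256 and the
dict-based inputs are all hypothetical and do not reflect the engine's
actual schema. A cluster is recomputed only when the hash of its current
members differs from the one stored with the previous output:

```python
import hashlib
from typing import Dict, Iterable, List


def membership_hash(testrun_ids: Iterable[str]) -> str:
    """Order-independent digest of a cluster's member testrun ids."""
    digest = hashlib.sha256()
    for testrun_id in sorted(testrun_ids):
        digest.update(testrun_id.encode("utf-8"))
        digest.update(b"\0")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()


def clusters_to_recompute(
    current: Dict[str, List[str]],   # cluster id -> current member ids
    stored_hashes: Dict[str, str],   # cluster id -> hash saved with old output
) -> List[str]:
    """Return the cluster ids whose membership changed or was never hashed."""
    stale = []
    for cluster_id, members in current.items():
        if stored_hashes.get(cluster_id) != membership_hash(members):
            stale.append(cluster_id)
    return stale
```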
Frank Ch. Eigler [Mon, 20 Jun 2022 23:43:35 +0000 (19:43 -0400)]
r-httpd-browse: add cluster navigation to testrun view
Add an arrow and a delta (plus a count) for metadata-induced
clustering, for prev, this, and next clusters. This allows the user
to navigate between and within testrun clusters, and also run
testsuite-diff operations between the current testrun and related
clusters of testruns.
For better or for worse, some of these clusters can be quite large,
which means the diff operations can be quite expensive & produce
large results. Some throttling limits will be needed shortly.
Serhei Makarov [Tue, 14 Jun 2022 15:38:02 +0000 (11:38 -0400)]
r-httpd-browse icebreaker: report empty sets of things
A minor thing, but this was a factor in my confusion when I ran
r-httpd-browse on an old sqlite db with keyval authored_day
and got an empty query that was trying to sort on testrun.authored.day.
As keiths & serhei have long ago discovered and we rediscovered now,
make -j check
type dejagnu runs can regularly scramble .sum / .log file segments so
they no longer parallel. One offender is gcc's
contrib/dg-extract-results.sh that assembles .log/.sum files from
parallel-executed pieces .... and proceeds to SORT rows within .sum
file .exp segments (only). Why? Why? WHYYYYYY?! y tho
That resulted in many logfile cursor = NULL results. Tweak
i-dejagnu-parser to tolerate this case better by pre-parsing segments
of the .exp file loglines. The logic here may break if some wacky
tool reorders the sequence of .exp files, but this has not been
observed.
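A sketch of what pre-parsing by .exp segment might look like. It is not the
parser's actual code. The helper names are hypothetical, and it assumes the
log marks each segment with a DejaGnu-style "Running <path>.exp ..." line.
Indexing log lines per segment lets a lookup succeed even when the rows
inside a segment were sorted into a different order:

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed segment marker, e.g. "Running /path/to/foo.exp ..."
RUNNING_RE = re.compile(r"^Running (\S+\.exp) \.\.\.")


def index_exp_segments(log_lines: List[str]) -> Dict[str, List[Tuple[int, str]]]:
    """Map each .exp file to the (line number, text) pairs in its segment."""
    segments: Dict[str, List[Tuple[int, str]]] = {}
    current = None
    for lineno, line in enumerate(log_lines):
        match = RUNNING_RE.match(line)
        if match:
            current = match.group(1)
            segments.setdefault(current, [])
        elif current is not None:
            segments[current].append((lineno, line))
    return segments


def find_in_segment(segments, expfile: str, needle: str) -> Optional[int]:
    """Return the first line number within expfile's segment containing needle."""
    for lineno, line in segments.get(expfile, []):
        if needle in line:
            return lineno
    return None
```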
As keiths has long ago discovered and we rediscovered now,
make -j check
type dejagnu runs can regularly scramble .sum / .log file segments so
they no longer parallel. That resulted in many logfile cursor = NULL
results. Tweak i-dejagnu-parser to tolerate this case better by
being willing to restart a search for a .log file snippet from the top of
the file.
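The restart-from-the-top behaviour can be sketched like this. The function
name find_snippet and the list-of-lines input are hypothetical, not the
parser's real interface. The search first runs forward from the current
cursor and, if that fails, wraps around to cover the lines before it:

```python
from typing import List, Optional


def find_snippet(log_lines: List[str], needle: str, cursor: int) -> Optional[int]:
    """Find needle at or after cursor, else retry from the top of the file."""
    for i in range(cursor, len(log_lines)):
        if needle in log_lines[i]:
            return i
    # Not found ahead of the cursor: the segments may have been scrambled,
    # so search the part of the file that was already passed.
    for i in range(0, min(cursor, len(log_lines))):
        if needle in log_lines[i]:
            return i
    return None  # genuinely absent; the caller records a NULL cursor
```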
- add a predicate-filtering frontend for testrun searches
- add tests
- coincidentally (sorry), do some toolshedding cleanup on i-testrun-indexer
metadata extraction; add back testrun.git_describe for the nickname