Frank Ch. Eigler [Tue, 21 Feb 2023 12:57:51 +0000 (07:57 -0500)]
r-dejagnu-testcase-classify: add a nonlinear layer for better training
Also, detect training of such excess that any nn weights go NaN, and
reject saving ... OTOH by that time it's too late, the network is
damaged (has weights in the 1e20 or somesuch crazy range) and cannot
be trained further. In loglevel=debug mode, also print some basic nn
parameter statistics, so as to track value range evolution.
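
A minimal sketch of the weight check and the debug statistics, in plain Python rather than against the real model. It assumes the weights have already been flattened into a list of floats; the names weights_are_sane, weight_stats and save_if_sane are hypothetical, and the check uses math.isfinite, which rejects inf as well as NaN.

```python
import math

def weights_are_sane(weights):
    """Return True only if every weight is finite (no NaN, no inf)."""
    return all(math.isfinite(w) for w in weights)

def weight_stats(weights):
    """Basic value-range statistics, for tracking drift between epochs."""
    mean = sum(weights) / len(weights)
    return {"min": min(weights), "max": max(weights), "mean": mean}

def save_if_sane(weights, save):
    """Call save(weights) unless a non-finite value is present.

    Returns True if the model was saved, False if it was rejected.
    """
    if not weights_are_sane(weights):
        return False
    save(weights)
    return True
```

Logging weight_stats once per epoch would show the value range growing well before any weight actually becomes NaN.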
Frank Ch. Eigler [Sat, 11 Feb 2023 15:19:26 +0000 (10:19 -0500)]
r-dejagnu-testcase-classify: robustify for large jobs
- reject loading or saving models where NaNs pop up in the weights;
it's unknown what sometimes causes this, but at least the disk model
files should be immune
- add a batching option so that the testruns processed in any given
epoch may be split into smaller batches for loading & training;
this is necessary to manage gpu/cpu ram usage
- tweak exception catching to respond rapidly to single ^Cs, whenever
Frank Ch. Eigler [Fri, 10 Feb 2023 18:26:06 +0000 (13:26 -0500)]
r-httpd-browse: for chrome browsers with javascript on, make top level /testruns/ page include ALL
Chrome (and not firefox) seems able to handle the testrun table with
O(100000) entries as on sourceware, with a bit of css
display:none->table switcharoo. This lets the user page through the
entire list, sort & search, through the newfangled jstable mechanism.
Non-javascript and non-chrome users still default to 1000 rows at a
time. (Use querystring &limit=0 to unlimit them manually.)
add data/ directory, with files for installation under $pkgdatadir
Populate it with auxiliary assets for r-httpd-browse. Most notably: a
copy of the distributables from version 1.6.3 of JSTable
https://jstable.github.io/, which is an MIT-licensed javascript library
for adding sort/search gadgets to html tables.
r-dejagnu-testcase-classify: add own version-insensitive vocabulary class
... essential, as saved Vocab objects cannot be reused across torchtext versions
... and might as well make it somewhat flexible for future tokens in future
training sessions
... and experimenting with a different tokenizing regexp, preserving larger classes of words intact
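
A minimal sketch of what a version-insensitive vocabulary could look like, assuming it is persisted as plain JSON rather than as a pickled torchtext object. The class name SimpleVocab, the token regexp and the <unk> convention are all hypothetical and not taken from r-dejagnu-testcase-classify.

```python
import json
import re

# Hypothetical tokenizer: keeps dotted/slashed words such as
# "gdb.base/foo.exp" intact instead of splitting on punctuation.
TOKEN_RE = re.compile(r"[A-Za-z0-9_./+-]+")

class SimpleVocab:
    """Token <-> index mapping stored as plain JSON, independent of torchtext."""

    def __init__(self, tokens=("<unk>",)):
        self.itos = list(tokens)
        self.stoi = {t: i for i, t in enumerate(self.itos)}

    def add(self, token):
        """Append a previously unseen token; existing indices never change."""
        if token not in self.stoi:
            self.stoi[token] = len(self.itos)
            self.itos.append(token)
        return self.stoi[token]

    def encode(self, text):
        """Map text to indices, falling back to <unk> (index 0)."""
        return [self.stoi.get(t, 0) for t in TOKEN_RE.findall(text)]

    def dumps(self):
        """Serialize the vocabulary as a JSON list of tokens."""
        return json.dumps(self.itos)

    @classmethod
    def loads(cls, data):
        """Rebuild a vocabulary from the output of dumps()."""
        return cls(json.loads(data))
```

Because add() only ever appends, tokens introduced in a later training session keep the indices of earlier ones stable.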
When testruns underlying clusters are removed via aging or whatnot,
clusters can stick around. (The doubly-linked record chain can simply
skip them, but no FK etc. constraint automatically erases them.) With
--update mode, clean up any that have lost all their members. This
reduces cluster entropy etc. work also.
Frank Ch. Eigler [Fri, 25 Nov 2022 15:00:52 +0000 (10:00 -0500)]
pipeline: drop the analyze
Individual analysis passes run optimize ops as they need to already.
The "analyze" op here has been observed to suck up cpu & I/O, despite
a pragma analysis_limit setting.
Frank Ch. Eigler [Thu, 24 Nov 2022 20:50:55 +0000 (15:50 -0500)]
g-testrun-clusterfinder: rework
Rework so this analysis pass preserves database records as much as
possible, modifying them in place rather than deleting & reinserting.
This dramatically reduces I/O during the routine update operation.
Frank Ch. Eigler [Mon, 14 Nov 2022 16:23:04 +0000 (11:23 -0500)]
pipeline: sort new_testruns
Operate on new testruns in young-to-old order, so that if a pipeline
run is partial (such as when running under timeout control), new content
will dominate.
Also: tweak database size report for KiB/MiB/GiB flexibility.
Also: tweak auto_vacuum setting sequence for effectiveness
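
A minimal sketch of a size report with KiB/MiB/GiB flexibility, as mentioned above. The function name format_size and the one-decimal output format are assumptions, not the pipeline's actual code.

```python
def format_size(num_bytes):
    """Render a byte count using the largest binary unit up to GiB."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop at the first unit where the value fits, or at GiB regardless.
        if value < 1024 or unit == "GiB":
            return f"{value:.1f} {unit}"
        value /= 1024
```

For example, format_size(2048) yields "2.0 KiB".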
Frank Ch. Eigler [Thu, 10 Nov 2022 22:09:20 +0000 (17:09 -0500)]
i-testrun-indexer: move pragma auto_vacuum here
... because in the pipeline process where it previously was, it was
mistimed. This pragma apparently needs to be set in a sqlite client
process that inserts the first rows into a table.
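
A minimal sketch of the ordering issue, using Python's sqlite3 module. SQLite only switches a database from auto_vacuum "none" to "full" or "incremental" while it has no tables yet (or via a later VACUUM), so the pragma goes first. The table name, file name and helper name here are hypothetical.

```python
import os
import sqlite3
import tempfile

def create_db_with_auto_vacuum(path):
    """Set auto_vacuum on a brand-new database before the first table exists."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA auto_vacuum = INCREMENTAL")  # must precede table creation
    conn.execute("CREATE TABLE testrun (id INTEGER PRIMARY KEY, name TEXT)")  # hypothetical table
    conn.execute("INSERT INTO testrun (name) VALUES ('example')")
    conn.commit()
    return conn

with tempfile.TemporaryDirectory() as tmp:
    conn = create_db_with_auto_vacuum(os.path.join(tmp, "demo.sqlite"))
    # Read the mode back: 0 = none, 1 = full, 2 = incremental.
    mode = conn.execute("PRAGMA auto_vacuum").fetchone()[0]
    conn.close()
```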
Keith Seitz [Tue, 8 Nov 2022 20:34:41 +0000 (12:34 -0800)]
Fix several bugs in r-dejagnu-diff-logs
This patch fixes several bugs that I encountered while doing some usability
testing:
1. In diff_subtest_lines, the output filter used by the "diff" jinja template,
we run over the list of lines, adding newlines if necessary. If the log
file contains an empty line (simply '\n'), the following error would
result:
File "/usr/local/bin/r-dejagnu-diff-logs", line 56, in <listcomp>
lines2 = [t + '\n' if t[-1] != '\n' else t for t in subtest['commits'][1]['lines']]
~^^^^
IndexError: string index out of range
Since get_log_lines ALWAYS strips newlines, we can safely ALWAYS
add them in diff_subtest_lines.
2. If the user does not specify a subtest or specifies an invalid one,
the script would silently exit. Since all DejaGNU subtests are named
'testfile.exp: subtest', if we don't find a colon, we have malformed
input. Issue an error in this instance.
3. If the user specifies a subtest which is not found in either the
before (or, as I call it, unpatched) or after (patched) runs,
issue an error informing them that the subtest does not exist.
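
The three fixes above can be sketched roughly as follows. This is an illustration only, not the patch itself: the helper names with_newlines, split_subtest and require_subtest are hypothetical, and the real script reports errors in its own way rather than necessarily raising these exceptions.

```python
def with_newlines(lines):
    """Append a newline to every line.

    Safe for empty strings, unlike a check that indexes t[-1].
    """
    return [t + "\n" for t in lines]

def split_subtest(name):
    """Split 'testfile.exp: subtest' into two parts; raise on malformed input."""
    if ":" not in name:
        raise ValueError(
            f"malformed subtest name (expected 'testfile.exp: subtest'): {name!r}")
    expfile, subtest = name.split(":", 1)
    return expfile.strip(), subtest.strip()

def require_subtest(name, before, after):
    """Raise if the subtest is in neither the unpatched nor the patched run."""
    if name not in before and name not in after:
        raise KeyError(f"subtest not found in either run: {name}")
```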
Serhei Makarov [Tue, 18 Oct 2022 20:45:41 +0000 (16:45 -0400)]
R-show-testcases visual improvements round2
- use symmetric diffs for colour differentiation
- use papaya whip to identify fixed subtests
- highlight newly failing subtests in bold to avoid headache
XXX uglitude of the code is greatly increased, hence round3
will factor out some convenience functions. Since 'grid entry'
is an intelligible item in many conceivable reports, these
might even make it into the any-day-now library.
- FIX oops: first/last swapped
- FIX oops: count only fails
- stabilize sort order for configurations
- colour differentiation for changed/unchanged fail counts
- cluster linkage from the testrun metadata viewer now passes the
destination cluster via keyvalue type parameters instead of
enumerating (potentially very many) commitishes
- added a backup limiting facility to hide cluster-relative diff/regress
links that are likely to fail anyway
- also in the name of user not accidentally getting too much data,
dropped the "all testruns" link from top /testruns/ listing
- added a --has-keyvalue to r-diff-testruns, to identify cluster
to compare to given committishes
- generalized --has-keyvalue[-like,-glob] to --has-keyvalue K V
[--has-keyvalue-op OPERATOR] for r-find-testruns, with = being
preferable for cluster direct navigation links from the web frontend,
so metacharacter quoting doesn't interfere (as it could with %/* in
metadata value); ditched the previous -like and -glob options, which is
a cli break; could bring them back if desirable
Based on experience working with the results for the previous SystemTap
release, the previous ui behaviour where you click anywhere in the
details view to hide the subtests was very suboptimal.
For example, when I wanted to pastebin / diff the contents of a cell,
I found myself trying and failing to copy/paste detail text,
then giving up and hunting for the text in the raw HTML. Not good.
Solved by introducing an explicit 'hide' link and making the table cell
itself only respond to clicks when the details are hidden.
This work enables follow-up patches to add other buttons/links to the
details view (e.g. diff, more).
Unbelievable number of ways to get this seemingly basic DOM edit wrong,
I know because I tried most of them :p
Serhei Makarov [Mon, 29 Aug 2022 21:13:11 +0000 (17:13 -0400)]
r-httpd-browse.in: switch the form to 'glob' instead of 'like'
TODO: A flipflop in the form could be used to set glob/like
options but that would be too complicated. Instead, the 'like'
querystrings are kept for compatibility purposes but not reflected in
the form.
Additional code is needed to make the 'earliest' and 'latest' columns
dynamic (with the earliest/latest tested version in each row)
rather than merely showing the first and last versions in the sequence
that may be empty.
Frank Ch. Eigler [Mon, 22 Aug 2022 22:31:41 +0000 (18:31 -0400)]
r-diff-testruns: add --regressions mode
This limits output to only pass->fail type transitions among the given
testruns. (autoconf config.log entries are not identified as
regressions at all.) Update r-httpd-browse to expose this function at
the testruns list (using maybe cryptic delta / nabla symbolism) and at
the testrun-diffs view as "switch to ...." alternation. Yeah it's a
bit ugly & inconsistent.
Keith Seitz [Thu, 18 Aug 2022 19:32:21 +0000 (12:32 -0700)]
r-dejagnu-summary: use argparse action 'store_true'
This script currently uses argparse.BooleanOptionalAction, but this
action was only recently introduced in Python 3.9. Older systems such as RHEL8
(with Python 3.6.8) throw an AttributeError with this.
Solve this by returning to 'store_true' which works everywhere.
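
A minimal sketch of the portable form. The --verbose flag and the build_parser helper are hypothetical, not the script's actual option.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="hypothetical example")
    # 'store_true' works on old Python 3 releases, unlike
    # argparse.BooleanOptionalAction, which requires Python 3.9 or later.
    parser.add_argument("--verbose", action="store_true",
                        help="hypothetical flag; defaults to False")
    return parser

args_on = build_parser().parse_args(["--verbose"])
args_off = build_parser().parse_args([])
```

The trade-off is that 'store_true' does not generate a matching --no-verbose option.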
Keith Seitz [Mon, 15 Aug 2022 16:25:24 +0000 (09:25 -0700)]
r-dejagnu-diff-logs: fix patched-unpatched typo
There is a little typo in get_commits() which could (or will) cause the
user some confusion. The unpatched and patched commits are reversed
because of a typo!