Are big tests allowed in binutils?

Tue Jun 21 15:32:15 GMT 2022

On 17/06/2022 06:11, Dmitry Selyutin via Binutils wrote:
> Hi folks,
> 
> I'm going to submit a series of patches which add a couple of new
> instructions available as SVP64 extensions (under -mlibresoc switch).
> Since I want to be a good citizen, I also want to introduce tests for
> each instruction, basically comparing the expected results for dumps
> (aka run_dump_test). However, I'd like to cover various possible
> cases, so I generated a test to cover many different operands; this
> makes tests quite big. For example, the assembly listing for a new
> svremap instruction takes 10240 lines, and etalon dump takes a bit
> more lines. So, questions:
> 1. Is it OK at all to have tests that large? Time-wise they're fast,
> I'm mostly concerned about code base size and readers' patience.
> 2. Will you tolerate it if I post these tests in the mailing list?
> These will be a major headache for any reader if integrated into the
> patches.
> 3. If it's OK to post these tests into the mailing lists, is it
> allowed to post them as separate patches from the ones which add the
> instructions?
> 
> My preferred option would be to split each patch into two parts:
> 1. The patch which adds the instruction itself.
> 2. The patch which adds the test for this instruction.
> 
> However, looking at patches in PowerPC, I see that the tests generally
> come along with the implementation in scope of one patch. I'm afraid
> this might make patches barely readable.
> If needed, I can cut the size of tests; time-wise it's fast, 10420
> lines of asm take 0,04s of real time, I'm only concerned about
> readability.
> 
> I'd like to know your opinion on this. Thank you!
> 

Good testing is about being smart.  Tests get rerun many times and 
although each run is not too expensive, over time it adds up.

Furthermore, most instructions in an architecture often follow standard 
'shapes' in order to make decoding fast for the hardware.  So registers 
numbers are often encoded in consecutive bits at well-known places 
within the instruction.  You can use this knowledge to significantly 
reduce the number of tests that need to be written.

For example, if you have an instruction FOO that takes two 4-bit 
register numbers (a 16-register architecture), and register r0 is all 
bits zero and r15 is all bits one, you might have a test

	FOO r0, r0
	FOO r0, r15
	FOO r15, r0

These three tests would first check that the base opcode for FOO is 
correct, the second that the register bits for the second register are 
all encoded into the right spot and the third that the register bits for 
the first register are placed correctly.  That's three tests rather than 
256 that would be needed to test all register permutations.

For a select number of instructions you might check all register numbers 
are correctly encoded, but it shouldn't be necessary to do that for 
every instruction - the code that inserts the bits is probably common to 
all related instructions and you'd just be repeatedly testing that with 
no added value.

The same is also true for literal values encoded in instructions: only a 
few tests are likely to stress the boundaries of your parsing and 
encoding, so focus on testing those boundaries.

This obviously won't work for every instruction.  Sometimes there are 
encoding restrictions, for example you can't both read and write a 
register in the same instruction.  But most of the time careful choice 
of what you test can significantly reduce the size of the testsuite and 
that in turn can make tracking down a problem much easier when you do 
get a regression.

R.