[PATCH] string: Add page tests for test-strncasecmp and test-strncpy
Raphael M Zinsly
rzinsly@linux.ibm.com
Mon Jun 1 18:53:35 GMT 2020
Thanks for the review, it helped me find some bugs, I'll fix it and add
more comments to the code:
On 29/05/2020 18:13, Paul A. Clarke wrote:
> On Fri, May 29, 2020 at 10:53:05AM -0300, Raphael Moreira Zinsly via Libc-alpha wrote:
>> Adds tests to check if strings placed at page bondaries are
>> handled correctly by strncasecmp and strncpy similar to tests
>> for strncmp and strnlen. This should catch regressions in the
>> optmized functions.
>> ---
> [snip]
>> diff --git a/string/test-strncasecmp.c b/string/test-strncasecmp.c
>> index 6a9c27beae..0399f37117 100644
>> --- a/string/test-strncasecmp.c
>> +++ b/string/test-strncasecmp.c
>> @@ -137,6 +137,44 @@ do_test (size_t align1, size_t align2, size_t n, size_t len, int max_char,
>
> These tests could use some comments, since there seems to be some
> arbitrary choices, making it difficult to determine what is being
> tested...
>
>> +static void
>> +do_page_tests (void)
>> +{
>> + size_t i, offset1, offset2;
>> + char *s1, *s2, *s3;
>> + int exp_result;
>> +
>> + s1 = (char *) buf1;
>> + for (i = 0; i < page_size - 1; i++)
>> + s1[i] = 23;
>
> Magic number. Does this 23 have any significance?
No, but it actually would be better to use memset and '\1' as I did for
the strncpy test.
>
>> + s1[i] = 0;
>> +
>> + s3 = strdup (s1);
>> + offset2 = 2636;
>
> Magic number. How was it chosen and why?
These offsets were picked from the strncmp test, that came originally
from https://sourceware.org/bugzilla/show_bug.cgi?id=12597
>
>> +
>> + for (i = 0; i < 64; ++i)
>
> Does 64 have some signficance?
> It might be a cache line length, in that you are trying to cross
> the page boundary at every offset in a cache line? And if that's
> the case, maybe 128 would be better, since it covers more
> architectures? (Or, maybe even query the cache line size?)
Yes, I liked the suggestion of using 128.
>
>> + {
>> + offset1 = 3988 + i;
>
> Another magic number >
>> +
>> + if (offset1 >= page_size
>> + || offset2 >= page_size)
>> + return;
>
> Maybe a comment before this stating something like
> "make sure we don't run past the end of the buffer"?
>
> Also, offset2 is fixed, so that part of this test need
> not be in the loop. And, if you pick a good offset2,
> as I suggest below, there is no need to test it at all.
> If the constant to which offset2 is set just happens to
> be greater than page_size, then this entire test does
> nothing.
There is a bug here, it should be "continue" instead of return.
>
>> +
>> + s1 = (char *) buf1;
>> + s2 = s3;
>
> Maybe a comment "reset pointers to base addresses" or the like?
>
>> + s1 += offset1;
>> + s2 += offset2;
>
> So, offset1 ranges from [3988..3988+63] ? offset2 is fixed at 2636.
> Since offset2 is fixed, why not just set "s3 = s2 + offset2"
> before the loop and then this increment is unnecessary.
Agreed, thanks for the suggestion.
>
> Maybe we need to note somewhere that "page_size" is actually
> *2* pages and that doing a strncasecmp from an offset which is
> less than a full page to page_size is guaranteed to cross a page
> boundary.
> It might be a good idea to write code that ensures that the
> offsets are both indeed less than a single page size.
> Something like "offset2 = page_size / 4" and change the
> 3988 to something relative to page_size ("3 * page_size / 4"?).
>
Agreed.
>> +
>> + exp_result= *s1;
>
> strncasecmp man page says it returns "an integer less than, equal to,
> or greater than zero". It doesn't seem to be appropriate to look for
> a specific value.
This is a valid point, but all other functions besides do_random_tests
look for a specific value, I'm just keeping the behavior.
>
>> +
>> + FOR_EACH_IMPL (impl, 0)
>> + {
>> + check_result (impl, s1, s2, page_size, -exp_result);
>> + check_result (impl, s2, s1, page_size, exp_result);
>> + }
>> + }
>> +}
> [snip]
>> diff --git a/string/test-strncpy.c b/string/test-strncpy.c
>> index c978753ad8..30c69dd34b 100644
>> --- a/string/test-strncpy.c
>> +++ b/string/test-strncpy.c
>> @@ -155,6 +155,42 @@ do_test (size_t align1, size_t align2, size_t len, size_t n, int max_char)
> [snip]
>> +static void
>> +do_page_tests (void)
>> +{
>> + size_t i, len, start_offset, offset;
>> + CHAR *s1, *s2;
>> + /* Calculate the null character offset. */
>> + size_t last_offset = (page_size / sizeof (CHAR)) - 1;
>> +
>> + s2 = (CHAR *) buf1;
>> + s1 = (CHAR *) buf2;
>> + MEMSET (s1, '\1', last_offset);
>> + s1[last_offset] = '\0';
>> +
>> + /* Place short strings ending at page boundary. */
>> + offset = last_offset;
>> + for (i = 0; i < 128; i++)
>
> Is 128 enough? Why was this chosen?
It should be SMALL_CHAR.
>
>> + {
>> + offset--;
>> + len = last_offset - offset;
>> + FOR_EACH_IMPL (impl, 0)
>> + do_one_test (impl, s2, (CHAR *) (s1 + offset), len, len);
>> + }
>> +
>> + /* Place long strings ending at page boundary. */
>> + start_offset = (last_offset + 1) / 2;
>> + for (i = 0; i < 64; ++i)
>
> Is 64 enough? Why was this chosen?
This was intended to test each alignment in a cache line, it could use
128 to cover more architectures as suggested for test-strncacecmp.
>
>> + {
>> + offset = start_offset + i;
>> + if (offset >= (last_offset + 1))
>> + break;
>> + len = last_offset - offset;
>> + FOR_EACH_IMPL (impl, 0)
>> + do_one_test (impl, s2, (CHAR *) (s1 + offset), page_size, len);
>> + }
>
> Would it be bad to collapse these 2 loops into one that tests every
> length from 1..page_size?
We could collapse both into one loop with 128 iterations, I don't think
using page_size would be bad but we would cover many alignments that are
already covered by other tests.
>
>> +}
> [snip]
>
> PC
>
Best regards,
--
Raphael Moreira Zinsly
IBM
Linux on Power Toolchain
More information about the Libc-alpha
mailing list