V2 [PATCH 1/4] strncmp: Add a testcase for page boundary [BZ #25933]

Paul A. Clarke pc@us.ibm.com
Wed Jun 17 16:02:53 GMT 2020


On Tue, Jun 16, 2020 at 02:30:37PM -0700, H.J. Lu via Libc-alpha wrote:
> On Tue, Jun 16, 2020 at 2:21 PM Paul A. Clarke <pc@us.ibm.com> wrote:
> >
> > On Tue, Jun 16, 2020 at 01:42:04PM -0700, H.J. Lu via Libc-alpha wrote:
> > > On Tue, Jun 16, 2020 at 11:25 AM Paul A. Clarke <pc@us.ibm.com> wrote:
> > > >
> > > > On Mon, Jun 15, 2020 at 04:06:58PM -0700, H.J. Lu via Libc-alpha wrote:
> > > > > On Mon, Jun 15, 2020 at 3:03 PM Paul A. Clarke <pc@us.ibm.com> wrote:
> > > > > >
> > > > > > On Mon, Jun 15, 2020 at 02:34:13PM -0700, H.J. Lu via Libc-alpha wrote:
> > > > > > > On Mon, Jun 15, 2020 at 1:29 PM Paul A. Clarke <pc@us.ibm.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Jun 12, 2020 at 01:10:53PM -0700, H.J. Lu via Libc-alpha wrote:
> > > > > > > > > Add a strncmp testcase to cover cases where one of strings ends on the
> > > > > > > > > page boundary.
> > > > > > > > > ---
> > > > > > > > >  string/test-strncmp.c | 25 +++++++++++++++++++++++++
> > > > > > > > >  1 file changed, 25 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/string/test-strncmp.c b/string/test-strncmp.c
> > > > > > > > > index d961ac4493..d0928a2864 100644
> > > > > > > > > --- a/string/test-strncmp.c
> > > > > > > > > +++ b/string/test-strncmp.c
> > > > > > > > > @@ -403,6 +403,30 @@ check2 (void)
> > > > > > > > >    free (s2);
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +static void
> > > > > > > > > +check3 (void)
> > > > > > > > > +{
> > > > > > > > > +  size_t size = 32 * 4;
> > > > > > > > > +  CHAR *s1 = (CHAR *) (buf1 + (BUF1PAGES - 1) * page_size);
> > > > > > > > > +  CHAR *s2 = (CHAR *) (buf2 + (BUF1PAGES - 1) * page_size);
> > > > > > > > > +  int exp_result;
> > > > > > > > > +
> > > > > > > > > +  memset (s1, 'a', page_size);
> > > > > > > > > +  memset (s2, 'a', page_size);
> > > > > > > > > +  s1[(page_size / CHARBYTES) - 1] = (CHAR) 0;
> > > > > > > > > +
> > > > > > > > > +  for (size_t s = 99; s <= size; s++)
> > > > s 99..128
> > > > > > > > > +    for (size_t s1a = 31; s1a < 32; s1a++)
> > > > s1a = 31
> > > > > > > > > +      for (size_t s2a = 30; s2a < 32; s2a++)
> > > > s2a = 30,31
> > > > > > > > > +     {
> > > > > > > > > +       CHAR *s1p = s1 + (page_size / CHARBYTES - s) - s1a;
> > > > > > > > > +       CHAR *s2p = s2 + (page_size / CHARBYTES - s) - s2a;
> > > > > > > > > +       exp_result = SIMPLE_STRNCMP (s1p, s2p, s);
> > > > > > > > > +       FOR_EACH_IMPL (impl, 0)
> > > > > > > > > +         check_result (impl, s1p, s2p, s, exp_result);
> > > > > > > > > +     }
> > > > > > > > > +}
> > > > > > > >
> > > > > > > > There are lots of magic numbers here.
> > > > > > > >
> > > > > > > > Could you add some context around those number
> > > > > > >
> > > > > > > My commit log says
> > > > > > >
> > > > > > > ---
> > > > > > > Add a strncmp testcase to cover cases where one of strings ends on the
> > > > > > > page boundary.
> > > > > > > ---
> > > > > >
> > > > > > Which says nothing about why you need to test over 90000 different
> > > > >
> > > > > Loops in check3 have about 60, not 90000, different cases
> > > > > according to my calculation.
> > > >
> > > > OK. I saw the magic 99, 31, and 30, and didn't account for the magic 32s.
> > > > The second "loop" doesn't even loop, as it's a single iteration at 31. (Why?)
> > >
> > > I can remove the second loop.
> > >
> > > > The third loop is just 30 and 31.
> > > >
> > > > It's all needlessly complex and confusing.
> > >
> > > It is designed to trigger the bug.
> >
> > What bug??
> 
> It is in subject: [PATCH 1/4] strncmp: Add a testcase for page
> boundary [BZ #25933]
> 
>                                                ^^^^^^^^^^

Sorry, I totally missed that.

_Digesting_ the information in that bug is a challenge for another day.
Synopsis is something like "there's a problem when comparing strings
which are about 128 bytes and end near a page boundary", is that fair?

BTW, I added Adhemerval to CC, since he was active in the bug.

> > > > > > cases of a string ending on a page boundary, nor what any of the
> > > > > > magic numbers represent.
> > > > >
> > > > > AVX vector size is 32 bytes.  Each AVX2 loop iteration processes
> > > > > 4 * 32 bytes.   check3 covers cases where one of strings ends on
> > > > > the page boundary with the maximum string length less than the
> > > > > number bytes of each AVX2 loop iteration and different offsets from
> > > > > page boundary.   Here is the updated patch with added comments.
> > > >
> > > > I suggest making the implementation more generic.  "32" isn't magic
> > > > for every architecture.
> > >
> > > But it is the key to trigger the bug.
> > >
> > > > Your v2 suggests:
> > > > > +  /* Check AVX2 loop unrolling with the maximum string length less
> > > > > +     than 4 * 32 bytes and different offsets from page boundary.  */
> > > >
> > > > Is it common to include architecture-specific comments (above) and code
> > > > (below) in common code?
> > >
> > > I simply explain why the testcase is written this way.
> > >
> > > > > +  for (size_t s = 99; s <= size; s++)
> > > > > +    for (size_t s1a = 31; s1a < 32; s1a++)
> > > > > +      for (size_t s2a = 30; s2a < 32; s2a++)
> > > > > +       {
> > > > > +         CHAR *s1p = s1 + (page_size / CHARBYTES - s) - s1a;
> > > > > +         CHAR *s2p = s2 + (page_size / CHARBYTES - s) - s2a;
> > > > > +         exp_result = SIMPLE_STRNCMP (s1p, s2p, s);
> > > > > +         FOR_EACH_IMPL (impl, 0)
> > > > > +           check_result (impl, s1p, s2p, s, exp_result);
> > > > > +       }
> > > >
> > > > If you just want to approach the end of the page from different offsets,
> > > > would something like the following suffice (not tested)?
> > > >
> > > > +  /* Pick a length which hopefully encompasses most cache line lengths
> > > > +     and vector sizes including loop unrolling.  */
> > > > +  for (size_t s = 256; s; s--)
> > > > +    {
> > > > +      CHAR *s1p = s1 + (page_size / CHARBYTES - s) + 1;
> > >
> > > When s <= 1, s1p will point beyond the end of buf1.
> > >
> > > > +      CHAR *s2p = s2 + (page_size / CHARBYTES - 256);
> > > > +      exp_result = SIMPLE_STRNCMP (s1p, s2p, s);
> > > > +      FOR_EACH_IMPL (impl, 0)
> > > > +        check_result (impl, s1p, s2p, s, exp_result);
> > > > +      exp_result = SIMPLE_STRNCMP (s2p, s1p, s);
> > > > +      FOR_EACH_IMPL (impl, 0)
> > > > +        check_result (impl, s2p, s1p, s, exp_result);
> > > > +    }
> > > >
> > > > If 256 isn't enough in the general case, maybe just use something O(page_size).
> > >
> > > This loop will trigger the bug.  The key is 31/30.
> > >
> > >  /* Pick a length which hopefully encompasses most cache line lengths
> > >      and vector sizes including loop unrolling.  */
> > >   for (size_t s = 256; s; s--)
> > >     {
> > >       CHAR *s1p = s1 + (page_size / CHARBYTES - s) - 31;
> > >       CHAR *s2p = s2 + (page_size / CHARBYTES - s) - 30;
> > >       exp_result = SIMPLE_STRNCMP (s1p, s2p, s);
> > >       {
> > >         FOR_EACH_IMPL (impl, 0)
> > >           check_result (impl, s1p, s2p, s, exp_result);
> > >       }
> > >       exp_result = SIMPLE_STRNCMP (s2p, s1p, s);
> > >       {
> > >         FOR_EACH_IMPL (impl, 0)
> > >           check_result (impl, s2p, s1p, s, exp_result);
> > >       }
> > >     }
> >
> > I think it would help to explain the exact scenario that triggers "the bug".
> >
> > The 31/30 are still magic numbers with no significance to anyone looking
> > at the code.
> >
> > Can you craft a reasonably generic loop that covers the case you need?
> > Or, should you create an arch-specific test?
> >
> 
> How about this one?
> 
>   for (size_t s = 256; s; s--)
>     for (size_t s1a = 0; s1a < 32; s1a++)
>       for (size_t s2a = 0; s2a < 32; s2a++)
>         {
>           CHAR *s1p = s1 + (page_size / CHARBYTES - s) - s1a;
>           CHAR *s2p = s2 + (page_size / CHARBYTES - s) - s2a;
>           exp_result = SIMPLE_STRNCMP (s1p, s2p, s);
>           FOR_EACH_IMPL (impl, 0)
>             check_result (impl, s1p, s2p, s, exp_result);
>         }

So that's a 1/4 million tests.

Is there a smaller subset which covers your needs?

Something like this perhaps? (not tested)

     for (size_t s1a = 0; s1a < 128; s1a++)
       for (size_t s2a = s1a; s2a < 128; s2a++)
         {
           CHAR *s1p = s1 + page_size / CHARBYTES - s1a;
           CHAR *s2p = s2 + page_size / CHARBYTES - s2a;
           exp_result = SIMPLE_STRNCMP (s1p, s2p, s2a);
           FOR_EACH_IMPL (impl, 0)
             check_result (impl, s1p, s2p, s2a, exp_result);
         }

PC


More information about the Libc-alpha mailing list