This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] regexec: Fix off-by-one bug in weight comparison [BZ #23036]

From: Florian Weimer <fw at deneb dot enyo dot de>
To: Carlos O'Donell <carlos at redhat dot com>
Cc: Florian Weimer <fweimer at redhat dot com>, libc-alpha at sourceware dot org
Date: Mon, 09 Jul 2018 21:18:57 +0200
Subject: Re: [PATCH] regexec: Fix off-by-one bug in weight comparison [BZ #23036]
References: <20180709172046.D1DA743994575@oldenburg.str.redhat.com> <9e8fe218-4dfa-d6ce-4ce7-16efc41d30d6@redhat.com>

* Carlos O'Donell:

>> -			while (cnt <= weight_len
>> -			       && (weights[equiv_class_idx + 1 + cnt]
>> -				   == weights[idx + 1 + cnt]))
>
> Here we start at count 0 and go to <= weight_len.
>
> This is one byte too far.
>
> In an N-length weight:
>
> |L01234...[N-1]|
>  ^^        ^
>  ||        |--- weights[idx + 1 + (weight_len - 1)]
>  ||--- weights[idx + 1]
>  |--- weights[idx]
>
> L == N == weight_len
>
> So the loop for cnt <= weight_len goes one byte beyond the weights array.

Right.  I wasn't able to derive the data layout from
locale/programs/ld-collate.c, but there's this code in
string/strxfrm_l.c:

/* Find next weight and rule index.  Inlined since called for every char.  */
static __always_inline size_t
find_idx (const USTRING_TYPE **us, int32_t *weight_idx,
	  unsigned char *rule_idx, const locale_data_t *l_data, const int pass)
{
  int32_t tmp = findidx (l_data->table, l_data->indirect, l_data->extra, us,
			 -1);
  *rule_idx = tmp >> 24;
  int32_t idx = tmp & 0xffffff;
  size_t len = l_data->weights[idx++];

  /* Skip over indices of previous levels.  */
  for (int i = 0; i < pass; i++)
    {
      idx += len;
      len = l_data->weights[idx++];
    }

  *weight_idx = idx;
  return len;
}

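This makes it clear that the length element does not count itself in
the length: weights[idx++] reads the length and steps past it, and
idx += len then skips exactly len weights, landing on the next length
element.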
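
For illustration, here is a minimal standalone sketch of that layout and
of a comparison loop with the corrected bound.  It is not the glibc
code: demo_weights, weights_equal, next_entry and demo_self_check are
hypothetical names, and the table contents are made up.  The only thing
taken from the code above is the assumption that each entry is one
length element followed by exactly that many weights.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table.  Each entry is a length element L followed by
   exactly L weights; L does not count itself.  */
static const unsigned char demo_weights[] = {
  3, 10, 20, 30,	/* entry at index 0 */
  3, 10, 20, 30,	/* entry at index 4, same weights as entry 0 */
  3, 10, 20, 31,	/* entry at index 8, last weight differs */
  2, 10, 20		/* entry at index 12, shorter */
};

/* Return the index of the length element of the entry that follows
   the entry whose length element is at IDX.  */
static size_t
next_entry (const unsigned char *weights, size_t idx)
{
  return idx + 1 + weights[idx];
}

/* Return true if the entries whose length elements are at A and B
   hold the same weight sequence.  The loop bound is cnt < len: the
   valid weights are weights[a + 1] .. weights[a + 1 + (len - 1)].
   With cnt <= len the loop would also read weights[a + 1 + len],
   which belongs to the next entry or lies past the table.  */
static bool
weights_equal (const unsigned char *weights, size_t a, size_t b)
{
  size_t len = weights[a];
  if (len != weights[b])
    return false;
  size_t cnt = 0;
  while (cnt < len && weights[a + 1 + cnt] == weights[b + 1 + cnt])
    ++cnt;
  return cnt == len;
}

/* Small self-check over the hypothetical table.  */
static void
demo_self_check (void)
{
  assert (next_entry (demo_weights, 0) == 4);
  assert (next_entry (demo_weights, 12) == sizeof demo_weights);
  assert (weights_equal (demo_weights, 0, 4));
  assert (!weights_equal (demo_weights, 0, 8));
  assert (!weights_equal (demo_weights, 0, 12));
}
```

In this sketch, next_entry (demo_weights, 12) lands exactly on the end
of the table, so a loop running to cnt <= len on the last entry would
read one element past it.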
