Bug 11244 - re_compile_pattern fails to diagnose [b-a] as an invalid range
Status: RESOLVED INVALID
Alias: None
Product: glibc
Classification: Unclassified
Component: regex
Version: 2.11
Importance: P2 normal
Target Milestone: ---
Assignee: Ulrich Drepper
Reported: 2010-02-03 16:47 UTC by jim@meyering.net
Modified: 2014-06-30 18:51 UTC
fweimer: security-


Description jim@meyering.net 2010-02-03 16:47:29 UTC
Tested on Rawhide with glibc-2.11.90-10.x86_64:

$ cat regex-check.c
#include <config.h>
#include <limits.h>
#include <regex.h>
#include <string.h>

int
main (int argc, char **argv)
{
  char *regexp = (2 <= argc ? argv[1] : "a[b-a]");
  struct re_pattern_buffer regex;
  /* Ensure that [b-a] is diagnosed as invalid. */
  re_set_syntax (RE_SYNTAX_POSIX_EGREP);
  memset (&regex, 0, sizeof regex);
  const char *s = re_compile_pattern (regexp, strlen (regexp), &regex);
  return s == NULL;
}
$ echo '#define _GNU_SOURCE 1' > config.h
$ gcc -g -Wall -W -Wextra regex-check.c -I.
$ ./a.out
[Exit 1]

It looks like the code intends to diagnose that condition with REG_ERANGE,
but something is not working:

      start_collseq = lookup_collation_sequence_value (start_elem);
      end_collseq = lookup_collation_sequence_value (end_elem);
      /* Check start/end collation sequence values.  */
      if (BE (start_collseq == UINT_MAX || end_collseq == UINT_MAX, 0))
	return REG_ECOLLATE;
      if (BE ((syntax & RE_NO_EMPTY_RANGES) && start_collseq > end_collseq, 0))
	return REG_ERANGE;
Comment 1 Andreas Schwab 2010-02-03 17:12:32 UTC
(RE_SYNTAX_POSIX_EGREP & RE_NO_EMPTY_RANGES) == 0
Comment 2 jim@meyering.net 2010-02-03 17:47:33 UTC
(In reply to comment #1)
> (RE_SYNTAX_POSIX_EGREP & RE_NO_EMPTY_RANGES) == 0

Thanks.
The real problem is that the non-_LIBC code (which is used in gnulib) performs
the range test regardless of whether RE_NO_EMPTY_RANGES is set.  The non-_LIBC
function currently lacks access to the "syntax" variable.