This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[PATCH] Fix up regcomp/regexec


Hi!

When building glibc with trunk gcc, many regex tests fail:
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/runtests.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/runptests.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/bug-regex16.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/bug-regex18.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/bug-regex20.out] Error 1
/[[:lower:]]+/: Unmatched [ or [^
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/transbug.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/tst-rxspencer.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/tst-boost.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/tst-pcre.out] Error 1
The problem is that parse_bracket_symbol is miscompiled, and it turns
out the cause is an incorrect attribute on re_string_fetch_byte_case.
Unlike re_string_peek_byte_case, this function is not pure: it modifies
memory (it increments pstr->cur_idx). With the pure attribute, GCC assumed
it does not, and cached the presumed value of regexp->cur_idx in a variable
across the
  for (;; ++i)
    {
      if (i >= BRACKET_NAME_BUF_SIZE)
        return REG_EBRACK;
      if (token->type == OP_OPEN_CHAR_CLASS)
        ch = re_string_fetch_byte_case (regexp);
      else
        ch = re_string_fetch_byte (regexp);
      if (re_string_eoi(regexp))
        return REG_EBRACK;
      if (ch == delim && re_string_peek_byte (regexp, 0) == ']')
        break;
      elem->opr.name[i] = ch;
    }
re_string_fetch_byte_case (regexp) call and reused that stale value in
re_string_peek_byte, so on e.g. this
#include <regex.h>
#include <stdlib.h>
int
main (void)
{
  regex_t reg;
  if (regcomp (&reg, "x[[:alnum:]]z", 0) != REG_NOERROR)
    abort ();
  return 0;
}
testcase the loop wouldn't terminate on the second ':' character, because
it would see re_string_peek_byte (regexp, 0) return ':' again instead
of ']'.
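
The same hazard can be sketched outside glibc. The following is only an
illustration under stated assumptions, not the glibc code: struct cursor,
cursor_peek, cursor_fetch and scan_class_name are hypothetical names standing
in for re_string_t and its helpers. The peek helper only reads memory, so
marking it pure is fine; the fetch helper advances cur_idx and therefore must
not carry the attribute, otherwise the compiler may reuse a cur_idx value
loaded before the call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for glibc's re_string_t.  */
struct cursor
{
  const char *buf;
  size_t len;
  size_t cur_idx;
};

/* Only reads memory, so the pure attribute is valid here.  */
static unsigned char __attribute__ ((pure))
cursor_peek (const struct cursor *c, size_t off)
{
  return (unsigned char) c->buf[c->cur_idx + off];
}

/* Advances cur_idx, i.e. writes memory.  It must NOT be marked pure:
   with that attribute the compiler may assume cur_idx is unchanged
   across the call and keep using an earlier copy of it.  */
static unsigned char
cursor_fetch (struct cursor *c)
{
  return (unsigned char) c->buf[c->cur_idx++];
}

/* Copy the class name out of text such as "alnum:]" into name.
   Returns the name length, or -1 if no ":]" terminator is found
   or the name does not fit.  */
int
scan_class_name (const char *text, char *name, size_t name_size)
{
  struct cursor c = { text, strlen (text), 0 };
  for (size_t i = 0; i + 1 < name_size; ++i)
    {
      if (c.cur_idx >= c.len)
        return -1;
      unsigned char ch = cursor_fetch (&c);
      if (c.cur_idx >= c.len)
        return -1;
      /* The peek must observe the index updated by the fetch above.  */
      if (ch == ':' && cursor_peek (&c, 0) == ']')
        {
          name[i] = '\0';
          return (int) i;
        }
      name[i] = (char) ch;
    }
  return -1;
}
```

Calling scan_class_name ("alnum:]", buf, sizeof buf) with a 16-byte buf
is expected to stop at the ':' that precedes ']' and leave "alnum" in buf.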

Fixed thusly:

2011-12-30  Jakub Jelinek  <jakub@redhat.com>

	* posix/regex_internal.c (re_string_fetch_byte_case): Remove
	pure attribute.

--- libc/posix/regex_internal.c.jj	2011-11-23 11:06:23.000000000 +0100
+++ libc/posix/regex_internal.c	2011-12-30 16:56:38.948973129 +0100
@@ -868,7 +868,7 @@ re_string_peek_byte_case (const re_strin
 }
 
 static unsigned char
-internal_function __attribute ((pure))
+internal_function
 re_string_fetch_byte_case (re_string_t *pstr)
 {
   if (BE (!pstr->mbs_allocated, 1))

	Jakub

