Bug 3957 - regcomp with REG_NEWLINE flag does not operate as POSIX specification for a non-matching list
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: regex
Version: 2.4
Importance: P2 normal
Target Milestone: ---
Assignee: Jakub Jelinek
Reported: 2007-02-02 13:23 UTC by Andy
Modified: 2018-04-20 14:01 UTC
fweimer: security-


Description Andy 2007-02-02 13:23:02 UTC
Given the string ‘foo\nbar’ (where \n is a linefeed) the regular expression 
‘foo[^ ]+’ matches the complete string. The regex is compiled with REG_EXTENDED 
and REG_NEWLINE flags.

The POSIX specification at 
http://www.opengroup.org/onlinepubs/009695399/functions/regcomp.html

says of the REG_NEWLINE flag:

“A <newline> in string shall not be matched by a period outside a bracket 
expression or by any form of a non-matching list”

Older versions of glibc (glibc-2.1.3) behave as the POSIX specification 
requires. glibc-2.5, and every version back to at least glibc-2.3.2, does not.

The following code demonstrates the issue:

#include <stdio.h>
#include <sys/types.h>
#include <regex.h>

int main(int argc, char **argv)
{
  char regex[] = "foo[^ ]+";
  char text[] = "foo\nbar";
  regex_t preg;
  regmatch_t pmatch[1];
  int flags = REG_EXTENDED | REG_NEWLINE;
  int i;

  printf("About to compile regexp '%s', with flags %d\n", regex, flags);

  if(!regcomp(&preg, regex, flags))
  {  
    printf("About to search string '%s'\n", text);

    if(!regexec(&preg, text, 1, pmatch, 0))
    {
      printf("Regex matched, match text is '");
      for(i = pmatch[0].rm_so; i < pmatch[0].rm_eo; i++)
      {
        printf("%c", text[i]); 
      }
      printf("'\n");
    }
    else
    {
      printf("Regex did not match\n");
    }
    
    regfree(&preg);
  }
  else
  {
    printf("Failed to compile regex\n");
  }

  return 0;  
}

On glibc-2.3.2, glibc-2.3.6 or glibc-2.5 the program gives:

About to compile regexp 'foo[^ ]+', with flags 5
About to search string 'foo
bar'
Regex matched, match text is 'foo
bar'

On glibc-2.1.3 and on other C libraries, such as the one found on Solaris 9, the output is:

About to compile regexp 'foo[^ ]+', with flags 9
About to search string 'foo
bar'
Regex did not match

This, I believe, is the expected POSIX behaviour.
Comment 1 Jakub Jelinek 2007-02-05 13:42:47 UTC
Testing a fix.
Comment 2 Ulrich Drepper 2007-02-05 15:24:15 UTC
Fixed upstream.
Comment 3 Sourceware Commits 2007-07-12 14:50:30 UTC
Subject: Bug 3957

CVSROOT:	/cvs/glibc
Module name:	libc
Branch: 	glibc-2_5-branch
Changes by:	jakub@sourceware.org	2007-07-12 14:50:17

Modified files:
	.              : ChangeLog 
	posix          : Makefile regcomp.c 
Added files:
	posix          : bug-regex27.c bug-regex28.c 

Log message:
	2007-02-05  Jakub Jelinek  <jakub@redhat.com>
	
	[BZ #3957]
	* posix/regcomp.c (parse_bracket_exp): Set '\n' bit rather than '\0'
	bit for RE_HAT_LISTS_NOT_NEWLINE.
	(build_charclass_op): Remove bogus comment.
	* posix/Makefile (tests): Add bug-regex27 and bug-regex28.
	* posix/bug-regex27.c: New test.
	* posix/bug-regex28.c: New test.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/ChangeLog.diff?cvsroot=glibc&only_with_tag=glibc-2_5-branch&r1=1.10362.2.39&r2=1.10362.2.40
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/posix/bug-regex27.c.diff?cvsroot=glibc&only_with_tag=glibc-2_5-branch&r1=NONE&r2=1.1.6.1
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/posix/bug-regex28.c.diff?cvsroot=glibc&only_with_tag=glibc-2_5-branch&r1=NONE&r2=1.1.6.1
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/posix/Makefile.diff?cvsroot=glibc&only_with_tag=glibc-2_5-branch&r1=1.193.2.1&r2=1.193.2.2
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/posix/regcomp.c.diff?cvsroot=glibc&only_with_tag=glibc-2_5-branch&r1=1.112&r2=1.112.2.1
