[PATCH] Speed-up character range regexes by up to 2x
bonzini
paolo.bonzini@polimi.it
Mon Jan 12 13:07:00 GMT 2004
>>What follows is the review of the "gawk guy"'s regex patch:
>>
>> > +#ifdef RE_ENABLE_I18N
>> > int icase = (dfa->mb_cur_max == 1 && (bufp->syntax & RE_ICASE));
>> > +#else
>> > + int icase = (bufp->syntax & RE_ICASE);
>> > +#endif
>>
>>This is unneeded.
>>
>>
I mean that you probably added these fixes when MB_CUR_MAX was used
because somebody got link errors for MB_CUR_MAX undefined; but now,
dfa->mb_cur_max cannot possibly be undefined and will always be 1 if
!RE_ENABLE_I18N.
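
To illustrate the point, here is a minimal sketch rather than the actual regex code. The names demo_dfa, demo_dfa_init, demo_icase, DEMO_ICASE and DEMO_ENABLE_I18N are hypothetical stand-ins, and I am assuming the field is set once at initialization time. Because the field is always assigned, the place that reads it needs no conditional compilation of its own:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real regex structures. */
#define DEMO_ICASE 0x1ul

struct demo_dfa
{
  int mb_cur_max;
};

/* Set the field once. Without I18N support it is pinned to 1, so it can
   never be "undefined" at the places that read it later. */
static void
demo_dfa_init (struct demo_dfa *dfa)
{
  assert (dfa != NULL);
#ifdef DEMO_ENABLE_I18N
  dfa->mb_cur_max = (int) MB_CUR_MAX;
#else
  dfa->mb_cur_max = 1;
#endif
}

/* The reader side: one expression serves both configurations, so no
   #ifdef is needed here. */
static int
demo_icase (const struct demo_dfa *dfa, unsigned long syntax)
{
  return dfa->mb_cur_max == 1 && (syntax & DEMO_ICASE) != 0;
}
```

With DEMO_ENABLE_I18N undefined, demo_icase reduces to testing the syntax bit alone, which is what the added #else branch in the patch spells out by hand.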
>>> @@ -2558,8 +2564,8 @@
>>> ? __btowc (start_ch) : start_elem->opr.wch);
>>> end_wc = ((end_elem->type == SB_CHAR || end_elem->type == COLL_SYM)
>>> ? __btowc (end_ch) : end_elem->opr.wch);
>>> - cmp_buf[0] = start_wc;
>>> - cmp_buf[4] = end_wc;
>>> + cmp_buf[0] = start_wc != WEOF ? start_wc : start_ch;
>>> + cmp_buf[4] = end_wc != WEOF ? end_wc : end_ch;
>>> if (wcscoll (cmp_buf, cmp_buf + 4) > 0)
>>> return REG_ERANGE;
>>
>>I am not sure this is the fix; maybe it is better not to include the
>>character set if start_wc == WEOF || end_wc == WEOF, or to return
>>REG_ERANGE?
>>
>>
>I got WEOF and things core dumped (or equivalent).
>
>
Yes, I agree that the WEOF case must be handled.
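
For comparison, a rough sketch of the "return REG_ERANGE" alternative I suggested, as opposed to falling back to the raw byte as the patch does. This is not the real code: demo_check_range and the demo_status values are hypothetical, it uses the public btowc instead of __btowc, and it only covers single-byte endpoints. Which of the options is actually right is still open.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical status codes standing in for REG_NOERROR / REG_ERANGE. */
enum demo_status { DEMO_OK = 0, DEMO_ERANGE = 1 };

/* Compare two single-byte range endpoints by collation order, but
   refuse the range if either byte has no wide-character equivalent. */
static enum demo_status
demo_check_range (unsigned char start_ch, unsigned char end_ch)
{
  /* Two one-character strings, at offsets 0 and 4, each followed by a
     terminating L'\0'. */
  wchar_t cmp_buf[6] = { L'\0', L'\0', L'\0', L'\0', L'\0', L'\0' };
  wint_t start_wc = btowc (start_ch);
  wint_t end_wc = btowc (end_ch);

  /* btowc returns WEOF when the byte is not a valid character in the
     current locale; never store that value into cmp_buf. */
  if (start_wc == WEOF || end_wc == WEOF)
    return DEMO_ERANGE;

  cmp_buf[0] = (wchar_t) start_wc;
  cmp_buf[4] = (wchar_t) end_wc;
  assert (cmp_buf[1] == L'\0' && cmp_buf[5] == L'\0');

  return wcscoll (cmp_buf, cmp_buf + 4) > 0 ? DEMO_ERANGE : DEMO_OK;
}
```

In the default "C" locale, demo_check_range ('a', 'z') is accepted and demo_check_range ('z', 'a') is rejected as a reversed range. Whether a byte with the high bit set yields WEOF depends on the locale and the libc version, so I would not rely on any particular result there.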
Paolo