[PATCH/RFA] Fix ctype table and isblank

Jeff Johnston jjohnstn@redhat.com
Wed Apr 8 18:21:00 GMT 2009


Corinna Vinschen wrote:
> On Apr  8 09:13, Wizards' Guild wrote:
>   
>> Yes, we really should have an "alpha" flag AND a "blank" flag with the
>> new semantics.
>>     
>
> An "alpha" and "printable" flag.  We already have "blank" so I don't
> understand what you're trying to accomplish with a new one.
>
>   
>> The obvious and common approach would be to widen the
>> table entries. I'm not a big fan of this because it bloats the
>> small-footprint systems. This is maybe why it hasn't been done
>> already?
>>     
>
> Keep in mind that until a couple of days ago newlib didn't even *have*
> extended charset support.  The problems we're discussing are a direct
> result of having more than just ASCII tables.
>
> The problem is smaller targets.  Maybe it would be a feasible approach
> to stick to one single 8 bit table (for ASCII) in case of small targets
> and to provide everything as 16 bit tables for targets which don't care
> about the few extra KB.
>
> And widening the tables introduces a new problem for Cygwin, which has
> to keep some of the "old" stuff to maintain backward compatibility with
> applications built under an older version.  It would have to maintain
> the old ASCII ctype table, and copy over the data
> from the new tables in a more tricky way; we would have to keep the
> meaning of the current bit values and only extend flags to the upper
> bytes so that Cygwin can deal with that for older apps.  New apps would
> immediately profit from the new tables, of course.
>
> However, *if* we really do that, now would be the time.  Cygwin is on
> the verge of a new major release and since the extended ctype support is
> only available starting with this new release, we could use the
> opportunity to widen the character class tables now.
>
>   
>> Another approach would be to keep the "C" locale table and macros, but
>> if extended charsets are supported just convert everything to UNICODE
>> and handle it there.
>>     
>
> Oh, please no...
>
>   
>> Some functions, such as tolower, are already
>> using this approach.
>>     
>
> Only for chars > 0x80.  But that was meant as a temporary solution.
>
>   
>>  I don't care for the "half hardcoded" variation of
>> isblank; seems like an accident waiting to happen.
>>     
>
> It's exactly what we need for an *immediate* fix.  _B covers all types of
> spaces (SPC, NBSP), and the TAB goes extra so as not to be caught by
> isprint().
>
>   
>> Both _N and _X are locale-invariant, making them good candidates for
>> removal from the ctype table. If we wanted to recover ONE flag, I'd
>> take _N rather than _X. In this case add _X to the digits and use _X
>> in isprint, isgraph, and isalnum. It is possible to implement isdigit
>> as a standard macro with single evaluation:
>>
>> #define	isdigit(c) ((unsigned)((c)-'0')<=9)
>>
>> Hard choices...
>>     
>
> I don't think we have much of a choice.  Either we stick to the current
> approach and my isblank() fix, or we widen the tables.
>
>   
I think the following makes sense, given that isalpha isn't properly
supported in the present scheme.

1. Widen the tables (16 bits) and create a new ctype ptr with a new name.
2. Keep the old ctype ptr pointing to the old ASCII table.
3. Support either the new or the old mechanism in ctype.h based on a
   sys/config.h flag (e.g. _ASCII_ONLY).
4. Support isblank in the old mechanism as proposed, but add an
   intermediate variable to avoid evaluating the argument more than once,
   or call an internal function (int d = (c), __ctype_ptr[d+1] ...).
5. Cygwin will set itself up to use the new mechanism
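
As a rough sketch of item 4 (not the actual newlib change), the
single-evaluation isblank could go through a small internal function.
All names here (SK_B, sk_ctype, sk_ctype_ptr, sk_isblank,
isblank_sketch) are made up for illustration, and the table only marks
SPC and NBSP, assuming an ISO-8859-1 style layout:

```c
/* Hypothetical stand-in for the "_B" (blank) flag bit. */
#define SK_B 0200

/* Hypothetical 8-bit table: slot 0 is for EOF (-1), slot c+1 for char c.
   TAB is deliberately NOT marked, so an isprint() built on this bit
   would not accept it. */
static const unsigned char sk_ctype[1 + 256] = {
    [1 + ' ']  = SK_B,   /* SPC */
    [1 + 0xA0] = SK_B,   /* NBSP */
};

static const unsigned char *sk_ctype_ptr = sk_ctype;

/* The argument is evaluated once at the call site and copied into d,
   so side effects such as isblank_sketch(*p++) happen exactly once. */
static inline int sk_isblank(int c)
{
    int d = c;
    return d == '\t' || (sk_ctype_ptr[d + 1] & SK_B) != 0;
}

#define isblank_sketch(c) sk_isblank(c)
```

An inline function is used here because "int d = (c), ..." cannot be
written as a plain C expression; a compiler-specific statement
expression would be another option.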

Existing code that doesn't care about additional charsets won't expand
in size and will continue to work as before. Old code wasn't using
isblank before, so I am not concerned that the macro is a little
different from the other isxxxx macros. Old code that isn't recompiled
will work as before (ASCII) and will continue to link (no change in
size). Code wishing to get the new locale support will have the
platform set the new flag and recompile, which is reasonable (at a major
release point or on a new platform altogether).
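
To illustrate the compatibility constraint raised earlier in the thread
(current bit values keep their meaning, new flags go in the upper byte),
here is a minimal sketch. The flag names and values (SK_U, SK_L,
SK_ALPHA) and the helper functions are assumptions for illustration, not
existing newlib definitions:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative "old" flags that stay in the low byte. */
#define SK_U     0x0001   /* upper case */
#define SK_L     0x0002   /* lower case */
/* Hypothetical new flag, placed in the upper byte only. */
#define SK_ALPHA 0x0100

/* Derive a legacy 8-bit table from the wide one by keeping the low
   byte; old binaries would see exactly the flags they already know. */
static void sk_narrow(const uint16_t *wide, unsigned char *old, size_t n)
{
    for (size_t i = 0; i < n; i++)
        old[i] = (unsigned char)(wide[i] & 0xFF);
}

/* New-style lookup: slot c+1 holds the flags for char c (slot 0 = EOF). */
static int sk_isalpha_wide(const uint16_t *wide, int c)
{
    return (wide[c + 1] & SK_ALPHA) != 0;
}
```

With this layout a character that is alphabetic but has no case can be
flagged in the wide table without disturbing what the old table reports.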

Comments?

-- Jeff J.
> Corinna
>
>   


