ctype macros broken on 64-bits builds?
Thu Jul 24 18:02:00 GMT 2008
Martijn van Buul wrote:
> I'm using newlib on x86_64-elf, and I've run into problems with the various
> is...() macros in ctype.h. According to C90 (and C99, and possibly earlier
> standards before that..) these macros/functions are required to accept an
> integer input with range [-1 .. 255]. It appears they are currently broken
> for 64-bit targets. As an example, I used isalpha(), but the others have
> exactly the same problem:
> isalpha() is defined in ctype as:
> #define isalpha(c) ((__ctype_ptr)[(unsigned)(c)]&(_U|_L))
> where __ctype_ptr points to element 1 of a 257-entry array, so
> __ctype_ptr[-1] is actually valid.
> This works for targets with 32-bit pointers and 32-bit integers,
> as accessing element [-1] from an array will access exactly the same memory
> as accessing element [(unsigned)(-1)], as there will be an implicit
> overflow. Assuming a char foo, I'd get:
> &foo: 0xbd4de86
> &foo[-1]: 0xbd4de85
> &foo[(unsigned)(-1)]: 0xbd4de85
> However, this no longer works on a platform with 32-bit integers and 64-bit
> pointers (like x86_64..), since the implicit overflow will not occur:
> &foo: 0x28000109c30
> &foo[-1]: 0x28000109c2f
> &foo[(unsigned)(-1)]: 0x28100109c2f
> Note how the [(unsigned) (-1)] address ended up 4GB -1 beyond the first
> element, instead of just before it.
> All in all, this means that using any of the ctype(3) macros with -1
> as an argument will cause a segmentation fault, where it should have been
> defined behaviour.
> It is the explicit cast to unsigned that's causing the problem here, as
> using (signed) would've yielded the expected result:
> &foo[(signed)(-1)]: 0x28000109c2f
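The address arithmetic quoted above can be reproduced with plain integers,
without touching any out-of-bounds memory. This is only an illustrative
sketch: the helper names addr32, addr64_unsigned and addr64_signed are
hypothetical, and it assumes a 32-bit unsigned int, as on x86_64.

```c
#include <stdint.h>

/* 32-bit pointers: the sum wraps modulo 2^32, so adding (unsigned)(-1)
   lands one byte before the base address. */
uint32_t addr32(uint32_t base, int c)
{
    return base + (uint32_t)(unsigned)c;
}

/* 64-bit pointers, index cast to unsigned: the 32-bit index is
   zero-extended, so -1 becomes +4294967295 and no wrap occurs. */
uint64_t addr64_unsigned(uint64_t base, int c)
{
    return base + (uint64_t)(unsigned)c;
}

/* 64-bit pointers, signed index: -1 is sign-extended, so the sum
   lands one byte before the base address again. */
uint64_t addr64_signed(uint64_t base, int c)
{
    return base + (uint64_t)(int64_t)c;
}
```

Fed with the base addresses from the message and c = -1, these helpers give
the same three results as the quoted &foo[...] values.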
> I rewrote all appropriate macros in ctype.h to cast to (signed) instead of
> (unsigned), with no adverse effects. My code no longer crashes now, but my
> testbed is limited so I don't know if this might break other targets.
> The alternative option would be to do what the rest of the world has been
> doing for a while (including the BSDs, from which this ctype.* seems to
> have borrowed quite a bit), and rewrite isalpha and friends to
> #define isalpha(c) ((__ctype_ptr)[(unsigned)((c) + 1)]&(_U|_L))
> with __ctype_ptr pointing at element 0 of the array in ctype/ctype_.c,
> instead of at element 1.
Thanks for catching this.
I have checked in the accompanying patch which implements the
alternative you mention above. To prevent breakage in existing code, I
have created a new pointer: __ctype_ptr__ and changed the ctype
macros/functions to use it.
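As a rough illustration of the offset-by-one scheme (not the checked-in
patch itself), the sketch below keeps a 257-entry table whose slot 0 serves
-1 and whose slots 1..256 serve 0..255. All names here (my_ctype_table,
my_ctype_init, my_isalpha, MY_U, MY_L) are hypothetical stand-ins for the
newlib ones, only the upper/lower flags are filled in, and an ASCII
execution character set is assumed.

```c
#define MY_U 01   /* upper-case letter */
#define MY_L 02   /* lower-case letter */

/* Slot 0 is for -1 (EOF); slots 1..256 are for character values 0..255. */
static unsigned char my_ctype_table[257];

/* Fill the table; assumes ASCII, where A-Z and a-z are contiguous. */
void my_ctype_init(void)
{
    int c;

    my_ctype_table[0] = 0;              /* EOF has no class */
    for (c = 0; c < 256; c++) {
        unsigned char flags = 0;
        if (c >= 'A' && c <= 'Z')
            flags |= MY_U;
        if (c >= 'a' && c <= 'z')
            flags |= MY_L;
        my_ctype_table[c + 1] = flags;
    }
}

/* The index (c) + 1 is never negative for c in [-1 .. 255], so the
   cast to unsigned cannot produce a huge offset on 64-bit targets. */
#define my_isalpha(c) (my_ctype_table[(unsigned)((c) + 1)] & (MY_U | MY_L))
```

After calling my_ctype_init(), my_isalpha(-1) indexes slot 0 and yields 0
instead of reading far outside the table.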
Cygwin folks will probably need to add __ctype_ptr__ to the list of
If anybody finds any problems, just let me know.
-- Jeff J.