This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PATCH: New 64 bit wcscmp implementation


Hello,

The example mentioned by Joseph doesn't show that for old wcscmp
version the comparisons are always done in an unsigned type even when
wchar_t is signed.

It's wrong.

> wchar_t w0[] = { WCHAR_MIN, 0 };
> wchar_t w1[] = { WCHAR_MAX, 0 };
>
> int
> main (void)
> {
>  if (wcscmp (w0, w1) < 0)
>    return 0;
>  else
>    abort ();
> }

This example shows some other bug in old version coursed by using edge
values like WCHAR_MIN, WCHAR_MAX, if you will change it to less values
for example -1000, 1000 you will see that wcscmp (w0, w1) < 0.


So, new x86_32 and x86_32 implementations repeat SIGNED comparisons
semantics, but it was so not for all cases. I found some bugs
regarding with it.

I have a patch with fix and I add new test for catching it.
I checked test passes without errors for old wcscmp version, it fails
for both new implementations, but passes successfully after applying
the fix.

I also checked semantics of old version by looking at wcsmbc/wcscmp.c file.

Change log:
2011-09-16  Liubov Dmitrieva  <liubov.dmitrieva@gmail.com>

	* sysdeps/i386/i686/multiarch/wcscmp-sse2.S: Fix.
	Fix wrong unsigned compare semantics in some cases,
	it shall be signed.
	* sysdeps/x86_64/wcscmp.S: Likewise.
	* sysdeps/string/test-strcmp.c: Update.
--
Liubov Dmitrieva
Intel Corporation


2011/8/29 Joseph S. Myers <joseph@codesourcery.com>:
> On Mon, 29 Aug 2011, Dmitrieva Liubov wrote:
>
>> Hello,
>>
>> I've optimized wcscmp function (new implementation is written with
>> assembly language coding), it looks better than current version for
>> all machines with SSE2 supporting I have.
>> It improves performance by up to 3.6X.
>
> Do your versions avoid the bug in the present version that the comparisons
> are done in an unsigned type even though when wchar_t is signed they must
> be done in the signed wchar_t type?
>
> A testcase for this bug for wcscmp is:
>
> #include <stdlib.h>
> #include <wchar.h>
>
> wchar_t w0[] = { WCHAR_MIN, 0 };
> wchar_t w1[] = { WCHAR_MAX, 0 };
>
> int
> main (void)
> {
> ?if (wcscmp (w0, w1) < 0)
> ? ?return 0;
> ?else
> ? ?abort ();
> }
>
> and for wmemcmp:
>
> #include <stdlib.h>
> #include <wchar.h>
>
> wchar_t w0 = WCHAR_MIN;
> wchar_t w1 = WCHAR_MAX;
>
> int
> main (void)
> {
> ?if (wmemcmp (&w0, &w1, 1) < 0)
> ? ?return 0;
> ?else
> ? ?abort ();
> }
>
> Both tests fail at present. ?The relevant standard wording is "Unless
> explicitly stated otherwise, the functions described in this subclause
> order two wide characters the same way as two integers of the underlying
> integer type designated by wchar_t.". ?And, since there might be some
> question over validity of the negative values of wchar_t, C1X adds an
> explicit statement "Arguments to the functions in this subclause may point
> to arrays containing wchar_t values that do not correspond to members of
> the extended character set. Such values shall be processed according to
> the specified semantics, except that it is unspecified whether an encoding
> error occurs if such a value appears in the format string for a function
> in 7.29.2 or 7.29.5 and the specified semantics do not require that value
> to be processed by wcrtomb." to make it clear that like the narrow string
> and memory functions all wchar_t values are valid without regard to their
> semantics in any character set.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
>

Attachment: fix_wcscmp.patch
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]