Bug in collation functions?

Eric Blake eblake@redhat.com
Thu Oct 29 18:09:00 GMT 2015


On 10/29/2015 10:13 AM, Ken Brown wrote:

> Never mind.  My test case was flawed, because it didn't check for the
> possibility that wcscoll might return 0.  Here's a revised definition of
> the "compare" function:
> 
> void
> compare (const wchar_t *a, const wchar_t *b, const char *loc)
> {
>   setlocale (LC_COLLATE, loc);
>   int res = wcscoll (a, b);
>   char c = res < 0 ? '<' : res > 0 ? '>' : '=';
>   printf ("\"%ls\" %c \"%ls\" in %s locale\n", a, c, b, loc);
> }
> 
> With this change (and the use of NORM_IGNORESYMBOLS) the test returns
> the following on Cygwin:
> 
> $ ./wcscoll_test
> "11" > "1.1" in POSIX locale
> "11" = "1.1" in en_US.UTF-8 locale
> "11" > "1 2" in POSIX locale
> "11" < "1 2" in en_US.UTF-8 locale
> 
> It still differs from Linux, but it's good enough to make the emacs test
> pass.  Moreover, this behavior actually seems more reasonable to me than
> the Linux behavior.  After all, if you're ignoring punctuation, how can
> you decide which of "11" or "1.1" comes first?

Careful.  POSIX is proposing some wording that says that normal locales
should always implement a fallback of last resort (and that locales that
do not do so should have a special name including '@', to make it
obvious).  It is not standardized yet, but worth thinking about.

http://austingroupbugs.net/view.php?id=938
http://austingroupbugs.net/view.php?id=963

The intent of that wording is that if ignoring punctuation could cause
two strings to otherwise compare equal, the fallback of a total ordering
on all characters means that the final result of strcoll() will not be 0
unless the two strings are identical.
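
As a rough illustration of that intent (not the proposed wording itself,
and not how any particular libc implements it), a caller could layer the
fallback on top of the existing collation function.  The name
wcscoll_total is hypothetical, and using wcscmp as the tie-breaker is an
assumption; the proposals leave the choice of total ordering to the
locale:

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical helper, not part of POSIX or Cygwin: collate with the
   current locale first, and only when that reports a tie fall back to
   a total ordering on the wide characters themselves.  */
int
wcscoll_total (const wchar_t *a, const wchar_t *b)
{
  assert (a != NULL && b != NULL);

  int res = wcscoll (a, b);
  if (res != 0)
    return res;

  /* The locale considers the strings equivalent (for example because
     it ignores punctuation).  wcscmp returns 0 only when the strings
     are identical, so the combined result is 0 only in that case.  */
  return wcscmp (a, b);
}
```

With a wrapper like this, "11" and "1.1" would no longer compare equal
in a locale that ignores punctuation; which of the two sorts first then
depends on the tie-breaker chosen.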

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
