This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
On 24.11.2014 00:47, Ondřej Bílka wrote:
On Sun, Nov 23, 2014 at 11:52:06PM +0100, Leonhard Holz wrote:
Hi Ondřej, as far as I understood, the current strcoll implementation scans both strings for collation sequences and compares their weights, whereby a collation sequence can be multiple bytes long. So whatever index strcmp_l returns, you would need a general way of finding the start of the collation sequence that index falls in. Unfortunately I cannot tell if or how this can be done.
As I wrote below, you do not have to do that. Just precompute a table that is zero for characters that are part of some collation sequence, and use the old method when one of the compared characters is marked in that table.
Ok, I understand the idea and it would be great if it worked. BTW do you know how UTF-8 chars above 7F are handled?
From a performance perspective these are not a problem, as they should be infrequent enough. Ignored ones are worse, as they could make otherwise identical long prefixes different.
BTW, I have implemented a benchmark for strcoll that is not yet pushed because I didn't manage to patch the bench-tests Makefile to generate the additionally needed locales (https://sourceware.org/ml/libc-alpha/2014-10/msg00431.html).
I can send you the test files if you like.
Leonhard
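For readers following the thread, the table idea discussed above could look roughly like the sketch below. It is only an illustration, not glibc code: `weight_tab`, `init_weight_tab`, `slow_compare`, `table_coll` and `slow_calls` are hypothetical names, the table contents are a toy setup, and plain `strcoll` stands in for the "old method". It also assumes a single-level ordering; real locales compare over several weight levels, so the first differing byte does not decide the result in general.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table, not glibc's actual data structure.
   weight_tab[c] == 0: byte c may belong to a multi-byte collation
   sequence (or is ignored), so the fast path must not decide on it.
   weight_tab[c] != 0: byte c always collates on its own, with this weight. */
static unsigned char weight_tab[256];

/* Counts how often the fallback was taken (for illustration only). */
static int slow_calls;

/* Toy setup (an assumption): printable ASCII collates by byte value;
   every other byte, including all UTF-8 bytes above 0x7F, stays zero. */
static void init_weight_tab(void)
{
    memset(weight_tab, 0, sizeof weight_tab);
    for (int c = 0x20; c < 0x7f; c++)
        weight_tab[c] = (unsigned char) c;
}

/* Stand-in for the existing sequence-aware comparison ("old method"). */
static int slow_compare(const char *a, const char *b)
{
    slow_calls++;
    return strcoll(a, b);
}

/* Compare two strings, deciding on the first differing byte when the
   table says both bytes are safe, and falling back otherwise. */
static int table_coll(const char *a, const char *b)
{
    size_t i = 0;

    /* Skip the common prefix, as a strcmp-like scan would. */
    while (a[i] != '\0' && a[i] == b[i])
        i++;

    unsigned char ca = (unsigned char) a[i];
    unsigned char cb = (unsigned char) b[i];
    if (ca == cb)
        return 0;               /* both strings ended: byte-identical */

    /* A zero entry for either differing byte means "use the old method".
       The terminating NUL is allowed and sorts before every safe byte. */
    if ((ca != '\0' && weight_tab[ca] == 0)
        || (cb != '\0' && weight_tab[cb] == 0))
        return slow_compare(a, b);

    int wa = weight_tab[ca];
    int wb = weight_tab[cb];
    return (wa > wb) - (wa < wb);
}
```

After calling `init_weight_tab()` once, `table_coll("apple", "apricot")` is decided on the fast path, while two strings that first differ in a byte above 0x7F go through `slow_compare`. Ignored characters would also need a zero entry, since they can make otherwise identical prefixes differ bytewise.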