This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
On 25.11.2014 21:36, Ondřej Bílka wrote:
> On Tue, Nov 25, 2014 at 09:04:04PM +0100, Leonhard Holz wrote:
>> On 24.11.2014 00:47, Ondřej Bílka wrote:
>>> On Sun, Nov 23, 2014 at 11:52:06PM +0100, Leonhard Holz wrote:
>>>> Hi Ondřej, as far as I understood, the current strcoll
>>>> implementation scans both strings for collation sequences and
>>>> compares the weights of them, whereby a collation sequence can be
>>>> multiple bytes long. So whatever strcmp_l returns as index, you
>>>> would need a general way of finding the start of the collation
>>>> sequence this index is in. Unfortunately I cannot tell if or how
>>>> this can be done.
>>>
>>> As I wrote below you do not have to do that. Just precompute a table
>>> that is zero for characters that are part of some collation sequence
>>> and use the old method when one of the compared characters is in
>>> that table.
>>
>> Ok, I understand the idea and it would be great if it worked. BTW do
>> you know how UTF-8 chars above 7F are handled?
>
> A UTF-8 char consists of a starting byte larger than 0xbf followed by
> characters in the 0x80-0xbf range, see http://en.wikipedia.org/wiki/UTF-8

Sorry for the confusion. The question was meant to ask how the algorithm handles them, e.g. what to do when strcmp stops at a char with value 0x81.
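
To make the 0x81 case concrete, here is a minimal sketch in C of one possible way to deal with a mismatch that lands on a continuation byte: step back to the start byte of the UTF-8 character that contains it. The helper names (utf8_is_continuation, first_diff_index, utf8_char_start) are hypothetical and are not taken from glibc or from this thread, and the sketch assumes valid UTF-8 input. It only finds the start of a character; since a collation sequence can span several characters, it does not by itself answer the question above.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if b is a UTF-8 continuation byte,
   i.e. has the bit pattern 10xxxxxx (0x80-0xbf). */
static int utf8_is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper: index of the first byte at which a and b differ,
   or the index of the shared terminating NUL if the strings are equal.
   This stands in for "the index where strcmp stops". */
static size_t first_diff_index(const char *a, const char *b)
{
    size_t i = 0;
    while (a[i] != '\0' && a[i] == b[i])
        i++;
    return i;
}

/* Hypothetical helper: move idx back to the start byte of the UTF-8
   character that contains s[idx]. Assumes s is valid UTF-8; on ASCII
   or start bytes it returns idx unchanged. */
static size_t utf8_char_start(const char *s, size_t idx)
{
    while (idx > 0 && utf8_is_continuation((unsigned char)s[idx]))
        idx--;
    return idx;
}
```

For example, "a\xC3\x81" and "a\xC3\xA9" first differ at index 2, where the first string holds 0x81; backing up gives index 1, the 0xC3 start byte shared by both characters.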