Created attachment 9161
Patch to correct the sorting order of 0D36 and 0D37

The Malayalam characters ശ (U+0D36) and ഷ (U+0D37) should sort in the order of their Unicode code points. In the master version of glibc the order is swapped: ശ (U+0D36) sorts after ഷ (U+0D37). This is a bug in the patch I submitted a long time ago (Bug 12541).

$ LANG=ml_IN.UTF-8 sort ~/sort.txt
വ
ഷ
ശ
സ
ഹ

This is wrong. With the attached patch, I get the correct order:

$ LANG=ml_IN.UTF-8 sort ~/sort.txt
വ
ശ
ഷ
സ
ഹ

The mistake probably happened because of the confusing naming of Unicode characters that have very similar pronunciation.
Hi Santhosh :) Patch looks good to me. Do you have any reference for this change?
References:
(1) Unicode Malayalam code chart, http://www.unicode.org/charts/PDF/U0D00.pdf. This patch just follows the code point order; per the language rules, collation of these letters does not differ from code point order.
(2) https://en.wikipedia.org/wiki/Malayalam_script#Consonants
(3) Samkshiptha Sabdatharavali Malayalam Dictionary, 2011, DC Books
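The code point order that reference (1) relies on can be checked without glibc. The following is a minimal sketch, not part of the patch: it assumes only that Python compares strings by code point, so the result does not depend on any installed locale. The variable names are illustrative.

```python
# The five consonants from the report, in the wrong order that
# glibc master produced before the fix (U+0D37 ahead of U+0D36).
letters = ["\u0D35", "\u0D37", "\u0D36", "\u0D38", "\u0D39"]

# Python's default string comparison is by code point, so sorted()
# yields plain code point order without consulting locale data.
expected = sorted(letters)

for ch in expected:
    print(f"U+{ord(ch):04X} {ch}")
```

This prints the letters from U+0D35 through U+0D39 in ascending order, which is the order the patched ml_IN locale is expected to give for this input.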
iiuc, the iso tables are generated now ...
Collation tables are not automatically generated. Character database is generated from Unicode data.
ping. The references are given above. Can somebody look into this?
I have tested this patch by applying it to the master branch, and it gives the expected results. Reviewed+

Santhosh, did you submit this patch to the libc-alpha mailing list? https://sourceware.org/ml/libc-alpha/
The patch was submitted to the libc-alpha mailing list: https://sourceware.org/ml/libc-alpha/2017-04/msg00305.html
Changelog: BZ 19919: Corrected the Malayalam sorting order of 0D36 and 0D37 (I hope the format is correct, feel free to edit as required)
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, master has been updated
       via  b05eca0e1d96aecb25516287913c54bbb93d3d92 (commit)
      from  8458956a6219b6dbd97b0e9e97caf742f3c6342e (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=b05eca0e1d96aecb25516287913c54bbb93d3d92

commit b05eca0e1d96aecb25516287913c54bbb93d3d92
Author: Santhosh Thottingal <santhosh.thottingal@gmail.com>
Date:   Sun Jun 11 10:08:37 2017 -0400

    Correct collation rules for Malayalam.

    [BZ #19922]
    * locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF.
    [BZ #19919]
    * locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.

-----------------------------------------------------------------------

Summary of changes:
 localedata/ChangeLog                  |    8 ++++++++
 localedata/locales/iso14651_t1_common |   26 ++++++++++++++++++++++----
 2 files changed, 30 insertions(+), 4 deletions(-)
Will be fixed in 2.26.
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, release/2.25/master has been updated
       via  f92b1025980a939645b1ec7e550411a05ac7c76f (commit)
      from  b8d2e394a2900cef5bbbe0503f15960f64a943b1 (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f92b1025980a939645b1ec7e550411a05ac7c76f

commit f92b1025980a939645b1ec7e550411a05ac7c76f
Author: Santhosh Thottingal <santhosh.thottingal@gmail.com>
Date:   Sun Jun 11 10:08:37 2017 -0400

    Correct collation rules for Malayalam.

    [BZ #19922]
    * locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF.
    [BZ #19919]
    * locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.

-----------------------------------------------------------------------

Summary of changes:
 localedata/ChangeLog                  |    8 ++++++++
 localedata/locales/iso14651_t1_common |   26 ++++++++++++++++++++++----
 2 files changed, 30 insertions(+), 4 deletions(-)
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, release/2.23/master has been updated
       via  9f172a30acdd64e140bedd438458830fa8c27ad8 (commit)
      from  0be74c5c7cb239e4884d1ee0fd48c746a0bd1a65 (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9f172a30acdd64e140bedd438458830fa8c27ad8

commit 9f172a30acdd64e140bedd438458830fa8c27ad8
Author: Santhosh Thottingal <santhosh.thottingal@gmail.com>
Date:   Sun Jun 11 10:08:37 2017 -0400

    Correct collation rules for Malayalam.

    [BZ #19922]
    * locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF.
    [BZ #19919]
    * locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.

-----------------------------------------------------------------------

Summary of changes:
 localedata/ChangeLog                  |    8 ++++++++
 localedata/locales/iso14651_t1_common |   26 ++++++++++++++++++++++----
 2 files changed, 30 insertions(+), 4 deletions(-)
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, release/2.24/master has been updated
       via  4e291e7c5277af2eec279e2047653f04fad483e1 (commit)
      from  0505a57d4381f2baaeed73e96b161d0fb313fa5c (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4e291e7c5277af2eec279e2047653f04fad483e1

commit 4e291e7c5277af2eec279e2047653f04fad483e1
Author: Santhosh Thottingal <santhosh.thottingal@gmail.com>
Date:   Sun Jun 11 10:08:37 2017 -0400

    Correct collation rules for Malayalam.

    [BZ #19922]
    * locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF.
    [BZ #19919]
    * locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.

-----------------------------------------------------------------------

Summary of changes:
 localedata/ChangeLog                  |    8 ++++++++
 localedata/locales/iso14651_t1_common |   26 ++++++++++++++++++++++----
 2 files changed, 30 insertions(+), 4 deletions(-)
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, linaro/2.23/master has been updated
       via  ceeb0740ed04c48170f9f6f15fef55637ad84e1b (commit)
       via  24adabbe17d24b9cf4f42d81f546359f72515ce3 (commit)
       via  8224a992e15369224860c891e7367e6ab66f6fde (commit)
       via  ed739093d19855c71b3f38bfed7d318340b22612 (commit)
       via  fec2dc4089f6688e0f4ffc962700a0858f08bef9 (commit)
      from  6636d6f4fe5e6905bfe463874b4f958ed1ae4a84 (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ceeb0740ed04c48170f9f6f15fef55637ad84e1b

commit ceeb0740ed04c48170f9f6f15fef55637ad84e1b
Author: Siddhesh Poyarekar <siddhesh@sourceware.org>
Date:   Tue Mar 7 20:52:04 2017 +0530

    Ignore and remove LD_HWCAP_MASK for AT_SECURE programs (bug #21209)

    The LD_HWCAP_MASK environment variable may alter the selection of
    function variants for some architectures. For AT_SECURE process it
    means that if an outdated routine has a bug that would otherwise not
    affect newer platforms by default, LD_HWCAP_MASK will allow that bug
    to be exploited.

    To be on the safe side, ignore and disable LD_HWCAP_MASK for setuid
    binaries.

    [BZ #21209]
    * elf/rtld.c (process_envvars): Ignore LD_HWCAP_MASK for
    AT_SECURE processes.
    * sysdeps/generic/unsecvars.h: Add LD_HWCAP_MASK.

    (cherry picked from commit 1c1243b6fc33c029488add276e56570a07803bfd)

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=24adabbe17d24b9cf4f42d81f546359f72515ce3

commit 24adabbe17d24b9cf4f42d81f546359f72515ce3
Author: Florian Weimer <fweimer@redhat.com>
Date:   Mon Jun 19 22:32:12 2017 +0200

    ld.so: Reject overly long LD_AUDIT path elements

    Also only process the last LD_AUDIT entry.

    (cherry picked from commit 81b82fb966ffbd94353f793ad17116c6088dedd9)

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8224a992e15369224860c891e7367e6ab66f6fde

commit 8224a992e15369224860c891e7367e6ab66f6fde
Author: Florian Weimer <fweimer@redhat.com>
Date:   Mon Jun 19 22:31:04 2017 +0200

    ld.so: Reject overly long LD_PRELOAD path elements

    (cherry picked from commit 6d0ba622891bed9d8394eef1935add53003b12e8)

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ed739093d19855c71b3f38bfed7d318340b22612

commit ed739093d19855c71b3f38bfed7d318340b22612
Author: Florian Weimer <fweimer@redhat.com>
Date:   Mon Jun 19 18:34:53 2017 +0200

    CVE-2017-1000366: Ignore LD_LIBRARY_PATH for AT_SECURE=1 programs [BZ #21624]

    LD_LIBRARY_PATH can only be used to reorder system search paths, which
    is not useful functionality.

    This makes an exploitable unbounded alloca in _dl_init_paths unreachable
    for AT_SECURE=1 programs.

    (cherry picked from commit f6110a8fee2ca36f8e2d2abecf3cba9fa7b8ea7d)

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fec2dc4089f6688e0f4ffc962700a0858f08bef9

commit fec2dc4089f6688e0f4ffc962700a0858f08bef9
Author: Santhosh Thottingal <santhosh.thottingal@gmail.com>
Date:   Sun Jun 11 10:08:37 2017 -0400

    Correct collation rules for Malayalam.

    [BZ #19922]
    * locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF.
    [BZ #19919]
    * locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                             |   32 ++++++
 NEWS                                  |    2 +
 elf/rtld.c                            |  198 +++++++++++++++++++++++++++------
 localedata/ChangeLog                  |    8 ++
 localedata/locales/iso14651_t1_common |   26 ++++-
 sysdeps/generic/unsecvars.h           |    1 +
 6 files changed, 230 insertions(+), 37 deletions(-)