Bug 14095 - Review / update collation data from Unicode / ISO 14651
Summary: Review / update collation data from Unicode / ISO 14651
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: localedata
Version: 2.15
Importance: P2 normal
Target Milestone: 2.28
Assignee: Mike FABIAN
URL:
Keywords:
Depends on:
Blocks: 16052
Reported: 2012-05-10 20:32 UTC by Joseph Myers
Modified: 2018-03-31 12:32 UTC (History)
CC List: 7 users

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Description Joseph Myers 2012-05-10 20:32:11 UTC
The localedata/locales/iso14651_t1_* files are probably, from their names, originally based on some version of ISO 14651 collation data.  They should be updated if possible to be based on the current Unicode collation data and algorithms.

http://www.unicode.org/reports/tr10/

Since there have been a lot of changes to these files since the original addition in

2000-05-24  Ulrich Drepper  <drepper@redhat.com>

        * locales/iso14651_t1: New file.

it's likely there will be a lot of work to understand how the files relate to ISO 14651 and what local changes are still relevant.
Comment 1 Paul Wise 2015-06-30 03:52:47 UTC
Why did glibc fork the Unicode collation data instead of sending changes upstream?
Comment 2 joseph@codesourcery.com 2015-06-30 11:14:35 UTC
The people involved in getting the collation data to its present state are 
mostly no longer involved in glibc development, so if you want an 
authoritative answer you'll need to do a lot of work tracking them down.  
My hypothesis would be that each person submitting a change generally had 
their own itch to scratch (supporting collation for their own language 
better, with no interest in a more general update to a newer version of 
ISO 14651, if a newer version even existed at that time, or insufficient 
time / expertise / resources to get involved in their national standards 
committees parallel to JTC1/SC2/WG2, if ISO 14651 did not support their 
language then) and that each person accepting such a change decided that 
it was better to have the incremental improvement than to have no 
collation support for that language for the indefinite future until 
someone appeared to contribute a more thorough update.

We don't, however, need to know people's motivations for making 
incremental changes rather than larger bulk updates.  The questions that 
are actually relevant for updating the data now are more along the lines 
of: for the original addition of the ISO 14651 data, what differences are 
there from the relevant version of ISO 14651?  Do those differences relate 
to conceptual differences between the POSIX collation model and the ISO 
14651 collation model, or do they reflect different choices for how to 
collate particular characters?  If they reflect different choices, do we 
still agree that those choices are appropriate for the contexts in which 
glibc locales are used, or, with hindsight, would the ISO 14651 choices 
now be better?  Where a change was made subsequently affecting existing 
characters, is the change still at variance with current ISO 14651, and do 
we think there is still a good reason for such a difference?  Where 
collation support for new characters was added, how does that support 
compare to the support, if any, for those characters in current ISO 14651, 
and are there any differences we think are deliberate and should be 
preserved?  Do any differences reflect cases where e.g. different national 
standards specify different collation for the same characters (or 
collation differs by context), and so individual locales may need to 
override the generic international version?

Yes, there is a lot of detailed, careful work involved in analysis of the 
history of the current collation data in order to produce a justified 
analysis of those questions with recommendations for how to use data from 
current ISO 14651.  Given the responsibility to users to avoid 
regressions, we need to understand what changes would be involved in such 
an update, and satisfy ourselves that they are good changes rather than 
regressions, as part of making such an update.  Contributors willing to 
help with that careful analysis are welcome.
Comment 3 Carlos O'Donell 2015-06-30 13:44:55 UTC
(In reply to joseph@codesourcery.com from comment #2)
> Yes, there is a lot of detailed, careful work involved in analysis of the 
> history of the current collation data in order to produce a justified 
> analysis of those questions with recommendations for how to use data from 
> current ISO 14651.  Given the responsibility to users to avoid 
> regressions, we need to understand what changes would be involved in such 
> an update, and satisfy ourselves that they are good changes rather than 
> regressions, as part of making such an update.  Contributors willing to 
> help with that careful analysis are welcome.

I agree completely with Joseph.
Comment 4 keld@keldix.com 2015-06-30 15:29:40 UTC
(In reply to joseph@codesourcery.com from comment #2)

Well, I was the author of many of the collation specs for different
languages, and I am still around, and I have even joined glibc maintenance
just a few years ago.

The 14651 and POSIX models are the same, or rather 14651 is backwards
compatible with POSIX. We cannot say that we follow POSIX strictly,
because then we could not have working locales, as POSIX is not well
suited for ISO 10646 UCS. So we are not adhering to POSIX, but rather
to 14651.

The different locale collation data were designed to adhere to
14651, in an orthogonal way, just like 14651 was designed to be used.

I am willing to contribute with a look on the different issues.

Best regards
Keld
Comment 5 joseph@codesourcery.com 2015-06-30 16:03:54 UTC
On Tue, 30 Jun 2015, keld at keldix dot com wrote:

> I am willing to contribute with a look on the different issues.

That would be very helpful, thanks!  The first question would probably be 
where the original iso14651_t1 file (added in commit 
b0a3e2e6238f4846bc7a99145d2721b8d5b5ec31 in the history repository) came 
from; if we can reproduce it from old ISO 14651 data, we can hopefully 
build a corresponding file from current ISO 14651 data - and then start to 
understand, for all the changes made to the data over the past 15 years, 
which of them are still relevant and desirable given current ISO 14651 / 
Unicode data as a base, and what the right way is to handle those changes.
Comment 6 keld@keldix.com 2015-07-01 07:58:28 UTC
(In reply to joseph@codesourcery.com from comment #5)

It is my plan to work with the editor of 14651 on making the 14651
data directly useable with glibc. This is not currently the case
and we know it.

Keld
Comment 7 Mike Frysinger 2016-02-19 07:05:37 UTC
any update ?  we've got these shiny new unicode-gen/ python scripts for importing unicode data ...
Comment 8 joseph@codesourcery.com 2016-02-19 17:14:32 UTC
I expect reviewing the sources of and past changes to collation data, and 
writing suitable scripts to reproduce it from old upstream data / 
regenerate it from new upstream data, taking due account of any deliberate 
differences, to be substantially more work than the update of other data 
from Unicode was.
Comment 9 Mike FABIAN 2017-12-14 16:58:51 UTC
(In reply to joseph@codesourcery.com from comment #8)
> I expect reviewing the sources of and past changes to collation data, and 
> writing suitable scripts to reproduce it from old upstream data / 
> regenerate it from new upstream data, taking due account of any deliberate 
> differences, to be substantially more work than the update of other data 
> from Unicode was.

I am actually working on an update, but it is indeed not easy at all
and a lot of work.

https://www.iso.org/standard/68309.html

has a newer version, ISO/IEC 14651:2016,

downloadable from:

http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html

And one can download this:

http://standards.iso.org/ittf/PubliclyAvailableStandards/c068309_ISO_IEC_14651_2016_Electronic_inserts.zip

Which contains a file named
 
ISO14651_2015_TABLE1_en.txt

which can be used as an update for our 
localedata/locales/iso14651_t1_common file 

But the collation symbols in the new file have changed a lot
and many adaptations in LC_COLLATE in many of our locales
are necessary, many of them a bit complicated.

I think this is the right way to go though, and I made
good progress so far, so I am quite confident now that 
I can do this.
Comment 10 cvs-commit@gcc.gnu.org 2018-02-27 16:54:58 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  874c56d7979858bbb1bb1604c55769ad0ce7a072 (commit)
       via  159738548130d5ac4fe6178977e940ed5f8cfdc4 (commit)
       via  ce6636b06b67d6bb9b3d6927bf2a926b9b7478f5 (commit)
       via  ac3a3b4b0d561d776b60317d6a926050c8541655 (commit)
       via  770cbe147cf33580e05ba6de78993c3070c5c2f8 (commit)
       via  0fc355d9a7b3cc9d5e4190ce929e1eb4459ef0ea (commit)
       via  43f3893f4b5679cb9eb93300b18f7febd17e5239 (commit)
       via  df74ef786f9c87ce5404df3b68a91cb9d2c4c26f (commit)
       via  d5adfbadd47e6836a7ddae54fba9f88e2b3354db (commit)
       via  5f5a96109187b4bb4a10b62139ab1c7fe45f7c1d (commit)
       via  8a97e9002ffa807b49e1222e5a9d51ce7896f209 (commit)
       via  bbdd2fba7d36d8f03c919b34f95238d8cf248b47 (commit)
       via  1569e551aff088ed48e2694b07045256f3582271 (commit)
       via  9479b6d5e08eacce06c6ab60abc9b2f4eb8b71e4 (commit)
      from  93d260ddda87a124d3fbb9af400fa154cfd00b4b (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=874c56d7979858bbb1bb1604c55769ad0ce7a072

commit 874c56d7979858bbb1bb1604c55769ad0ce7a072
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Thu Dec 21 18:56:52 2017 +0100

    Remove the lines from cmn_TW.UTF-8.in which cannot work at the moment.
    
    See this bug https://sourceware.org/bugzilla/show_bug.cgi?id=22898
    
    These lines don’t yet work because of a glibc bug, not because of
    problems in the locale data. No matter what sorting rules one uses,
    these characters cannot be sorted at all at the moment.
    
    As soon as that bug is fixed, these lines should be added back to the
    test file.
    
    	* localedata/cmn_TW.UTF-8.in: Remove the lines which cannot
            be sorted correctly at the moment because of a bug.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=159738548130d5ac4fe6178977e940ed5f8cfdc4

commit 159738548130d5ac4fe6178977e940ed5f8cfdc4
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Mon Dec 11 18:26:22 2017 +0100

    Adapt collation in several locales to the new iso14651_t1_common file
    
    [BZ #22550] - es_ES locale (and other es_* locales): collation should
    treat ñ as a primary different character, sync the collation
    for Spanish with CLDR
    [BZ #21547] - Tibetan script collation broken (Dzongkha and Tibetan)
    
    	* localedata/Makefile: Add new test files.
    	* localedata/lv_LV.UTF-8.in: Adapt test file to new collation order.
    	* localedata/sv_SE.ISO-8859-1.in: Adapt test file to new collation order.
    	* localedata/uk_UA.UTF-8.in: Adapt test file to new collation order.
    	* localedata/am_ET.UTF-8.in: New test file.
    	* localedata/az_AZ.UTF-8.in: Likewise.
    	* localedata/be_BY.UTF-8.in: Likewise.
    	* localedata/ber_DZ.UTF-8.in: Likewise.
    	* localedata/ber_MA.UTF-8.in: Likewise.
    	* localedata/bg_BG.UTF-8.in: Likewise.
    	* localedata/br_FR.UTF-8.in: Likewise.
    	* localedata/cmn_TW.UTF-8.in: Likewise.
    	* localedata/crh_UA.UTF-8.in: Likewise.
    	* localedata/csb_PL.UTF-8.in: Likewise.
    	* localedata/cv_RU.UTF-8.in: Likewise.
    	* localedata/cy_GB.UTF-8.in: Likewise.
    	* localedata/dz_BT.UTF-8.in: Likewise.
    	* localedata/eo.UTF-8.in: Likewise.
    	* localedata/es_ES.UTF-8.in: Likewise.
    	* localedata/fa_IR.UTF-8.in: Likewise.
    	* localedata/fi_FI.UTF-8.in: Likewise.
    	* localedata/fil_PH.UTF-8.in: Likewise.
    	* localedata/fur_IT.UTF-8.in: Likewise.
    	* localedata/gez_ER.UTF-8@abegede.in: Likewise.
    	* localedata/ha_NG.UTF-8.in: Likewise.
    	* localedata/ig_NG.UTF-8.in: Likewise.
    	* localedata/ik_CA.UTF-8.in: Likewise.
    	* localedata/kk_KZ.UTF-8.in: Likewise.
    	* localedata/ku_TR.UTF-8.in: Likewise.
    	* localedata/ky_KG.UTF-8.in: Likewise.
    	* localedata/ln_CD.UTF-8.in: Likewise.
    	* localedata/mi_NZ.UTF-8.in: Likewise.
    	* localedata/ml_IN.UTF-8.in: Likewise.
    	* localedata/mn_MN.UTF-8.in: Likewise.
    	* localedata/mr_IN.UTF-8.in: Likewise.
    	* localedata/mt_MT.UTF-8.in: Likewise.
    	* localedata/nb_NO.UTF-8.in: Likewise.
    	* localedata/om_KE.UTF-8.in: Likewise.
    	* localedata/os_RU.UTF-8.in: Likewise.
    	* localedata/ps_AF.UTF-8.in: Likewise.
    	* localedata/ro_RO.UTF-8.in: Likewise.
    	* localedata/ru_RU.UTF-8.in: Likewise.
    	* localedata/sc_IT.UTF-8.in: Likewise.
    	* localedata/se_NO.UTF-8.in: Likewise.
    	* localedata/sq_AL.UTF-8.in: Likewise.
    	* localedata/sv_SE.UTF-8.in: Likewise.
    	* localedata/szl_PL.UTF-8.in: Likewise.
    	* localedata/tg_TJ.UTF-8.in: Likewise.
    	* localedata/tk_TM.UTF-8.in: Likewise.
    	* localedata/tt_RU.UTF-8.in: Likewise.
    	* localedata/tt_RU.UTF-8@iqtelif.in: Likewise.
    	* localedata/ug_CN.UTF-8.in: Likewise.
    	* localedata/uz_UZ.UTF-8.in: Likewise.
    	* localedata/vi_VN.UTF-8.in: Likewise.
    	* localedata/yi_US.UTF-8.in: Likewise.
    	* localedata/yo_NG.UTF-8.in: Likewise.
    	* localedata/zh_CN.UTF-8.in: Likewise.
    	* localedata/locales/am_ET: Adapt collation rules to new iso14651_t1_common
            file and fix bugs in the collation.
    	* localedata/locales/az_AZ: Likewise.
    	* localedata/locales/be_BY: Likewise.
    	* localedata/locales/ber_DZ: Likewise.
    	* localedata/locales/ber_MA: Likewise.
    	* localedata/locales/bg_BG: Likewise.
    	* localedata/locales/br_FR: Likewise.
    	* localedata/locales/br_FR@euro: Likewise.
    	* localedata/locales/ca_ES: Likewise.
    	* localedata/locales/cns11643_stroke: Likewise.
    	* localedata/locales/crh_UA: Likewise.
    	* localedata/locales/cs_CZ: Likewise.
    	* localedata/locales/csb_PL: Likewise.
    	* localedata/locales/cv_RU: Likewise.
    	* localedata/locales/cy_GB: Likewise.
    	* localedata/locales/da_DK: Likewise.
    	* localedata/locales/dz_BT: Likewise.
    	* localedata/locales/en_CA: Likewise.
    	* localedata/locales/eo: Likewise.
    	* localedata/locales/es_CU: Likewise.
    	* localedata/locales/es_EC: Likewise.
    	* localedata/locales/es_ES: Likewise.
    	* localedata/locales/es_US: Likewise.
    	* localedata/locales/et_EE: Likewise.
    	* localedata/locales/fa_IR: Likewise.
    	* localedata/locales/fi_FI: Likewise.
    	* localedata/locales/fil_PH: Likewise.
    	* localedata/locales/fur_IT: Likewise.
    	* localedata/locales/gez_ER@abegede: Likewise.
    	* localedata/locales/ha_NG: Likewise.
    	* localedata/locales/hr_HR: Likewise.
    	* localedata/locales/hsb_DE: Likewise.
    	* localedata/locales/hu_HU: Likewise.
    	* localedata/locales/ig_NG: Likewise.
    	* localedata/locales/ik_CA: Likewise.
    	* localedata/locales/is_IS: Likewise.
    	* localedata/locales/iso14651_t1_pinyin: Likewise.
    	* localedata/locales/kk_KZ: Likewise.
    	* localedata/locales/ku_TR: Likewise.
    	* localedata/locales/ky_KG: Likewise.
    	* localedata/locales/ln_CD: Likewise.
    	* localedata/locales/lt_LT: Likewise.
    	* localedata/locales/lv_LV: Likewise.
    	* localedata/locales/mi_NZ: Likewise.
    	* localedata/locales/ml_IN: Likewise.
    	* localedata/locales/mn_MN: Likewise.
    	* localedata/locales/mr_IN: Likewise.
    	* localedata/locales/mt_MT: Likewise.
    	* localedata/locales/nb_NO: Likewise.
    	* localedata/locales/om_KE: Likewise.
    	* localedata/locales/os_RU: Likewise.
    	* localedata/locales/pl_PL: Likewise.
    	* localedata/locales/ps_AF: Likewise.
    	* localedata/locales/ro_RO: Likewise.
    	* localedata/locales/ru_RU: Likewise.
    	* localedata/locales/ru_UA: Likewise.
    	* localedata/locales/sc_IT: Likewise.
    	* localedata/locales/se_NO: Likewise.
    	* localedata/locales/si_LK: Likewise.
    	* localedata/locales/sq_AL: Likewise.
    	* localedata/locales/sv_FI: Likewise.
    	* localedata/locales/sv_FI@euro: Likewise.
    	* localedata/locales/sv_SE: Likewise.
    	* localedata/locales/szl_PL: Likewise.
    	* localedata/locales/tg_TJ: Likewise.
    	* localedata/locales/ti_ER: Likewise.
    	* localedata/locales/tk_TM: Likewise.
    	* localedata/locales/tl_PH: Likewise.
    	* localedata/locales/tr_TR: Likewise.
    	* localedata/locales/tt_RU: Likewise.
    	* localedata/locales/tt_RU@iqtelif: Likewise.
    	* localedata/locales/ug_CN: Likewise.
    	* localedata/locales/uk_UA: Likewise.
    	* localedata/locales/uz_UZ: Likewise.
    	* localedata/locales/uz_UZ@cyrillic: Likewise.
    	* localedata/locales/vi_VN: Likewise.
    	* localedata/locales/yi_US: Likewise.
    	* localedata/locales/yo_NG: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ce6636b06b67d6bb9b3d6927bf2a926b9b7478f5

commit ce6636b06b67d6bb9b3d6927bf2a926b9b7478f5
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Mon Jan 1 15:33:50 2018 +0100

    Improve gen-locales.mk and gen-locale.sh to make test files with @ options work
    
    Without this, adding collation test files like localedata/gez_ER.UTF-8@abegede.in
    does not work for locales which contain @ modifiers.
    
    	* gen-locales.mk: Make test files which contain @ modifiers in their
            name work.
    	* localedata/gen-locale.sh: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ac3a3b4b0d561d776b60317d6a926050c8541655

commit ac3a3b4b0d561d776b60317d6a926050c8541655
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 23 17:29:36 2018 +0100

    Fix test cases tst-fnmatch and tst-regexloc for the new iso14651_t1_common file.
    
    See:
    
    http://pubs.opengroup.org/onlinepubs/7908799/xbd/re.html
    
    > A range expression represents the set of collating elements that fall
    > between two elements in the current collation sequence,
    > inclusively. It is expressed as the starting point and the ending
    > point separated by a hyphen (-).
    >
    > Range expressions must not be used in portable applications because
    > their behaviour is dependent on the collating sequence. Ranges will be
    > treated according to the current collating sequence, and include such
    > characters that fall within the range based on that collating
    > sequence, regardless of character values. This, however, means that
    > the interpretation will differ depending on collating sequence. If,
    > for instance, one collating sequence defines ä as a variant of a,
    > while another defines it as a letter following z, then the expression
    > [ä-z] is valid in the first language and invalid in the second.
    
    Therefore, using [a-z] does not make much sense except in the C/POSIX locale.
    The new iso14651_t1_common lists upper case and  lower case Latin characters
    in a different order than the old one which causes surprising results
    for example in the de_DE locale: [a-z] now includes A because A comes
    after a in iso14651_t1_common but does not include Z because that comes
    after z in iso14651_t1_common.
    
    	* posix/tst-fnmatch.input: Fix results for range expressions
            for non C locales.
    	* posix/tst-regexloc.c: Do not use a range expression for
            de_DE.ISO-8859-1 locale.
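
The effect described in the commit message above can be illustrated with a toy model. This is only a sketch: it assumes a collation sequence is a plain ordered list of characters, which is far simpler than glibc's real collation tables, and both sequences below (`c_like`, `interleaved`) as well as the helper names are hypothetical.

```python
import string

def in_range(ch, start, end, sequence):
    """Return True if ch lies between start and end (inclusive) in sequence."""
    pos = {c: i for i, c in enumerate(sequence)}
    return pos[start] <= pos[ch] <= pos[end]

# Hypothetical code-point-like order: A..Z followed by a..z.
c_like = list(string.ascii_uppercase + string.ascii_lowercase)

# Hypothetical order where each upper-case letter directly follows its
# lower-case counterpart: a A b B ... z Z.
interleaved = [c for lo in string.ascii_lowercase for c in (lo, lo.upper())]

def matched(sequence):
    """Characters of the sequence that a range [a-z] would cover."""
    return "".join(c for c in sequence if in_range(c, "a", "z", sequence))

print(matched(c_like))
print(matched(interleaved))
```

With the interleaved order, the range from a to z picks up A through Y but stops before Z, which is the kind of surprise the commit message reports for de_DE.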

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=770cbe147cf33580e05ba6de78993c3070c5c2f8

commit 770cbe147cf33580e05ba6de78993c3070c5c2f8
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Fri Dec 15 07:19:45 2017 +0100

    Fix posix/bug-regex5.c test case, adapt to iso14651_t1_common update
    
    This test case tests how many collating elements are defined in
    da_DK.ISO-8859-1 locale. The da_DK locale source defines 4:
    
    collating-element <A-A> from "<U0041><U0041>"
    collating-element <A-a> from "<U0041><U0061>"
    collating-element <a-A> from "<U0061><U0041>"
    collating-element <a-a> from "<U0061><U0061>"
    
    The new iso14651_t1_common file defines more collating elements, two
    of them are in the ISO-8859-1 range:
    
    collating-element <U004C_00B7> from "<U004C><U00B7>" % decomposition of LATIN CAPITAL LETTER L WITH MIDDLE DOT
    collating-element <U006C_00B7> from "<U006C><U00B7>" % decomposition of LATIN SMALL LETTER L WITH MIDDLE DOT
    
    So the total count is now 6 instead of 4.
    
    	* posix/bug-regex5.c: Fix test case because with the new
            iso14651_t1_common file, the da_DK locale now has 6 collating elements
            in the ISO-8859-1 range instead of 4 with the old iso14651_t1_common
            file.
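
The count of 6 can be rechecked by hand from the six collating-element lines quoted above. The sketch below only redoes that arithmetic on the locale source text; it is not how posix/bug-regex5.c obtains its number. The last line of `SOURCE` is a made-up entry outside ISO-8859-1, added purely to show the filter, and the function name is hypothetical.

```python
import re

SOURCE = """\
collating-element <A-A> from "<U0041><U0041>"
collating-element <A-a> from "<U0041><U0061>"
collating-element <a-A> from "<U0061><U0041>"
collating-element <a-a> from "<U0061><U0061>"
collating-element <U004C_00B7> from "<U004C><U00B7>" % decomposition of LATIN CAPITAL LETTER L WITH MIDDLE DOT
collating-element <U006C_00B7> from "<U006C><U00B7>" % decomposition of LATIN SMALL LETTER L WITH MIDDLE DOT
collating-element <X-HYPOTHETICAL> from "<U0141><U0041>"
"""

LINE = re.compile(r'^collating-element\s+<[^>]+>\s+from\s+"((?:<U[0-9A-F]+>)+)"')
CODEPOINT = re.compile(r"<U([0-9A-F]+)>")

def latin1_collating_elements(text):
    """Count collating elements whose code points all fit in ISO-8859-1."""
    count = 0
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        cps = [int(h, 16) for h in CODEPOINT.findall(m.group(1))]
        if all(cp <= 0xFF for cp in cps):
            count += 1
    return count

print(latin1_collating_elements(SOURCE))
```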

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0fc355d9a7b3cc9d5e4190ce929e1eb4459ef0ea

commit 0fc355d9a7b3cc9d5e4190ce929e1eb4459ef0ea
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Wed Dec 13 14:39:54 2017 +0100

    Collation order of @-. and space has changed in new iso14651_t1_common file, adapt test files
    
    	* localedata/da_DK.ISO-8859-1.in: In the new iso14651_t1_common file
            downloaded from ISO, the collation order of @-. and space has changed.
            Therefore, this test file needed to be adapted.
    	* localedata/fr_CA.UTF-8.in: Likewise.
    	* localedata/fr_FR.UTF-8.in: Likewise.
    	* localedata/uk_UA.UTF-8.in: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=43f3893f4b5679cb9eb93300b18f7febd17e5239

commit 43f3893f4b5679cb9eb93300b18f7febd17e5239
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Dec 12 14:39:34 2017 +0100

    Collation order of ȥ has changed in new iso14651_t1_common file, adapt test files
    
    	* localedata/cs_CZ.UTF-8.in: adapt this test file to the collation
            order of ȥ in the new iso14651_t1_common file.
    	* localedata/pl_PL.UTF-8.in: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=df74ef786f9c87ce5404df3b68a91cb9d2c4c26f

commit df74ef786f9c87ce5404df3b68a91cb9d2c4c26f
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 15:45:05 2018 +0100

    Add sections for various scripts to the iso14651_t1_common file
    
    	* localedata/locales/iso14651_t1_common: Add sections for various
    	scripts to the iso14651_t1_common file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=d5adfbadd47e6836a7ddae54fba9f88e2b3354db

commit d5adfbadd47e6836a7ddae54fba9f88e2b3354db
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Wed Jan 31 06:18:47 2018 +0100

    iso14651_t1_common: make the fourth level the codepoint for characters which are ignorable on all 4 levels
    
    Entries for characters which have “IGNORE” on all 4 levels like:
    
     <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)
    
    are changed into:
    
     <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429)
    
    i.e. putting the code point of the character into the fourth level
    instead of “IGNORE”. Without that change, all such characters
    would compare equal which would make a wcscoll test case fail.
    It is better to have a clearly defined sort order even for characters
    like this so it is good to use the code point as a tie-break.
    
    	* localedata/locales/iso14651_t1_common: Use the code point of a
            character in the fourth collation level instead of IGNORE for all
            entries which have IGNORE on all 4 levels.
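
A minimal sketch of the rewrite described above, assuming entries have the exact shape shown in the commit message (a `<U....>` symbol followed by four `IGNORE` levels). The regular expression and function name are illustrative; the source does not say which tool was actually used for the change.

```python
import re

# Match a code point symbol followed by IGNORE on all four levels.
ALL_IGNORE = re.compile(
    r"^(\s*)(<U[0-9A-F]{4,8}>)(\s+)IGNORE;IGNORE;IGNORE;IGNORE\b"
)

def add_codepoint_tiebreak(line):
    """Put the character's own symbol into the fourth level as a tie-break."""
    return ALL_IGNORE.sub(r"\1\2\3IGNORE;IGNORE;IGNORE;\2", line)

before = " <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)"
print(add_codepoint_tiebreak(before))
```

Lines that do not have IGNORE on all four levels are returned unchanged.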

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5f5a96109187b4bb4a10b62139ab1c7fe45f7c1d

commit 5f5a96109187b4bb4a10b62139ab1c7fe45f7c1d
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Mon Dec 11 20:00:24 2017 +0100

    Add convenience symbols like <AFTER-A>, <BEFORE-A> to iso14651_t1_common
    
    	* localedata/locales/iso14651_t1_common: Add some convenient collation
    	symbols like <AFTER-A>, <BEFORE-A> to make tailoring easier using
    	rules similar to those in CLDR.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8a97e9002ffa807b49e1222e5a9d51ce7896f209

commit 8a97e9002ffa807b49e1222e5a9d51ce7896f209
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 18:24:47 2018 +0100

    Fixing syntax errors after updating the iso14651_t1_common file
    
    	* localedata/locales/iso14651_t1_common: The new version of this
    	file downloaded from ISO contained several syntax errors which
    	are fixed by this patch.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=bbdd2fba7d36d8f03c919b34f95238d8cf248b47

commit bbdd2fba7d36d8f03c919b34f95238d8cf248b47
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 18:07:39 2018 +0100

    iso14651_t1_common: <U\([0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]\)> → <U000\1>
    
    	* localedata/locales/iso14651_t1_common: replace all <U.....>
    	with <U000.....> because glibc understands only 4 digit or 8 digit

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1569e551aff088ed48e2694b07045256f3582271

commit 1569e551aff088ed48e2694b07045256f3582271
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 18:04:31 2018 +0100

    Necessary changes after updating the iso14651_t1_common file
    
    	* localedata/locales/iso14651_t1_common: Necessary changes
    	to make the file downloaded from ISO usable by glibc.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9479b6d5e08eacce06c6ab60abc9b2f4eb8b71e4

commit 9479b6d5e08eacce06c6ab60abc9b2f4eb8b71e4
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 17:59:00 2018 +0100

    Update iso14651_t1_common file to ISO14651_2016_TABLE1_en.txt [BZ #14095]
    
    [BZ #14095] - Review / update collation data from Unicode / ISO 14651
    
    File downloaded from:
    http://standards.iso.org/iso-iec/14651/ed-4/ISO14651_2016_TABLE1_en.txt
    
    Updating this file alone is not enough, there are problems in the new
    file which need to be fixed and the collation rules for many locales
    need to be adapted. This is done by the following patches.
    
    This update also fixes the problem that many characters are treated as
    identical when sorting because they were not yet in the old
    iso14651_t1_common file, see:
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1336308
    - Infinite (∞) and empty set (∅) are treated as if they were the same character by sort and uniq
    
    	[BZ #14095]
    	* localedata/locales/iso14651_t1_common: Update file to
    	latest version from ISO (ISO14651_2016_TABLE1_en.txt).
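
Why missing table entries make distinct characters look identical to sort and uniq can be shown with a toy model. It assumes, as a simplification, that a character absent from the collation table contributes no weight at all; the tables, the weights, and the helper names below are all hypothetical and unrelated to the real iso14651_t1_common contents.

```python
from itertools import groupby

# Hypothetical single-level weight tables.
OLD_TABLE = {"a": (1,), "b": (2,)}
NEW_TABLE = {**OLD_TABLE, "∞": (10,), "∅": (11,)}

def sort_key(s, table):
    """Concatenate the weights of all characters; unknown ones add nothing."""
    return tuple(w for ch in s for w in table.get(ch, ()))

def uniq(lines, table):
    """Collapse adjacent lines whose collation keys compare equal."""
    return [next(group) for _, group in groupby(lines, key=lambda s: sort_key(s, table))]

print(uniq(["∞", "∅"], OLD_TABLE))
print(uniq(["∞", "∅"], NEW_TABLE))
```

With the old table both lines get an empty key and collapse into one; once the characters have their own weights they stay distinct.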

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                             |  224 +
 gen-locales.mk                        |    4 +-
 localedata/Makefile                   |  185 +-
 localedata/am_ET.UTF-8.in             |  347 +
 localedata/az_AZ.UTF-8.in             |   73 +
 localedata/be_BY.UTF-8.in             |   16 +
 localedata/ber_DZ.UTF-8.in            |   50 +
 localedata/ber_MA.UTF-8.in            |   13 +
 localedata/bg_BG.UTF-8.in             |   57 +
 localedata/br_FR.UTF-8.in             |   15 +
 localedata/cmn_TW.UTF-8.in            |75649 ++++++++++++++++++++++++++
 localedata/crh_UA.UTF-8.in            |   50 +
 localedata/cs_CZ.UTF-8.in             |    4 +-
 localedata/csb_PL.UTF-8.in            |   70 +
 localedata/cv_RU.UTF-8.in             |   45 +
 localedata/cy_GB.UTF-8.in             |   72 +
 localedata/da_DK.ISO-8859-1.in        |    4 +-
 localedata/dz_BT.UTF-8.in             |  789 +
 localedata/eo.UTF-8.in                |   32 +
 localedata/es_ES.UTF-8.in             |   46 +
 localedata/fa_IR.UTF-8.in             |   71 +
 localedata/fi_FI.UTF-8.in             |  140 +
 localedata/fil_PH.UTF-8.in            |   16 +
 localedata/fr_CA.UTF-8.in             |    9 +-
 localedata/fr_FR.UTF-8.in             |    9 +-
 localedata/fur_IT.UTF-8.in            |   12 +
 localedata/gen-locale.sh              |    5 +-
 localedata/gez_ER.UTF-8@abegede.in    |  365 +
 localedata/ha_NG.UTF-8.in             |   47 +
 localedata/ig_NG.UTF-8.in             |   93 +
 localedata/ik_CA.UTF-8.in             |   60 +
 localedata/kk_KZ.UTF-8.in             |   40 +
 localedata/ku_TR.UTF-8.in             |   52 +
 localedata/ky_KG.UTF-8.in             |   72 +
 localedata/ln_CD.UTF-8.in             |   18 +
 localedata/locales/am_ET              |  549 +-
 localedata/locales/az_AZ              |  201 +-
 localedata/locales/be_BY              |   41 +-
 localedata/locales/ber_DZ             |  173 +-
 localedata/locales/ber_MA             |   42 +-
 localedata/locales/bg_BG              |  290 +-
 localedata/locales/br_FR              |   55 +-
 localedata/locales/br_FR@euro         |    3 +-
 localedata/locales/ca_ES              |   16 +-
 localedata/locales/cns11643_stroke    |    9 +-
 localedata/locales/crh_UA             |  111 +-
 localedata/locales/cs_CZ              |   69 +-
 localedata/locales/csb_PL             |   83 +-
 localedata/locales/cv_RU              |   75 +-
 localedata/locales/cy_GB              |  242 +-
 localedata/locales/da_DK              |  116 +-
 localedata/locales/dz_BT              | 2484 +-
 localedata/locales/en_CA              |    8 -
 localedata/locales/eo                 |   69 +-
 localedata/locales/es_CU              |    3 +-
 localedata/locales/es_EC              |    2 +-
 localedata/locales/es_ES              |   49 +-
 localedata/locales/es_US              |   56 +-
 localedata/locales/et_EE              |   31 +-
 localedata/locales/fa_IR              |  287 +-
 localedata/locales/fi_FI              |  175 +-
 localedata/locales/fil_PH             |   57 +-
 localedata/locales/fur_IT             |   15 +-
 localedata/locales/gez_ER@abegede     |  409 +-
 localedata/locales/ha_NG              |  165 +-
 localedata/locales/hr_HR              |   84 +-
 localedata/locales/hsb_DE             |   64 +-
 localedata/locales/hu_HU              |  298 +-
 localedata/locales/ig_NG              |  453 +-
 localedata/locales/ik_CA              |  153 +-
 localedata/locales/is_IS              |   72 +-
 localedata/locales/iso14651_t1_common |94998 +++++++++++++++++++++++++++++----
 localedata/locales/iso14651_t1_pinyin |    9 +-
 localedata/locales/kk_KZ              |  132 +-
 localedata/locales/ku_TR              |   87 +-
 localedata/locales/ky_KG              |   59 +-
 localedata/locales/ln_CD              |   47 +-
 localedata/locales/lt_LT              |   52 +-
 localedata/locales/lv_LV              |   67 +-
 localedata/locales/mi_NZ              |   43 +-
 localedata/locales/ml_IN              |  158 +-
 localedata/locales/mn_MN              |   34 +-
 localedata/locales/mr_IN              |   76 +-
 localedata/locales/mt_MT              |  144 +-
 localedata/locales/nan_TW@latin       |   33 +-
 localedata/locales/nb_NO              |  120 +-
 localedata/locales/om_KE              |  120 +-
 localedata/locales/os_RU              |   14 +-
 localedata/locales/pl_PL              |   66 +-
 localedata/locales/ps_AF              |  224 +-
 localedata/locales/ro_RO              |   99 +-
 localedata/locales/ru_RU              |   24 +-
 localedata/locales/ru_UA              |   16 +-
 localedata/locales/sc_IT              |   15 +-
 localedata/locales/se_NO              |  298 +-
 localedata/locales/si_LK              |   42 +
 localedata/locales/sq_AL              |  291 +-
 localedata/locales/sv_FI              |    2 +-
 localedata/locales/sv_FI@euro         |    2 +-
 localedata/locales/sv_SE              |  113 +-
 localedata/locales/szl_PL             |   86 +-
 localedata/locales/tg_TJ              |  106 +-
 localedata/locales/ti_ER              |    2 +
 localedata/locales/tk_TM              |  399 +-
 localedata/locales/tl_PH              |   31 +-
 localedata/locales/tr_TR              |   47 +-
 localedata/locales/tt_RU              |  244 +-
 localedata/locales/tt_RU@iqtelif      |   14 +-
 localedata/locales/ug_CN              |  196 +-
 localedata/locales/uk_UA              |  487 +-
 localedata/locales/uz_UZ              |  131 +-
 localedata/locales/uz_UZ@cyrillic     |   56 +-
 localedata/locales/vi_VN              |  242 +-
 localedata/locales/yi_US              |  125 +-
 localedata/locales/yo_NG              |  365 +-
 localedata/lv_LV.UTF-8.in             |    6 +-
 localedata/mi_NZ.UTF-8.in             |   37 +
 localedata/ml_IN.UTF-8.in             |   25 +
 localedata/mn_MN.UTF-8.in             |   15 +
 localedata/mr_IN.UTF-8.in             |    9 +
 localedata/mt_MT.UTF-8.in             |   39 +
 localedata/nan_TW.UTF-8@latin.in      |   11 +
 localedata/nb_NO.UTF-8.in             |   66 +
 localedata/om_KE.UTF-8.in             |   36 +
 localedata/os_RU.UTF-8.in             |    9 +
 localedata/pl_PL.UTF-8.in             |    4 +-
 localedata/ps_AF.UTF-8.in             |   61 +
 localedata/ro_RO.UTF-8.in             |   32 +
 localedata/ru_RU.UTF-8.in             |   15 +
 localedata/sc_IT.UTF-8.in             |   12 +
 localedata/se_NO.UTF-8.in             |  144 +
 localedata/sq_AL.UTF-8.in             |   82 +
 localedata/sv_SE.ISO-8859-1.in        |   10 +-
 localedata/sv_SE.UTF-8.in             |  107 +
 localedata/szl_PL.UTF-8.in            |   49 +
 localedata/tg_TJ.UTF-8.in             |  105 +
 localedata/tk_TM.UTF-8.in             |  213 +
 localedata/tt_RU.UTF-8.in             |  194 +
 localedata/tt_RU.UTF-8@iqtelif.in     |   53 +
 localedata/ug_CN.UTF-8.in             |   16 +
 localedata/uk_UA.UTF-8.in             |   18 +-
 localedata/uz_UZ.UTF-8.in             |   26 +
 localedata/vi_VN.UTF-8.in             |   45 +
 localedata/yi_US.UTF-8.in             |   39 +
 localedata/yo_NG.UTF-8.in             |   30 +
 localedata/zh_CN.UTF-8.in             |25498 +++++++++
 posix/bug-regex5.c                    |    4 +-
 posix/tst-fnmatch.input               |   58 +-
 posix/tst-regexloc.c                  |    4 +-
 149 files changed, 197751 insertions(+), 15000 deletions(-)
 create mode 100644 localedata/am_ET.UTF-8.in
 create mode 100644 localedata/az_AZ.UTF-8.in
 create mode 100644 localedata/be_BY.UTF-8.in
 create mode 100644 localedata/ber_DZ.UTF-8.in
 create mode 100644 localedata/ber_MA.UTF-8.in
 create mode 100644 localedata/bg_BG.UTF-8.in
 create mode 100644 localedata/br_FR.UTF-8.in
 create mode 100644 localedata/cmn_TW.UTF-8.in
 create mode 100644 localedata/crh_UA.UTF-8.in
 create mode 100644 localedata/csb_PL.UTF-8.in
 create mode 100644 localedata/cv_RU.UTF-8.in
 create mode 100644 localedata/cy_GB.UTF-8.in
 create mode 100644 localedata/dz_BT.UTF-8.in
 create mode 100644 localedata/eo.UTF-8.in
 create mode 100644 localedata/es_ES.UTF-8.in
 create mode 100644 localedata/fa_IR.UTF-8.in
 create mode 100644 localedata/fi_FI.UTF-8.in
 create mode 100644 localedata/fil_PH.UTF-8.in
 create mode 100644 localedata/fur_IT.UTF-8.in
 create mode 100644 localedata/gez_ER.UTF-8@abegede.in
 create mode 100644 localedata/ha_NG.UTF-8.in
 create mode 100644 localedata/ig_NG.UTF-8.in
 create mode 100644 localedata/ik_CA.UTF-8.in
 create mode 100644 localedata/kk_KZ.UTF-8.in
 create mode 100644 localedata/ku_TR.UTF-8.in
 create mode 100644 localedata/ky_KG.UTF-8.in
 create mode 100644 localedata/ln_CD.UTF-8.in
 create mode 100644 localedata/mi_NZ.UTF-8.in
 create mode 100644 localedata/ml_IN.UTF-8.in
 create mode 100644 localedata/mn_MN.UTF-8.in
 create mode 100644 localedata/mr_IN.UTF-8.in
 create mode 100644 localedata/mt_MT.UTF-8.in
 create mode 100644 localedata/nan_TW.UTF-8@latin.in
 create mode 100644 localedata/nb_NO.UTF-8.in
 create mode 100644 localedata/om_KE.UTF-8.in
 create mode 100644 localedata/os_RU.UTF-8.in
 create mode 100644 localedata/ps_AF.UTF-8.in
 create mode 100644 localedata/ro_RO.UTF-8.in
 create mode 100644 localedata/ru_RU.UTF-8.in
 create mode 100644 localedata/sc_IT.UTF-8.in
 create mode 100644 localedata/se_NO.UTF-8.in
 create mode 100644 localedata/sq_AL.UTF-8.in
 create mode 100644 localedata/sv_SE.UTF-8.in
 create mode 100644 localedata/szl_PL.UTF-8.in
 create mode 100644 localedata/tg_TJ.UTF-8.in
 create mode 100644 localedata/tk_TM.UTF-8.in
 create mode 100644 localedata/tt_RU.UTF-8.in
 create mode 100644 localedata/tt_RU.UTF-8@iqtelif.in
 create mode 100644 localedata/ug_CN.UTF-8.in
 create mode 100644 localedata/uz_UZ.UTF-8.in
 create mode 100644 localedata/vi_VN.UTF-8.in
 create mode 100644 localedata/yi_US.UTF-8.in
 create mode 100644 localedata/yo_NG.UTF-8.in
 create mode 100644 localedata/zh_CN.UTF-8.in
Comment 11 Mike FABIAN 2018-02-28 14:11:12 UTC
Fixed.
Comment 12 cvs-commit@gcc.gnu.org 2018-03-02 12:59:01 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, mfabian/collation-update-2.27 has been created
        at  9589174d076327deb7ed816d16b89b0e7470abd6 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9589174d076327deb7ed816d16b89b0e7470abd6

commit 9589174d076327deb7ed816d16b89b0e7470abd6
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Thu Dec 21 18:56:52 2017 +0100

    Remove the lines from cmn_TW.UTF-8.in which cannot work at the moment.
    
    See this bug https://sourceware.org/bugzilla/show_bug.cgi?id=22898
    
    These lines don’t yet work because of a glibc bug, not because of
    problems in the locale data. No matter what sorting rules one uses,
    these characters cannot be sorted at all at the moment.
    
    As soon as that bug is fixed, these lines should be added back to the
    test file.
    
    	* localedata/cmn_TW.UTF-8.in: Remove the lines which cannot
            be sorted correctly at the moment because of a bug.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e289a7d4c7f2abf09e4a4877b8cadcded7440e55

commit e289a7d4c7f2abf09e4a4877b8cadcded7440e55
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Mon Dec 11 18:26:22 2017 +0100

    Adapt collation in several locales to the new iso14651_t1_common file
    
    [BZ #22550] - es_ES locale (and other es_* locales): collation should
    treat ñ as a primary different character, sync the collation
    for Spanish with CLDR
    [BZ #21547] - Tibetan script collation broken (Dzongkha and Tibetan)
    
    	* localedata/Makefile: Add new test files.
    	* localedata/lv_LV.UTF-8.in: Adapt test file to new collation order.
    	* localedata/sv_SE.ISO-8859-1.in: Adapt test file to new collation order.
    	* localedata/uk_UA.UTF-8.in: Adapt test file to new collation order.
    	* localedata/am_ET.UTF-8.in: New test file.
    	* localedata/az_AZ.UTF-8.in: Likewise.
    	* localedata/be_BY.UTF-8.in: Likewise.
    	* localedata/ber_DZ.UTF-8.in: Likewise.
    	* localedata/ber_MA.UTF-8.in: Likewise.
    	* localedata/bg_BG.UTF-8.in: Likewise.
    	* localedata/br_FR.UTF-8.in: Likewise.
    	* localedata/cmn_TW.UTF-8.in: Likewise.
    	* localedata/crh_UA.UTF-8.in: Likewise.
    	* localedata/csb_PL.UTF-8.in: Likewise.
    	* localedata/cv_RU.UTF-8.in: Likewise.
    	* localedata/cy_GB.UTF-8.in: Likewise.
    	* localedata/dz_BT.UTF-8.in: Likewise.
    	* localedata/eo.UTF-8.in: Likewise.
    	* localedata/es_ES.UTF-8.in: Likewise.
    	* localedata/fa_IR.UTF-8.in: Likewise.
    	* localedata/fi_FI.UTF-8.in: Likewise.
    	* localedata/fil_PH.UTF-8.in: Likewise.
    	* localedata/fur_IT.UTF-8.in: Likewise.
    	* localedata/gez_ER.UTF-8@abegede.in: Likewise.
    	* localedata/ha_NG.UTF-8.in: Likewise.
    	* localedata/ig_NG.UTF-8.in: Likewise.
    	* localedata/ik_CA.UTF-8.in: Likewise.
    	* localedata/kk_KZ.UTF-8.in: Likewise.
    	* localedata/ku_TR.UTF-8.in: Likewise.
    	* localedata/ky_KG.UTF-8.in: Likewise.
    	* localedata/ln_CD.UTF-8.in: Likewise.
    	* localedata/mi_NZ.UTF-8.in: Likewise.
    	* localedata/ml_IN.UTF-8.in: Likewise.
    	* localedata/mn_MN.UTF-8.in: Likewise.
    	* localedata/mr_IN.UTF-8.in: Likewise.
    	* localedata/mt_MT.UTF-8.in: Likewise.
    	* localedata/nb_NO.UTF-8.in: Likewise.
    	* localedata/om_KE.UTF-8.in: Likewise.
    	* localedata/os_RU.UTF-8.in: Likewise.
    	* localedata/ps_AF.UTF-8.in: Likewise.
    	* localedata/ro_RO.UTF-8.in: Likewise.
    	* localedata/ru_RU.UTF-8.in: Likewise.
    	* localedata/sc_IT.UTF-8.in: Likewise.
    	* localedata/se_NO.UTF-8.in: Likewise.
    	* localedata/sq_AL.UTF-8.in: Likewise.
    	* localedata/sv_SE.UTF-8.in: Likewise.
    	* localedata/szl_PL.UTF-8.in: Likewise.
    	* localedata/tg_TJ.UTF-8.in: Likewise.
    	* localedata/tk_TM.UTF-8.in: Likewise.
    	* localedata/tt_RU.UTF-8.in: Likewise.
    	* localedata/tt_RU.UTF-8@iqtelif.in: Likewise.
    	* localedata/ug_CN.UTF-8.in: Likewise.
    	* localedata/uz_UZ.UTF-8.in: Likewise.
    	* localedata/vi_VN.UTF-8.in: Likewise.
    	* localedata/yi_US.UTF-8.in: Likewise.
    	* localedata/yo_NG.UTF-8.in: Likewise.
    	* localedata/zh_CN.UTF-8.in: Likewise.
    	* localedata/locales/am_ET: Adapt collation rules to new iso14651_t1_common
            file and fix bugs in the collation.
    	* localedata/locales/az_AZ: Likewise.
    	* localedata/locales/be_BY: Likewise.
    	* localedata/locales/ber_DZ: Likewise.
    	* localedata/locales/ber_MA: Likewise.
    	* localedata/locales/bg_BG: Likewise.
    	* localedata/locales/br_FR: Likewise.
    	* localedata/locales/br_FR@euro: Likewise.
    	* localedata/locales/ca_ES: Likewise.
    	* localedata/locales/cns11643_stroke: Likewise.
    	* localedata/locales/crh_UA: Likewise.
    	* localedata/locales/cs_CZ: Likewise.
    	* localedata/locales/csb_PL: Likewise.
    	* localedata/locales/cv_RU: Likewise.
    	* localedata/locales/cy_GB: Likewise.
    	* localedata/locales/da_DK: Likewise.
    	* localedata/locales/dz_BT: Likewise.
    	* localedata/locales/en_CA: Likewise.
    	* localedata/locales/eo: Likewise.
    	* localedata/locales/es_CU: Likewise.
    	* localedata/locales/es_EC: Likewise.
    	* localedata/locales/es_ES: Likewise.
    	* localedata/locales/es_US: Likewise.
    	* localedata/locales/et_EE: Likewise.
    	* localedata/locales/fa_IR: Likewise.
    	* localedata/locales/fi_FI: Likewise.
    	* localedata/locales/fil_PH: Likewise.
    	* localedata/locales/fur_IT: Likewise.
    	* localedata/locales/gez_ER@abegede: Likewise.
    	* localedata/locales/ha_NG: Likewise.
    	* localedata/locales/hr_HR: Likewise.
    	* localedata/locales/hsb_DE: Likewise.
    	* localedata/locales/hu_HU: Likewise.
    	* localedata/locales/ig_NG: Likewise.
    	* localedata/locales/ik_CA: Likewise.
    	* localedata/locales/is_IS: Likewise.
    	* localedata/locales/iso14651_t1_pinyin: Likewise.
    	* localedata/locales/kk_KZ: Likewise.
    	* localedata/locales/ku_TR: Likewise.
    	* localedata/locales/ky_KG: Likewise.
    	* localedata/locales/ln_CD: Likewise.
    	* localedata/locales/lt_LT: Likewise.
    	* localedata/locales/lv_LV: Likewise.
    	* localedata/locales/mi_NZ: Likewise.
    	* localedata/locales/ml_IN: Likewise.
    	* localedata/locales/mn_MN: Likewise.
    	* localedata/locales/mr_IN: Likewise.
    	* localedata/locales/mt_MT: Likewise.
    	* localedata/locales/nb_NO: Likewise.
    	* localedata/locales/om_KE: Likewise.
    	* localedata/locales/os_RU: Likewise.
    	* localedata/locales/pl_PL: Likewise.
    	* localedata/locales/ps_AF: Likewise.
    	* localedata/locales/ro_RO: Likewise.
    	* localedata/locales/ru_RU: Likewise.
    	* localedata/locales/ru_UA: Likewise.
    	* localedata/locales/sc_IT: Likewise.
    	* localedata/locales/se_NO: Likewise.
    	* localedata/locales/si_LK: Likewise.
    	* localedata/locales/sq_AL: Likewise.
    	* localedata/locales/sv_FI: Likewise.
    	* localedata/locales/sv_FI@euro: Likewise.
    	* localedata/locales/sv_SE: Likewise.
    	* localedata/locales/szl_PL: Likewise.
    	* localedata/locales/tg_TJ: Likewise.
    	* localedata/locales/ti_ER: Likewise.
    	* localedata/locales/tk_TM: Likewise.
    	* localedata/locales/tl_PH: Likewise.
    	* localedata/locales/tr_TR: Likewise.
    	* localedata/locales/tt_RU: Likewise.
    	* localedata/locales/tt_RU@iqtelif: Likewise.
    	* localedata/locales/ug_CN: Likewise.
    	* localedata/locales/uk_UA: Likewise.
    	* localedata/locales/uz_UZ: Likewise.
    	* localedata/locales/uz_UZ@cyrillic: Likewise.
    	* localedata/locales/vi_VN: Likewise.
    	* localedata/locales/yi_US: Likewise.
    	* localedata/locales/yo_NG: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=242596394db9dad6147bb2b7bcb53d8a7610e1d0

commit 242596394db9dad6147bb2b7bcb53d8a7610e1d0
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Mon Jan 1 15:33:50 2018 +0100

    Improve gen-locales.mk and gen-locale.sh to make test files with @ options work
    
    Without this, adding collation test files like localedata/gez_ER.UTF-8@abegede.in
    does not work for locales which contain @ modifiers.
    
    	* gen-locales.mk: Make test files which contain @ modifiers in their
            name work.
    	* localedata/gen-locale.sh: Likewise.
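
To illustrate the naming issue, here is a minimal sketch of how a test file name with an @ modifier relates to its locale source name (for example localedata/gez_ER.UTF-8@abegede.in and localedata/locales/gez_ER@abegede, both listed in the change summary). The function name is hypothetical, and this is not the logic used in gen-locales.mk or gen-locale.sh; it only shows that the charset sits between the territory and the modifier.

```python
def split_test_file_name(file_name):
    """Split 'll_CC.CHARSET[@modifier].in' into (locale source name, charset).

    Hypothetical helper for illustration; not part of glibc.
    """
    if not file_name.endswith(".in"):
        raise ValueError("expected a name ending in .in: " + file_name)
    stem = file_name[:-len(".in")]
    # The modifier, if any, follows the charset: gez_ER.UTF-8@abegede
    base, at, modifier = stem.partition("@")
    # Split on the first dot only, so charsets like ISO-8859-1 stay intact.
    language_territory, dot, charset = base.partition(".")
    if not dot:
        raise ValueError("expected a charset in: " + file_name)
    return language_territory + at + modifier, charset


print(split_test_file_name("gez_ER.UTF-8@abegede.in"))
print(split_test_file_name("sv_SE.ISO-8859-1.in"))
```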

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=cc5351f2c0502826f8b4143f3646d44e334ff7b8

commit cc5351f2c0502826f8b4143f3646d44e334ff7b8
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 23 17:29:36 2018 +0100

    Fix test cases tst-fnmatch and tst-regexloc for the new iso14651_t1_common file.
    
    See:
    
    http://pubs.opengroup.org/onlinepubs/7908799/xbd/re.html
    
    > A range expression represents the set of collating elements that fall
    > between two elements in the current collation sequence,
    > inclusively. It is expressed as the starting point and the ending
    > point separated by a hyphen (-).
    >
    > Range expressions must not be used in portable applications because
    > their behaviour is dependent on the collating sequence. Ranges will be
    > treated according to the current collating sequence, and include such
    > characters that fall within the range based on that collating
    > sequence, regardless of character values. This, however, means that
    > the interpretation will differ depending on collating sequence. If,
    > for instance, one collating sequence defines ä as a variant of a,
    > while another defines it as a letter following z, then the expression
    > [ä-z] is valid in the first language and invalid in the second.
    
    Therefore, using [a-z] does not make much sense except in the C/POSIX locale.
    The new iso14651_t1_common lists upper case and lower case Latin characters
    in a different order than the old one which causes surprising results
    for example in the de_DE locale: [a-z] now includes A because A comes
    after a in iso14651_t1_common but does not include Z because that comes
    after z in iso14651_t1_common.
    
    	* posix/tst-fnmatch.input: Fix results for range expressions
            for non C locales.
    	* posix/tst-regexloc.c: Do not use a range expression for
            de_DE.ISO-8859-1 locale.
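
The dependence of [a-z] on the collation sequence can be sketched as follows. Both sequences are toy assumptions: INTERLEAVED only mirrors the description above (A directly after a, Z after z) and is not the real iso14651_t1_common data, and in_range is a hypothetical helper, not how glibc evaluates ranges.

```python
import string

# Hypothetical collation sequences (assumptions for illustration only).
# INTERLEAVED: each upper case letter sorts right after its lower case
# counterpart (a A b B ... z Z), as described in the commit message.
INTERLEAVED = [c for pair in zip(string.ascii_lowercase, string.ascii_uppercase)
               for c in pair]
# CODEPOINT: plain code point order, as in the C/POSIX locale (A-Z, then a-z).
CODEPOINT = sorted(string.ascii_letters)


def in_range(ch, start, end, sequence):
    """Return True if ch lies between start and end, inclusively, in sequence."""
    position = {c: i for i, c in enumerate(sequence)}
    return position[start] <= position[ch] <= position[end]


for name, seq in (("interleaved", INTERLEAVED), ("codepoint", CODEPOINT)):
    print(name,
          "A in [a-z]:", in_range("A", "a", "z", seq),
          "Z in [a-z]:", in_range("Z", "a", "z", seq))
```

In the interleaved toy sequence, A falls inside [a-z] and Z falls outside it; in code point order, neither upper case letter is in the range.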

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ffa8106c727607fb365f2b93649fe3ea182dffe4

commit ffa8106c727607fb365f2b93649fe3ea182dffe4
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Fri Dec 15 07:19:45 2017 +0100

    Fix posix/bug-regex5.c test case, adapt to iso14651_t1_common update
    
    This test case tests how many collating elements are defined in
    da_DK.ISO-8859-1 locale. The da_DK locale source defines 4:
    
    collating-element <A-A> from "<U0041><U0041>"
    collating-element <A-a> from "<U0041><U0061>"
    collating-element <a-A> from "<U0061><U0041>"
    collating-element <a-a> from "<U0061><U0061>"
    
    The new iso14651_t1_common file defines more collating elements, two
    of them are in the ISO-8859-1 range:
    
    collating-element <U004C_00B7> from "<U004C><U00B7>" % decomposition of LATIN CAPITAL LETTER L WITH MIDDLE DOT
    collating-element <U006C_00B7> from "<U006C><U00B7>" % decomposition of LATIN SMALL LETTER L WITH MIDDLE DOT
    
    So the total count is now 6 instead of 4.
    
    	* posix/bug-regex5.c: Fix test case because with the new
            iso14651_t1_common file, the da_DK locale now has 6 collating elements
            in the ISO-8859-1 range instead of 4 with the old iso14651_t1_common
            file.
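
The count of 6 can be reproduced with a small sketch that parses the collating-element lines quoted above and keeps those whose code points all fit in ISO-8859-1 (at most U+00FF). The function name and the parsing approach are assumptions for illustration; posix/bug-regex5.c itself is a C test and may determine the count differently.

```python
import re

# The six definitions quoted in the commit message above.
LOCALE_LINES = '''\
collating-element <A-A> from "<U0041><U0041>"
collating-element <A-a> from "<U0041><U0061>"
collating-element <a-A> from "<U0061><U0041>"
collating-element <a-a> from "<U0061><U0061>"
collating-element <U004C_00B7> from "<U004C><U00B7>" % decomposition of LATIN CAPITAL LETTER L WITH MIDDLE DOT
collating-element <U006C_00B7> from "<U006C><U00B7>" % decomposition of LATIN SMALL LETTER L WITH MIDDLE DOT
'''

ELEMENT = re.compile(r'^collating-element\s+<[^>]+>\s+from\s+"((?:<U[0-9A-F]+>)+)"')
CODE_POINT = re.compile(r'<U([0-9A-F]+)>')


def count_latin1_elements(text):
    """Count collating elements built only from code points <= U+00FF."""
    count = 0
    for line in text.splitlines():
        match = ELEMENT.match(line)
        if match is None:
            continue
        code_points = [int(h, 16) for h in CODE_POINT.findall(match.group(1))]
        if all(cp <= 0xFF for cp in code_points):
            count += 1
    return count


print(count_latin1_elements(LOCALE_LINES))
```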

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=61e613fb97aa619ae4fabac3f106d5fffe15eacb

commit 61e613fb97aa619ae4fabac3f106d5fffe15eacb
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Wed Dec 13 14:39:54 2017 +0100

    Collation order of @-. and space has changed in new iso14651_t1_common file, adapt test files
    
    	* localedata/da_DK.ISO-8859-1.in: In the new iso14651_t1_common file
            downloaded from ISO, the collation order of @-. and space has changed.
            Therefore, this test file needed to be adapted.
    	* localedata/fr_CA.UTF-8.in: Likewise.
    	* localedata/fr_FR.UTF-8.in: Likewise.
    	* localedata/uk_UA.UTF-8.in: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=059454de60bdb1be9979ee09596c1e9a7e9e6c8b

commit 059454de60bdb1be9979ee09596c1e9a7e9e6c8b
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Dec 12 14:39:34 2017 +0100

    Collation order of ȥ has changed in new iso14651_t1_common file, adapt test files
    
    	* localedata/cs_CZ.UTF-8.in: adapt this test file to the collation
            order of ȥ in the new iso14651_t1_common file.
    	* localedata/pl_PL.UTF-8.in: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1f4df3bb2ac69f2e1947c2953379a7f19b5f0c35

commit 1f4df3bb2ac69f2e1947c2953379a7f19b5f0c35
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 15:45:05 2018 +0100

    Add sections for various scripts to the iso14651_t1_common file
    
    	* localedata/locales/iso14651_t1_common: Add sections for various
    	scripts to the iso14651_t1_common file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a93fecdcece3e2178834f4b4868b2309b0158753

commit a93fecdcece3e2178834f4b4868b2309b0158753
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Wed Jan 31 06:18:47 2018 +0100

    iso14651_t1_common: make the fourth level the codepoint for characters which are ignorable on all 4 levels
    
    Entries for characters which have “IGNORE” on all 4 levels like:
    
     <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)
    
    are changed into:
    
     <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429)
    
    i.e. putting the code point of the character into the fourth level
    instead of “IGNORE”. Without that change, all such characters
    would compare equal which would make a wcscoll test case fail.
    It is better to have a clearly defined sort order even for characters
    like this so it is good to use the code point as a tie-break.
    
    	* localedata/locales/iso14651_t1_common: Use the code point of a
            character in the fourth collation level instead of IGNORE for all
            entries which have IGNORE on all 4 levels.
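
The rewrite described above can be sketched as a per-line substitution. The function name is hypothetical and this is not necessarily how the file was actually edited; it only shows the transformation on the example entry from the commit message.

```python
import re

# Matches an entry whose four levels are all IGNORE, e.g.
#  <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)
ALL_IGNORE = re.compile(
    r'^(?P<indent>\s*)<U(?P<cp>[0-9A-F]+)>(?P<gap>\s+)IGNORE;IGNORE;IGNORE;IGNORE\b')


def use_code_point_as_fourth_level(line):
    """Replace the fourth IGNORE with the entry's own code point symbol."""
    return ALL_IGNORE.sub(
        r'\g<indent><U\g<cp>>\g<gap>IGNORE;IGNORE;IGNORE;<U\g<cp>>', line)


before = " <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)"
print(use_code_point_as_fourth_level(before))
```

Lines that already carry a fourth-level weight do not match the pattern, so applying the function twice leaves the result unchanged.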

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3e7089bf28ed1fd77e644bb3ce7405aff7847e61

commit 3e7089bf28ed1fd77e644bb3ce7405aff7847e61
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Mon Dec 11 20:00:24 2017 +0100

    Add convenience symbols like <AFTER-A>, <BEFORE-A> to iso14651_t1_common
    
    	* localedata/locales/iso14651_t1_common: Add some convenient collation
    	symbols like <AFTER-A>, <BEFORE-A> to make tailoring easier using
    	rules similar to those in CLDR.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=50a54ba443575e69ffb03aa67d53ccf8b66a4fbd

commit 50a54ba443575e69ffb03aa67d53ccf8b66a4fbd
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 18:24:47 2018 +0100

    Fixing syntax errors after updating the iso14651_t1_common file
    
    	* localedata/locales/iso14651_t1_common: The new version of this
    	file downloaded from ISO contained several syntax errors which
    	are fixed by this patch.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=661ab21c7521ba8e6e8bc7dad897b6cf162e0cd0

commit 661ab21c7521ba8e6e8bc7dad897b6cf162e0cd0
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 18:07:39 2018 +0100

    iso14651_t1_common: <U\([0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]\)> → <U000\1>
    
    	* localedata/locales/iso14651_t1_common: replace all <U.....>
    	with <U000.....> because glibc understands only 4 digit or 8 digit
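
The substitution in the commit title is written in sed-style syntax. An equivalent sketch in Python (the function name is hypothetical) pads every 5-digit code point symbol with three leading zeros and leaves 4-digit and 8-digit symbols untouched:

```python
import re

# Exactly five hex digits between "<U" and ">".
FIVE_DIGIT = re.compile(r'<U([0-9A-F]{5})>')


def pad_five_digit_code_points(text):
    """Rewrite <UXXXXX> as <U000XXXXX>; other symbol lengths are not matched."""
    return FIVE_DIGIT.sub(r'<U000\1>', text)


print(pad_five_digit_code_points("<U1D400> <U0041> <U0001D401>"))
```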

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=06061c30d615b2862ac360f11384092c92022ea7

commit 06061c30d615b2862ac360f11384092c92022ea7
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 18:04:31 2018 +0100

    Necessary changes after updating the iso14651_t1_common file
    
    	* localedata/locales/iso14651_t1_common: Necessary changes
    	to make the file downloaded from ISO usable by glibc.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=bc1d41044c0cf9f0214acdbfd79b6cd11fd1e8c1

commit bc1d41044c0cf9f0214acdbfd79b6cd11fd1e8c1
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jan 30 17:59:00 2018 +0100

    Update iso14651_t1_common file to ISO14651_2016_TABLE1_en.txt [BZ #14095]
    
    [BZ #14095] - Review / update collation data from Unicode / ISO 14651
    
    File downloaded from:
    http://standards.iso.org/iso-iec/14651/ed-4/ISO14651_2016_TABLE1_en.txt
    
    Updating this file alone is not enough, there are problems in the new
    file which need to be fixed and the collation rules for many locales
    need to be adapted. This is done by the following patches.
    
    This update also fixes the problem that many characters are treated as
    identical when sorting because they were not yet in the old
    iso14651_t1_common file, see:
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1336308
    - Infinite (∞) and empty set (∅) are treated as if they were the same character by sort and uniq
    
    	[BZ #14095]
    	* localedata/locales/iso14651_t1_common: Update file to
    	latest version from ISO (ISO14651_2016_TABLE1_en.txt).

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=16e349c550942d274d3193ccedaa88855e3ac690

commit 16e349c550942d274d3193ccedaa88855e3ac690
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Fri Mar 2 11:29:24 2018 +0100

    Remove --quiet argument when installing locales
    
    Using this argument hides problems. I would like to see when something fails.
    
            * localedata/Makefile: Remove --quiet argument when
            installing locales

-----------------------------------------------------------------------