http://www.unicode.org/reports/tr10/tr10-30.html states:

<quote>
Normally, all differences in sorting are assessed from the start to the end of
the string. If all of the base letters are the same, the first accent
difference determines the final order. In row 1 of Table 5, the first accent
difference is on the o, so that is what determines the order. In some French
dictionary ordering traditions, however, it is the last accent difference that
determines the order, as shown in row 2.
</quote>

Table 5 says:

<pre>
Normal Accent Ordering      cote < coté < côte < côté
Backward Accent Ordering    cote < côte < coté < côté
</pre>

However, glibc implements backward accent ordering for all locales except de_DE
and lb_LU.

Unicode CLDR 26 confirms this is wrong: the only file in
http://unicode.org/cldr/trac/browser/tags/release-26/common/collation/ that has
settings backwards="on" is fr_CA.xml.
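For readers who want to reproduce the two rows of Table 5, here is a minimal Python sketch. It is not glibc's or CLDR's implementation: `accent_key` is a hypothetical helper, and it assumes that the code points of the combining marks can stand in for real secondary collation weights. Base letters are compared first; accents are compared left to right (forward) or right to left (backward).

```python
import unicodedata


def accent_key(word, backward=False):
    """Two-level sort key: base letters first, then accents.

    Hypothetical helper. Accent weights are the combining marks' code
    points, a stand-in for real collation weights.
    """
    bases, accents = [], []
    for ch in unicodedata.normalize("NFC", word):
        decomposed = unicodedata.normalize("NFD", ch)
        bases.append(decomposed[0])
        accents.append(tuple(ord(mark) for mark in decomposed[1:]))
    if backward:
        # Backward ordering: the last accent difference decides.
        accents.reverse()
    return (bases, accents)


words = ["côté", "coté", "côte", "cote"]
print(" < ".join(sorted(words, key=accent_key)))
print(" < ".join(sorted(words, key=lambda w: accent_key(w, backward=True))))
```

With `backward=False` the accent on the o decides first; with `backward=True` the accent on the final e does.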
Mine. I posted a patch at https://sourceware.org/ml/libc-alpha/2014-12/msg00524.html
On Tue, Dec 23, 2014 at 04:25:27AM +0000, aoliva at sourceware dot org wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
>             Bug ID: 17750
>            Summary: wrong collation order of diacritics in most locales
>            Product: glibc
>            Version: unspecified
>             Status: NEW
>           Severity: normal
>           Priority: P2
>          Component: localedata
>           Assignee: unassigned at sourceware dot org
>           Reporter: aoliva at sourceware dot org
>                 CC: libc-locales at sourceware dot org
>
> [...]
>
> However, glibc implements backward accent ordering for all locales except de_DE
> and lb_LU.
>
> Unicode CLDR 26 confirms this is wrong: the only file in
> http://unicode.org/cldr/trac/browser/tags/release-26/common/collation/ that has
> settings backwards="on" is fr_CA.xml.

This was probably done because if there is more than one accented letter in a
string, the word or name is probably French, and then the French rules should
be followed. This would mean that CLDR is wrong.

Best regards
Keld
Even if your assumption that more than one diacritic in a word implies the word is French were correct, there are various other points that make your suggestion flawed.

First of all, backward accent ordering doesn't even apply to all French speakers.

Second, there are words with more than one diacritic in other languages. I happen to be a native speaker of one such language.

Third, you don't need more than one diacritic in a word to trigger the problem. Consider Cortes, Córtes, and Cortés; pelo, pêlo, pelô; Schlagerforderung, Schlagerförderung, Schlägerforderung, Schlägerförderung.

Fourth, Unicode and CLDR are the result of a lot of work by a lot of people who study lots of languages and local customs. It would take a lot more than groundless speculation to conclude they're wrong. (Which is not to say they're perfect in all regards, of course ;-)
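The third point can be checked with a small Python sketch. It is only an illustration under stated assumptions: `split_accents`, `compare` and `sort_words` are hypothetical names, case is ignored, and combining marks are compared as plain strings rather than with real collation weights. The comparator scans for the deciding accent difference from the start (forward) or from the end (backward), so words with a single diacritic already sort differently.

```python
import unicodedata
from functools import cmp_to_key


def split_accents(word):
    """Return (base letters, combining marks per letter) for a word."""
    bases, marks = [], []
    for ch in unicodedata.normalize("NFC", word):
        decomposed = unicodedata.normalize("NFD", ch)
        bases.append(decomposed[0].lower())  # assumption: ignore case
        marks.append(decomposed[1:])
    return bases, marks


def compare(a, b, backward=False):
    """Compare base letters first, then the first or last accent difference."""
    bases_a, marks_a = split_accents(a)
    bases_b, marks_b = split_accents(b)
    if bases_a != bases_b:
        return -1 if bases_a < bases_b else 1
    # Equal base letters imply equal length, so one index range fits both.
    positions = range(len(marks_a))
    if backward:
        positions = reversed(positions)
    for i in positions:
        if marks_a[i] != marks_b[i]:
            return -1 if marks_a[i] < marks_b[i] else 1
    return 0


def sort_words(words, backward=False):
    return sorted(words, key=cmp_to_key(lambda a, b: compare(a, b, backward)))


for group in (["Cortés", "Córtes", "Cortes"], ["pelô", "pêlo", "pelo"]):
    print(sort_words(group), sort_words(group, backward=True))
```

In both groups the unaccented word comes first either way, while the two accented words swap places between the forward and backward results.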
(In reply to Alexandre Oliva from comment #3)
> Fourth, Unicode and CLDR are the result of a lot of work by a lot of people
> who study lots of languages and local customs. It would take a lot more
> than groundless speculation to conclude they're wrong. (Which is not to say
> they're perfect in all regards, of course ;-)

I agree with Alex. We would need a very detailed analysis of why CLDR is wrong to ignore their implementation and do something different.
On Tue, Dec 23, 2014 at 11:00:50PM +0000, aoliva at sourceware dot org wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> --- Comment #3 from Alexandre Oliva <aoliva at sourceware dot org> ---
> Even if your assumption that more than one diacritic in a word implied the word
> was in French, there are various other points that make your suggestion flawed.
>
> First of all, the forward or backward accent ordering doesn't even apply to all
> French speakers.
>
> Second, there are words with more than one diacritic in other languages. I
> happen to be a native speaker of one such language.
>
> Third, you don't need more than one diacritic in a word to trigger the problem.
> Consider Cortes, Córtes, and Cortés; pelo, pêlo, pelô; Schlagerforderung,
> Schlagerförderung, Schlägerforderung, Schlägerförderung.
>
> Fourth, Unicode and CLDR are the result of a lot of work by a lot of people who
> study lots of languages and local customs. It would take a lot more than
> groundless speculation to conclude they're wrong. (Which is not to say they're
> perfect in all regards, of course ;-)

1. Which French speakers do not use the backward accent ordering? I do have access to some of the sorting experts from the French community.

2. I see that for some languages, e.g. German, it makes sense to use forward ordering on accents. Which languages would that apply to?

3. Yes, I see that there may be just one accent in some strings, and then the ordering depends on the position. I was involved in the current recommendation to use backward ordering in the default tables. I was not the only one, and the recommendation came out of the sorting experts in ISO and, I believe, also in CEN.

4. Well, CLDR does not have more resources than we have. And they are known not to listen to other expertise than their own.

Best regards
Keld
Fixing this will change the sort order of existing data, which is quite risky. Is it really worth it?
(In reply to Florian Weimer from comment #6)
> Fixing this will change the sort order of existing data, which is quite
> risky. Is it really worth it?

For the long-term support of locales it must change. Unless we get more maintainers, my plan is to continue to push that we match CLDR and Unicode, and thus exactly what libicu does, and reduce the "surprise" for developers going from Java to C/C++ or vice versa.
(In reply to Carlos O'Donell from comment #7)
> For the long term support of locales it must change. Unless we get more
> maintainers my plan is to continue to push that we match CLDR, UNICODE and
> thus exactly what libicu does and reduce the "surprise" for developers going
> from java to C/C++ or vice-versa.

It would be possible to rename the locale each time the ordering changes (and change the environment settings), which might satisfy both needs (fixed locales for interactive use, predictable ordering for data at rest).
On Thu, Jan 29, 2015 at 02:35:11PM +0000, carlos at redhat dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> --- Comment #7 from Carlos O'Donell <carlos at redhat dot com> ---
> (In reply to Florian Weimer from comment #6)
> > Fixing this will change the sort order of existing data, which is quite
> > risky. Is it really worth it?
>
> For the long term support of locales it must change. Unless we get more
> maintainers my plan is to continue to push that we match CLDR, UNICODE and thus
> exactly what libicu does and reduce the "surprise" for developers going from
> java to C/C++ or vice-versa.

The fix is wrong, IMHO.

Best regards
Keld
(In reply to keld@keldix.com from comment #9)
> The fix is wrong, IMHO.

Thanks for stating that. In this case we'll need to discuss why it's wrong and try to come to a consensus, including talking to CLDR about it. Thus this issue is going to be more work, but not impossible.
This change broke (among others) the Hungarian locales (see 18934).

I totally agree with Alexandre's opinion (the assumptions made by the patch being wrong on so many levels), and I would add a fifth point: even if there are some French words present in a list, if you're using a certain language then the alphabetical rules of that language should apply, not the French ones. This is what locale definitions are about. Define in the French locales the way to sort words on a French UI, but please leave the other locales alone.

I'm disappointed that a change that was doomed to break so many locales managed to make it into glibc. But I think that in the end it boils down to the lack of proper unit test coverage. In the above-mentioned bug I created an extensive unit test for Hungarian, one that points to the official rules of alphabetical sorting and takes its examples from them (plus many more), and that would have failed with this change. I encourage maintainers of locale files to come up with similarly extensive unit tests.
Sorry, make it a link: bug 18934.
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  ea1898dded26316e2e73adfb409224e864ffaa8b (commit)
      from  78c05814320cdc3377347f8e5fdbaa7cf5abf5b5 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ea1898dded26316e2e73adfb409224e864ffaa8b

commit ea1898dded26316e2e73adfb409224e864ffaa8b
Author: Egmont Koblinger <egmont@gmail.com>
Date:   Wed Mar 22 21:27:30 2017 -0400

    localedata: hu_HU: fix multiple sorting bugs (bug 18934)

    Fix the incorrect sorting order of a digraph and its geminated variant,
    regression introduced by a faulty fix to bug 13547 in commit
    b008d4c85619a753e441d7f473ba8af0db400bd6.

    Fix two inconsistencies in sorting unusual capitalization of digraphs
    (bug #18587).

    Enable DIACRIT_FORWARD to work around bug #17750.

    Sort foreign accents after the Hungarian ones.

    Add extensive unittests containing all the examples from The Rules of
    Hungarian Orthography and many more, including explanatory comments.

-----------------------------------------------------------------------

Summary of changes:
 NEWS                     |   4 +
 localedata/ChangeLog     |   7 +
 localedata/Makefile      |   4 +-
 localedata/hu_HU.in      | 560 ++++++++++++++++++++++++++++++++++++++++++++++
 localedata/locales/hu_HU | 286 ++++++++++++------------
 5 files changed, 716 insertions(+), 145 deletions(-)
 create mode 100644 localedata/hu_HU.in
The branch, master has been updated
       via  8da25eec0aaf4d86a06088fff8d175989835e071 (commit)
      from  a55430cb0e261834ce7a4e118dd9e0f2b7fb14bc (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8da25eec0aaf4d86a06088fff8d175989835e071

commit 8da25eec0aaf4d86a06088fff8d175989835e071
Author: Alexandre Oliva <aoliva@redhat.com>
Date:   Tue Nov 28 16:23:02 2017 +0100

    Collation fix: make forward accent sorting the default [BZ #17750]

            [BZ #17750]
            * Makefile: add fr_CA.UTF-8 to test-input and LOCALES.
            * localedata/fr_CA.UTF-8.in: New file with test data for
            backward accents sorting.
            * localedata/fr_FR.UTF-8.in: Fix test data for forward
            accents sorting.
            * localedata/locales/cs_CZ (LC_COLLATE): Remove
            “define DIACRIT_FORWARD”
            * localedata/locales/de_DE (LC_COLLATE): Likewise.
            * localedata/locales/hu_HU (LC_COLLATE): Likewise.
            * localedata/locales/lb_LU (LC_COLLATE): Likewise.
            * localedata/locales/yuw_PG (LC_COLLATE): Likewise.
            * localedata/locales/fr_CA (LC_COLLATE): Add
            “define DIACRIT_BACKWARD”
            * localedata/locales/iso14651_t1_common: Use
            “ifdef DIACRIT_FORWARD” instead of “ifdef DIACRIT_BACKWARD”.

    The only locale which currently needs backward accents sorting is fr_CA.
    Therefore, forward accents sorting should be the default.

    Before this patch, backwards accent sorting was the default and all
    locales except fr_CA had to use

        define DIACRIT_FORWARD

    before

        copy "iso14651_t1"

    Most locales didn’t do that and thus got the inappropriate backwards
    accents sorting by accident. Now only the fr_CA locale needs to use

        define DIACRIT_BACKWARD

    before

        copy "iso14651_t1"

    Original patch slightly modified by: Mike FABIAN <mfabian@redhat.com>

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                                     | 17 +++++++++++++++++
 localedata/Makefile                           |  4 ++--
 localedata/{fr_FR.UTF-8.in => fr_CA.UTF-8.in} | 18 +++++++++---------
 localedata/fr_FR.UTF-8.in                     | 22 +++++++++++-----------
 localedata/locales/cs_CZ                      |  2 --
 localedata/locales/de_DE                      |  2 --
 localedata/locales/fr_CA                      |  2 ++
 localedata/locales/hu_HU                      |  1 -
 localedata/locales/iso14651_t1_common         |  6 +++---
 localedata/locales/lb_LU                      |  2 --
 localedata/locales/yuw_PG                     |  1 -
 11 files changed, 44 insertions(+), 33 deletions(-)
 copy localedata/{fr_FR.UTF-8.in => fr_CA.UTF-8.in} (100%)
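The effect of flipping the default can be modelled in a few lines of Python. This is only a toy model of the commit message above, not of the locale sources themselves: the two sets list just the locales named in the ChangeLog, `accent_direction` is a hypothetical function, and it assumes every other locale copies "iso14651_t1" without defining either macro.

```python
# Locales that defined DIACRIT_FORWARD before the patch (per the ChangeLog).
FORWARD_BEFORE_PATCH = {"cs_CZ", "de_DE", "hu_HU", "lb_LU", "yuw_PG"}
# Locales that define DIACRIT_BACKWARD after the patch (per the ChangeLog).
BACKWARD_AFTER_PATCH = {"fr_CA"}


def accent_direction(locale_name, patched):
    """Return the accent ordering a locale ends up with in this toy model."""
    if patched:
        # New default: forward, unless the locale opts in to backward.
        return "backward" if locale_name in BACKWARD_AFTER_PATCH else "forward"
    # Old default: backward, unless the locale opted in to forward.
    return "forward" if locale_name in FORWARD_BEFORE_PATCH else "backward"


for name in ("fr_CA", "fr_FR", "de_DE", "es_ES"):
    print(name, accent_direction(name, patched=False), "->",
          accent_direction(name, patched=True))
```

In this model fr_CA and de_DE keep their behaviour, while locales such as fr_FR or es_ES, which defined neither macro, move from backward to forward.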
Fixed in glibc master.
Well, all French-language locales should be diacrit backward: fr_FR, fr_BE, fr_CH and others.

Also, other languages where French words and names are the biggest source of multiple accented characters should have diacrit backward. This goes for Danish (my own language), Swedish, Norwegian, Finnish and Dutch.

Best regards
keld

On Wed, Nov 29, 2017 at 10:57:48AM +0000, cvs-commit at gcc dot gnu.org wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> The only locale which currently needs backward accents sorting is fr_CA.
> Therefore, forward accents sorting should be the default.
>
> [...]
Probably also for all English-language locales, and for African-language locales where French influence is big and the use of accented characters in the African language in question, e.g. Swahili, is very limited.

Best regards
keld

On Wed, Nov 29, 2017 at 03:10:28PM +0200, Keld Simonsen wrote:
> Well all french language locales should be diacrit backward.
> fr_FR, fr_BE, fr_CH and others.
>
> Also other languages, where french words and names are the biggest source
> of multiple accented characters should have diacrit backward.
> This goes for Danish (my own language), Swedish, Norwegian, Finnish, Dutch.
>
> [...]
(In reply to keld@keldix.com from comment #16)
> Also other languages, where french words and names are the biggest source
> of multiple accented characters should have diacrit backward.
> This goes for Danish (my own language), Swedish, Norwegian, Finnish, Dutch.

I can't speak any of these languages, but looking at some random Finnish text I see tons of ä and ö letters, and a significant number of words containing 2 or more of them. Hence I seriously doubt the correctness of your claim.

Even looking only at the foreign words within these languages, I'd _guess_ that they take words from each other or maybe German more often than from French. But even if we assume French is the most common source of foreign words, that's still not a strong enough reason to go for backwards diacrit ordering. For backwards diacrit ordering to even be a possibility to consider, I believe French accented words should outweigh all other local and foreign accented words combined.

IMO let's keep this unreasonable idea of backwards diacrit ordering to those languages only that explicitly have it; let's not force this stupid concept on more locales than necessary.

By the way, don't these languages have some "official" collation rules, or at least some established common practice?
(In reply to Egmont Koblinger from comment #18)
> By the way, don't these languages have some "official" collation rules, or at
> least some established common practice?

I expect that many languages/scripts have multiple collation rules, depending on use, particularly when it comes to sorting foreign languages using the same base script.
(In reply to Florian Weimer from comment #19)
> I expect that many languages/scripts have multiple collation rules,
> depending on use, particularly when it comes to sorting foreign languages
> using the same base script.

Let's not forget that most languages with Latin scripts do use accents regularly. I don't think glibc allows different diacrit ordering for "own" accents and "foreign" accents, e.g. in case of Finnish to use forward diacrit ordering for ä and ö, and backward diacrit ordering for é and û (and what if they're mixed?).

So the question is not how to sort _foreign_ words within the language, the question is how to sort _own_ words of the language. This defines the diacrit sorting. Foreign words will follow.

If a list to be sorted is composed solely of foreign words from a particular language, e.g. solely French words in an otherwise Finnish environment, it might be reasonable to sort using the rules of that language, e.g. French in this case. This can be achieved by setting LC_COLLATE=fr_FR.UTF-8.

In my opinion, the only valid question is what to do with English in territories where French is by far the second most popular language: is it reasonable to go with backward diacrits ordering there?
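Switching only the collation category, as suggested above, can be sketched in Python with the standard `locale` module. `sort_with_collation` is a hypothetical helper; it assumes the named locale is installed on the system, and it returns None otherwise. Which accent order you actually get depends on the installed locale data: per the commit earlier in this thread, a glibc that contains the fix sorts fr_FR with forward accents and only fr_CA with backward accents.

```python
import locale


def sort_with_collation(words, locale_name):
    """Sort words under LC_COLLATE=locale_name, then restore the old setting.

    Returns None if the locale is not available on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current value
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


words = ["côté", "coté", "côte", "cote"]
for name in ("fr_FR.UTF-8", "fr_CA.UTF-8"):
    print(name, sort_with_collation(words, name))
```

Only LC_COLLATE is changed here, so messages, number formats and the other categories stay in the user's own locale.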
On Wed, Nov 29, 2017 at 07:27:32PM +0000, egmont at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> --- Comment #18 from Egmont Koblinger <egmont at gmail dot com> ---
> (In reply to keld@keldix.com from comment #16)
>
> > Also other languages, where french words and names are the biggest source
> > of multiple accented characters should have diacrit backward.
> > This goes for Danish (my own language), Swedish, Norwegian, Finnish, Dutch.
>
> I can't speak any of these languages, but looking at some random Finnish text I
> see tons of ä and ö letters, a significant amount of words containing 2 or more
> of them. Hence I seriously doubt the correctness of your claim.

Well, in Finnish and other Nordic languages like Danish, Swedish and Norwegian, ö and ä etc. are not considered accented letters but genuine separate letters, which is why there are few strings with more than one accented letter.

> Even if looking only at the foreign words within these languages, I'd _guess_
> that they take words from each other or maybe German more often than from
> French. But even if let's assume French is the most common source of foreign
> words, that's still not a strong enough reason to go for backwards diacrit
> ordering. In order for backwards diacrit ordering to even be a possibility to
> consider, I believe French accented words should outweigh all other local and
> foreign accented words combined.

German umlaut letters are much the same in Finnish (and Swedish), and ä and ö are then the same as the genuine Finnish/Swedish letters.

Yes, I also think that the total number of French words with 2 or more accented letters (according to the rules of the specific language) should outweigh the total number of other occurrences, but I believe that this is the case in the examples that I have given.

> By the way, don't these language have some "official" collation rules, or at
> least some established common practice?

There are specs from the official standards bodies specifying the backwards diacrit rules, yes.

Best regards
keld
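The distinction between separate letters and accented letters can be illustrated with a simplified Python sketch. It is not the Swedish locale definition: `SWEDISH_ALPHABET` and `swedish_key` are hypothetical names, the alphabet is assumed to be a to z followed by å, ä, ö, and any other letter with a diacritic (such as é) is treated as its base letter plus a secondary accent weight.

```python
import unicodedata

# Assumption: å, ä and ö are full letters that follow z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {letter: index for index, letter in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Primary key from alphabet position, secondary key from foreign accents."""
    primary, secondary = [], []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch in RANK:
            # A letter of the alphabet, including å, ä, ö: no accent weight.
            primary.append(RANK[ch])
            secondary.append(())
        else:
            # Anything else: base letter plus its combining marks.
            decomposed = unicodedata.normalize("NFD", ch)
            primary.append(RANK.get(decomposed[0], len(RANK)))
            secondary.append(tuple(ord(mark) for mark in decomposed[1:]))
    return (primary, secondary)


words = ["öl", "zebra", "apa", "ära", "idé", "ide"]
print(sorted(words, key=swedish_key))
```

Here "ära" and "öl" sort after "zebra" because ä and ö have their own primary positions, while "idé" stays next to "ide" because é only differs at the accent level.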
On Wed, Nov 29, 2017 at 07:49:36PM +0000, fweimer at redhat dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> --- Comment #19 from Florian Weimer <fweimer at redhat dot com> ---
> (In reply to Egmont Koblinger from comment #18)
>
> > By the way, don't these language have some "official" collation rules, or at
> > least some established common practice?
>
> I expect that many languages/scripts have multiple collation rules, depending
> on use, particularly when it comes to sorting foreign languages using the same
> base script.

That is not my experience. For Danish (my language) there is only one standard, and that takes care of many foreign characters. Then there is a spec from Danish Standard that is more elaborate, in the form of a POSIX/Linux locale, covering all of ISO 10646/Unicode, that builds on ISO 14651, with the backwards diacrit spec.

For German, I know there are 2 sorting specs, one where ä, ö and ü etc. are considered accented versions of a, o and u, and one where they are interpreted as ae, oe and ue.

There are sorting standards for all of these languages, that are well adhered to in the market place.

best regards
keld
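The two German conventions mentioned above can be contrasted with a short Python sketch. It is an illustration only, not either official specification: `key_accented` and `key_expanded` are hypothetical names, only ä, ö and ü are handled, and the tie-break on the lowercased word is an assumption made to keep the result deterministic.

```python
import unicodedata

UMLAUT_EXPANSION = {"ä": "ae", "ö": "oe", "ü": "ue"}


def strip_accents(text):
    """Drop combining marks so that ä compares like a at the primary level."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def key_accented(word):
    """Variant 1: ä, ö, ü are accented versions of a, o, u."""
    lowered = unicodedata.normalize("NFC", word).lower()
    return (strip_accents(lowered), lowered)


def key_expanded(word):
    """Variant 2: ä, ö, ü are interpreted as ae, oe, ue."""
    lowered = unicodedata.normalize("NFC", word).lower()
    expanded = "".join(UMLAUT_EXPANSION.get(c, c) for c in lowered)
    return (strip_accents(expanded), lowered)


names = ["Müller", "Muller", "Mueller"]
print(sorted(names, key=key_accented))
print(sorted(names, key=key_expanded))
```

Under the first key "Müller" sorts next to "Muller"; under the second it sorts next to "Mueller", which is the practical difference between the two specs.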
On Wed, Nov 29, 2017 at 08:14:33PM +0000, egmont at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> --- Comment #20 from Egmont Koblinger <egmont at gmail dot com> ---
> (In reply to Florian Weimer from comment #19)
>
> > I expect that many languages/scripts have multiple collation rules,
> > depending on use, particularly when it comes to sorting foreign languages
> > using the same base script.
>
> Let's not forget that most languages with Latin scripts do use accents
> regularly. I don't think glibc allows different diacrit ordering for "own"
> accents and "foreign" accents, e.g. in case of Finnish to use forward diacrit
> ordering for ä and ö, and backward diacrit ordering for é and û (and what if
> they're mixed?).

I agree that glibc does not distinguish between "own" accented letters and foreign ones. But ä and ö are not accented letters in Finnish; they are genuine separate letters with their own place in the alphabet.

> In my opinion, the only valid question is what to do with English in
> territories where French is by far the second most popular language: is it
> reasonable to go with backward diacrits ordering there?

That is what I am suggesting, at least for Canada. The same reasoning could be applied to Dutch in Belgium, and then also the Netherlands.

Best regards
Keld
(In reply to keld@keldix.com from comment #21)
> Well, in Finnish and other Nordic languages like Danish, Swedish and
> Norwegian, ö and ä etc are not considered accented letters, but genuine
> separated letters, so that is why there are few strings with more than
> one accented letter.

Thanks for the explanation! (This actually should have occurred to me, as the famous Swedish yellow-blue furniture store offers framed pictures and bed linen showing the Swedish alphabet, with ÅÄÖ at the end.)

To clarify: If they sort German words containing ä and ö, they're sorted among the same letters of their own language, right? And what about French accents, are they on the other hand mixed together with their unaccented counterparts?

> German umlaut letters are much the same in Finnish (and Swedish) and ä and ö
> are then the same as the genuine Finnish/Swedish letters.

What about German ü?

(In reply to keld@keldix.com from comment #22)
> [...] Then there is a spec from Danish Standard
> that is more elaborate [...] with the backwards diacrit spec.

I'm shocked to hear that there's not only one language but more languages that use backwards diacritics, something that IMO no sane man with any tiny bit of common sense would ever decide on :-)

(In reply to keld@keldix.com from comment #23)
> That is what I am suggesting, at least for Canada.
> The same reasoning could be done for Dutch in Belgium, and then also the
> Netherlands.

If this is indeed what's correct for these languages / what people living there prefer then it's okay for me. I'm just hoping that the kinda de-facto standard en_US will stay with forward diacrits. I _guess_ Spanish is more frequently used there than French, plus again, I can't imagine how anyone ever could have come up with this braindamaged idea of backward diacrit sorting so I'd personally prefer en_US not to have this craziness :-)

Cheers!
On Thu, Nov 30, 2017 at 09:09:25AM +0000, egmont at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> --- Comment #24 from Egmont Koblinger <egmont at gmail dot com> ---
> (In reply to keld@keldix.com from comment #21)
>
> > Well, in Finnish and other Nordic languages like Danish, Swedish and
> > Norwegian, ö and ä etc are not considered accented letters, but genuine
> > separated letters, so that is why there are few strings with more than
> > one accented letter.
>
> To clarify: If they sort German words containing ä and ö, they're sorted among
> the same letters of their own language, right? And what about French accents,
> are they on the other hand mixed together with their unaccented counterparts?

Yes, German ö and ä are treated exactly as the Swedish letters. And French accented letters like é and è are treated as 'e' but with an accent. é is actually much used in Swedish proper.

> > German umlaut letters are much the same in Finnish (and Swedish) and ä and ö
> > are then the same as the genuine Finnish/Swedish letters.
>
> What about German ü?

ü is treated as a y AFAIK, but with an accent. Danish æ and ø are treated as ä and ö, but as if they have an accent.

> (In reply to keld@keldix.com from comment #22)
>
> > [...] Then there is a spec from Danish Standard
> > that is more elaborate [...] with the backwards diacrit spec.
>
> I'm shocked to hear that there's not only one language but more languages that
> use backwards diacritics, something that IMO no sane man with any tiny bit of
> common sense would ever decide on :-)

Well, it is because the last accented character in French is more important when pronounced. I agree that it is a bit counter-intuitive, but I do favour the actual habits in the real world over what is logical.

> (In reply to keld@keldix.com from comment #23)
>
> > That is what I am suggesting, at least for Canada.
> > The same reasoning could be done for Dutch in Belgium, and then also the
> > Netherlands.
>
> If this is indeed what's correct for these languages / what people living there
> prefer then it's okay for me. I'm just hoping that the kinda de-facto standard
> en_US will stay with forward diacrits. I _guess_ Spanish is more frequently
> used there than French, plus again, I can't imagine how anyone ever could have
> come up with this braindamaged idea of backward diacrit sorting so I'd
> personally prefer en_US not to have this craziness :-)

The kind of de facto i18n locale has forward diacrits. i18n is the standard locale of ISO TR 30112. I think both Spanish and German need forward diacrits, and Spanish being a bigger language than French means that we should use forward diacrit as the default.

Best regards
Keld
> the kind of defacto i18n locale has forward diacrits. i18n is the standard
> locale of ISO TR 30112.
> I think both Spanish and German needs forward diacrits, and Spanish being
> a bigger language than French would give that we should use forward diacrit
> as the default.

Don't forget that the conquistadores brought Spanish orthography to many of the indigenous languages of the Americas as well, small in Internet footprint and speaker count, large in language count.

cjl