Improvements in localedef to help writing LC_COLLATE sections

Denis Barbier
Sat Jan 22 21:44:00 GMT 2005


Over the last few weeks I took a close look at localedef to understand
how locales are compiled.  I then tried to help with several bugs (such
as collation in the Dzongkha locale) and improvements (replacing the
current collation rules with tailorings of iso14651_t1).

I filed several bug reports with patches, and I believe they should be
committed.  It would be great if you could test them, comment on them,
and tell me what remains to be done so that they get committed.

  + Be less strict with keyword ordering
    It is currently quite hard to add a new script, because keywords
    cannot be used just anywhere, e.g. 'script' can only be used at
    the beginning of the LC_COLLATE section.  When toggles are added
    (BZ686), new keywords appear before 'copy', which may also be a
    problem.  In short, this patch removes all unneeded checks to make
    localedef more flexible.
  + Implement toggle switches in LC_COLLATE section
    ISO14652 defines toggles, which are similar to preprocessor
    directives.  These keywords were already defined, but not
    implemented.  This patch implements 'define', 'undef', 'ifdef',
    'else' and 'endif' keywords.  I tested this patch by adding a few
    modifications to iso14651_t1:
      ifdef LATIN_FORWARD
      order_start <LATIN>;forward;forward;forward;forward,position
      else
      order_start <LATIN>;forward;backward;forward;forward,position
      endif
    and an UPPERCASE_FIRST toggle to sort uppercase letters before
    lowercase.  With these two toggles, I can replace almost all
    existing collation rules with tailorings of iso14651_t1, which
    makes LC_COLLATE sections much more compact and easier to
    maintain.  I will file a separate bug report once this tedious
    work is finished.
  + Localedef does not respect rule definitions in LC_COLLATE
    This bug report has examples demonstrating that localedef is buggy:
    ruleset declarations do not always behave as written in locale
    files.
  + Localedef fails with complex LC_COLLATE rules
    I provided a patch to fix this bug, reported by Pablo, so that more
    than 256 collating-element keywords can be used.
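
To illustrate how the toggle keywords described above select lines, here
is a minimal sketch in Python.  It is not code from the patch (localedef
is written in C), and the function name apply_toggles, the error
handling and the support for nested ifdef blocks are all assumptions
made for this illustration only.

```python
def apply_toggles(lines, defined=()):
    """Return the lines kept after define/undef/ifdef/else/endif handling.

    Hypothetical helper for illustration; it only mimics the selection
    logic and knows nothing about real LC_COLLATE parsing.
    """
    symbols = set(defined)
    stack = []  # one boolean per open ifdef: is its current branch taken?
    kept = []
    for line in lines:
        words = line.split()
        keyword = words[0] if words else ""
        active = all(stack)  # a line is live only if every enclosing branch is
        if keyword == "ifdef":
            stack.append(len(words) > 1 and words[1] in symbols)
        elif keyword == "else":
            if not stack:
                raise ValueError("else without ifdef")
            stack[-1] = not stack[-1]
        elif keyword == "endif":
            if not stack:
                raise ValueError("endif without ifdef")
            stack.pop()
        elif keyword == "define" and len(words) > 1:
            if active:
                symbols.add(words[1])
        elif keyword == "undef" and len(words) > 1:
            if active:
                symbols.discard(words[1])
        elif active:
            kept.append(line)
    if stack:
        raise ValueError("missing endif")
    return kept


rules = [
    "ifdef LATIN_FORWARD",
    "order_start <LATIN>;forward;forward;forward;forward,position",
    "else",
    "order_start <LATIN>;forward;backward;forward;forward,position",
    "endif",
]
# With the toggle defined, only the first order_start line is kept;
# without it, only the second one is.
with_toggle = apply_toggles(rules, {"LATIN_FORWARD"})
without_toggle = apply_toggles(rules)
```

A stack of booleans keeps the sketch short while still handling an ifdef
nested inside a branch that is not taken.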

These patches do not modify the structure of compiled locales, which
means that locales generated by a patched localedef can be used by a
pristine glibc.

