This is the mail archive of the
libc-locales@sourceware.org
mailing list for the GNU libc locales project.
[Bug localedata/14094] Update locale data to Unicode 7.0.0
- From: "maiku.fabian at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Tue, 14 Oct 2014 08:07:13 +0000
- Subject: [Bug localedata/14094] Update locale data to Unicode 7.0.0
- Auto-submitted: auto-generated
- References: <bug-14094-716 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=14094
--- Comment #18 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Pravin S from comment #14)
> Created attachment 7715 [details]
> Patch to update UTF-8 CHARMAP and WIDTH to unicode 7.0
>
> Done with all work with UTF-8 file.
> Added two script:
> 1. utf8-gen.py to generate UTF-8 file
> 2. utf8-compatibility.py : to check backward compatibility of newly
> generated UTF-8 file
> 3. Report of new UTF-8 file backward compatibility is available AT
> https://raw.githubusercontent.com/pravins/glibc-i18n/master/report-utf8
>
> Submitting to glibc-alpha, please help to quick review and push to git.
I checked the scripts Pravin used and the resulting UTF-8 file.
I found only one minor problem:
In some cases, both UnicodeData.txt and EastAsianWidth.txt have information
about width. For example, EastAsianWidth.txt has:
302A..302D;W # Mn [4] IDEOGRAPHIC LEVEL TONE MARK..IDEOGRAPHIC ENTERING TONE MARK
which gives us width 2 for these 4 characters (because of "W") but
UnicodeData.txt has:
302A;IDEOGRAPHIC LEVEL TONE MARK;Mn;218;NSM;;;;;N;;;;;
302B;IDEOGRAPHIC RISING TONE MARK;Mn;228;NSM;;;;;N;;;;;
302C;IDEOGRAPHIC DEPARTING TONE MARK;Mn;232;NSM;;;;;N;;;;;
302D;IDEOGRAPHIC ENTERING TONE MARK;Mn;222;NSM;;;;;N;;;;;
which would give width 0 (because of "NSM").
I changed Pravin's script a bit to prefer the information from
EastAsianWidth.txt in case of conflicts.
Pravin has already merged my change into his git repository.
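
To illustrate the conflict resolution described above, here is a minimal
sketch in Python. It is not the actual code from utf8-gen.py: the function
names (parse_unicode_data, parse_east_asian_width, merge_widths) are
hypothetical, and it assumes only the two rules mentioned in this comment
(bidi class "NSM" in UnicodeData.txt gives width 0, "W" or "F" in
EastAsianWidth.txt gives width 2). The sample data is the excerpt quoted
above.

```python
# Hypothetical sketch, not the real utf8-gen.py implementation.

UNICODE_DATA = """\
302A;IDEOGRAPHIC LEVEL TONE MARK;Mn;218;NSM;;;;;N;;;;;
302B;IDEOGRAPHIC RISING TONE MARK;Mn;228;NSM;;;;;N;;;;;
302C;IDEOGRAPHIC DEPARTING TONE MARK;Mn;232;NSM;;;;;N;;;;;
302D;IDEOGRAPHIC ENTERING TONE MARK;Mn;222;NSM;;;;;N;;;;;
"""

EAST_ASIAN_WIDTH = """\
302A..302D;W # Mn [4] IDEOGRAPHIC LEVEL TONE MARK..IDEOGRAPHIC ENTERING TONE MARK
"""


def parse_unicode_data(text):
    """Return {code point: 0} for entries whose bidi class is NSM."""
    widths = {}
    for line in text.splitlines():
        fields = line.split(';')
        if len(fields) < 5:
            continue
        # Field 0 is the code point, field 4 the bidirectional class.
        if fields[4] == 'NSM':
            widths[int(fields[0], 16)] = 0
    return widths


def parse_east_asian_width(text):
    """Return {code point: 2} for entries marked W (wide) or F (fullwidth)."""
    widths = {}
    for line in text.splitlines():
        # Drop the trailing comment, then split "RANGE;PROPERTY".
        line = line.split('#', 1)[0].strip()
        if not line:
            continue
        code_range, prop = [field.strip() for field in line.split(';')]
        if prop not in ('W', 'F'):
            continue
        first, _, last = code_range.partition('..')
        start = int(first, 16)
        end = int(last, 16) if last else start
        for code_point in range(start, end + 1):
            widths[code_point] = 2
    return widths


def merge_widths(unicode_data_widths, east_asian_widths):
    """Merge both tables; EastAsianWidth.txt wins when they disagree."""
    merged = dict(unicode_data_widths)
    merged.update(east_asian_widths)
    return merged


merged = merge_widths(parse_unicode_data(UNICODE_DATA),
                      parse_east_asian_width(EAST_ASIAN_WIDTH))
for code_point in sorted(merged):
    print('U+%04X width %d' % (code_point, merged[code_point]))
```

Applying the EastAsianWidth.txt entries last is what makes them take
precedence: for U+302A..U+302D the width 0 derived from "NSM" is overwritten
by the width 2 derived from "W".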