This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Improved check-localedef script


On 8/4/17, Mike FABIAN <mfabian@redhat.com> wrote:
> Zack Weinberg <zackw@panix.com> wrote:
>> localedata/locales/ur_PK... (charset: cp1256)
>>   localedata/locales/ur_PK:114: string not representable in cp1256:
>>       062C 0646 0648 0631 06CC
>>   localedata/locales/ur_PK:115: string not representable in cp1256:
>>       0641 0631 0648 0631 06CC
>>   localedata/locales/ur_PK:117: string not representable in cp1256:
>>       0627 067E 0631 06CC 0644
>>
>> These are the abmon strings, so I think it really would be a problem...
>
> This is the first abmon string:
>
>     abmon	"جنوری";/
>
> The last letter in this string, ی U+06CC ARABIC LETTER FARSI YEH
> is not convertible to CP1256.
>
> But this letter seems to be really used in writing Urdu, see:
>
>     https://en.wikipedia.org/wiki/Urdu_alphabet
>     https://en.wikipedia.org/wiki/Urdu_alphabet#Ye
>
> So I think CP1256 is not a suitable charset to use for Urdu.


Note that there is a transliteration rule for that letter:

translit_start
include "translit_combining";""

% those two letters are not in cp1256...

% Maddah above -> Alef with madda above
<U0653> "<U0622>"
% Farsi yeh -> yeh
<U06CC> "<U064A>"

translit_end
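
To illustrate what that rule does for the abmon string quoted above, here is a rough Python sketch. It assumes Python's "cp1256" codec matches glibc's CP1256 charmap for these code points, and the names TRANSLIT and unrepresentable are made up for the example. It only mimics the substitution; it says nothing about when localedef or iconv actually apply the rule.

```python
# Hypothetical table: only the two mappings come from the translit
# section of ur_PK quoted above.
TRANSLIT = {
    0x0653: "\u0622",  # Maddah above -> Alef with madda above
    0x06CC: "\u064A",  # Farsi yeh -> yeh
}


def unrepresentable(text, charset="cp1256"):
    """Return the code points in text that charset cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            bad.append(f"U+{ord(ch):04X}")
    return bad


# First abmon string from ur_PK: 062C 0646 0648 0631 06CC
abmon_jan = "\u062C\u0646\u0648\u0631\u06CC"

# Code points that fail before the substitution.
print(unrepresentable(abmon_jan))

# Apply the two substitutions, then check again.
fallback = abmon_jan.translate(TRANSLIT)
print(unrepresentable(fallback))
```

With Python's codec the first call should report only U+06CC and the second should report nothing, but the result is Arabic yeh rather than Farsi yeh, so the text is no longer spelled the way Urdu writes it.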


>
>     https://en.wikipedia.org/wiki/Windows-1256
>
> says:
>
> Wikipedia> Windows-1256 is a code page used to write Arabic (and possibly some
>
> Note the “possibly”.
>
> Wikipedia> other languages that use Arabic script, like Persian and Urdu) under
> Wikipedia> Microsoft Windows.
> Wikipedia> [...]
> Wikipedia> Unicode and UTF-8 are preferred to Windows 1256 in modern
> Wikipedia> applications. 0.1% of all web pages use Windows-1256 in June 2016.
>
> So CP1256 doesn’t seem to be used much anymore.
>

Still, Xorg's locale.alias maps ur_PK to ur_PK.CP1256:
https://cgit.freedesktop.org/xorg/lib/libX11/tree/nls/locale.alias.pre#n1121
That line, however, dates back to 2004:
https://cgit.freedesktop.org/xorg/lib/libX11/commit/nls/locale.alias.pre?id=c6349f43193b74a3c09945f3093a871b0157ba47

