Summary: | Support old DOS Lithuanian character sets in iconv | ||
---|---|---|---|
Product: | glibc | Reporter: | Rimas Kudelis <rimas> |
Component: | localedata | Assignee: | GNU C Library Locale Maintainers <libc-locales> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | drepper.fsp, glibc-bugs |
Priority: | P2 | Flags: | fweimer: security- |
Version: | unspecified | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: | ||
Attachments: |
Mapping of CP770
Mapping of CP771
Mapping of CP772
Mapping of CP773
Mapping of CP774
Mapping of CP773 (corrected) |
Description
Rimas Kudelis
2010-04-23 14:24:55 UTC
Then provide mapping tables.
Created attachment 4762 [details]
Mapping of CP770
Created attachment 4763 [details]
Mapping of CP771
Created attachment 4764 [details]
Mapping of CP772
Created attachment 4765 [details]
Mapping of CP773
Created attachment 4766 [details]
Mapping of CP774
I've attached five files with mapping tables, one per codepage. Their format is:

[octal code]: [UTF-8 character]

The lower 128 positions (0000-0177) match ASCII in all cases, so only the positions starting at 0200 matter.

It seems these charsets are (or maybe were) supported by ICU (see [1]). The page also has some further descriptions that could be used when forming alias names for the cp77x charsets:

CP770 Lithuanian Standard RST 1095-89
CP771 KBL (Lithuanian and Russian characters)
CP772 Lithuanian Standard LST 1284:1993
CP773 Lithuanian (Mix of 771 and 775)
CP774 Lithuanian Standard 1283:1993

Unfortunately, I couldn't find the source files of the ICU mappings for these character sets at [2], so I can't attach them. Instead, I used a small program found at [3], developed a few years ago specifically as a converter among the different character sets used in Lithuania (note: I changed one symbol in CP770.txt to match the actual standard). If the ICU mappings can be found, they should most likely be used as the basis for conversion. Otherwise, the attached files should be fine.

[1] http://publib.boulder.ibm.com/infocenter/tivihelp/v24r1/index.jsp?topic=/com.ibm.itcama.doc_6.2.3/itcam_oraclerac63200.htm
[2] http://source.icu-project.org/repos/icu/data/trunk/charset/data/
[3] https://www3.mruni.lt/~rims/kodav/#Diegimas

Created attachment 4767 [details]
Mapping of CP773 (corrected)

According to the name mentioned on the ICU page, and cp773.acm (the CP773 mapping for the Linux console) found at [1], this is the more correct mapping of codepage 773.

[1] http://gedmin.as/lit-con/

The files aren't usable in that form. It was quite a lot of work to make all the transformations. Support is in git now.

Wow, thanks! I've actually got hold of the paper standards, at least some of them, so I should be able to check the validity of our mappings when time permits. |
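
For anyone wanting to work with attachments in the "[octal code]: [UTF-8 character]" format described above, here is a rough Python sketch of how such a file might be read and turned into charmap-style lines. Several things here are assumptions: the exact whitespace in the attached files (one space after the colon is assumed), the helper names `parse_mapping`, `decode` and `charmap_line` (all hypothetical), and the output layout, which only approximates the style of glibc charmap entries. The sample entries used in the test are invented for illustration and are not taken from the attached tables.

```python
import unicodedata


def parse_mapping(text):
    """Parse lines of the form '<octal code>: <character>' into {byte: char}.

    Assumption: one optional ASCII space follows the colon. The character is
    not stripped with str.strip(), because that would also remove a mapped
    no-break space.
    """
    table = {}
    for raw in text.split("\n"):
        line = raw.rstrip("\r")
        if not line:
            continue
        code, sep, rest = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in line {raw!r}")
        char = rest[1:] if rest.startswith(" ") else rest
        if len(char) != 1:
            raise ValueError(f"expected exactly one character in line {raw!r}")
        byte = int(code.strip(), 8)  # codes are written in octal
        if not 0o200 <= byte <= 0o377:
            raise ValueError(f"code out of the upper half in line {raw!r}")
        table[byte] = char
    return table


def decode(data, table):
    """Decode bytes: positions below 0200 are ASCII, the rest use the table."""
    return "".join(chr(b) if b < 0o200 else table[b] for b in data)


def charmap_line(byte, char):
    """Format one entry roughly in the style of a charmap file (layout assumed)."""
    name = unicodedata.name(char, "UNKNOWN")
    return f"<U{ord(char):04X}>     /x{byte:02x}         {name}"
```

Calling `parse_mapping` on the contents of one attachment and then `charmap_line` on each sorted entry would give a starting point for comparing the tables against the paper standards or against ICU data, should that turn up.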