Back in the DOS days, several character sets were in more or less common use in Lithuania: 770, 771, 772, 773, 774, and 775. Iconv currently supports only the last of these, 775. It would be nice to get support for the others on the list too.

Links:
http://www.likit.lt/nostyle/770.htm
http://www.likit.lt/nostyle/771.htm
http://www.likit.lt/nostyle/772.htm
<no illustration for cp773>
http://www.likit.lt/nostyle/774.htm

If adding these character sets to iconv is generally acceptable, I think I could try to generate mappings from all these charsets to UTF-8.
Then provide mapping tables.
Created attachment 4762 [details] Mapping of CP770
Created attachment 4763 [details] Mapping of CP771
Created attachment 4764 [details] Mapping of CP772
Created attachment 4765 [details] Mapping of CP773
Created attachment 4766 [details] Mapping of CP774
I've attached five files with mapping tables, one for each codepage. Their format is:

[octal code]: [UTF-8 character]

The lower 128 positions (0000-0177) match ASCII in all cases, so only the positions from 0200 upward matter.

It seems that these charsets are (or maybe were) supported by ICU (see [1]). The page also has some further descriptions that could be used when forming alias names for the cp77x charsets:

CP770 Lithuanian Standard RST 1095-89
CP771 KBL (Lithuanian and Russian characters)
CP772 Lithuanian Standard LST 1284:1993
CP773 Lithuanian (Mix of 771 and 775)
CP774 Lithuanian Standard 1283:1993

Unfortunately, I couldn't find the source files of the ICU mappings for these character sets at [2], so I can't attach them. Instead, I used a small program found at [3], developed a few years ago specifically as a converter among the different character sets used in Lithuania (note: I changed one symbol in CP770.txt to match the actual standard).

If the ICU mappings can be found, I think they should most likely be used as the basis for the conversion. Otherwise, the attached files should be fine.

[1] http://publib.boulder.ibm.com/infocenter/tivihelp/v24r1/index.jsp?topic=/com.ibm.itcama.doc_6.2.3/itcam_oraclerac63200.htm
[2] http://source.icu-project.org/repos/icu/data/trunk/charset/data/
[3] https://www3.mruni.lt/~rims/kodav/#Diegimas
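For anyone who wants to read the attached tables programmatically, here is a minimal sketch of a parser for the "[octal code]: [UTF-8 character]" format described above. It assumes one entry per line with a single space after the colon; the function names `parse_mapping` and `decode` and the two sample entries are hypothetical and are not taken from the attachments or from iconv.

```python
def parse_mapping(text):
    """Parse lines of the form '<octal code>: <character>' into {byte: char}."""
    table = {}
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip blank lines
        code, sep, rest = line.partition(":")
        if not sep:
            raise ValueError(f"no ':' separator in line: {line!r}")
        byte = int(code.strip(), 8)  # codes are written in octal, e.g. 0200 == 128
        # Drop only the single space after the colon, so that a mapped
        # space or no-break space character is not stripped away.
        char = rest[1:] if rest.startswith(" ") else rest
        table[byte] = char
    return table


def decode(data, table):
    """Decode bytes: positions below 0200 are ASCII, the rest use the table."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))
        else:
            # Unmapped positions become U+FFFD (replacement character).
            out.append(table.get(b, "\ufffd"))
    return "".join(out)


# Illustrative entries only; these are NOT the real CP77x assignments.
sample = "0200: Ą\n0201: Č\n"
table = parse_mapping(sample)
print(decode(b"A\x80\x81", table))
```

Only the upper half of each table needs to be stored, since the lower 128 positions are plain ASCII in all five codepages.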
Created attachment 4767 [details] Mapping of CP773 (corrected)

According to the name mentioned on the ICU page, and to cp773.acm (the CP773 mapping for the Linux console) found at [1], this is the more correct mapping of codepage 773.

[1] http://gedmin.as/lit-con/
The files aren't usable in that form. It was quite a lot of work to make all the transformations. Support is in git now.
Wow, thanks! I've actually got hold of the paper standards, at least some of them, so I should be able to check the validity of our mappings when time permits.