[PATCH v3] Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]
Carlos O'Donell
carlos@redhat.com
Thu Jun 25 18:57:23 GMT 2020
On 6/25/20 9:07 AM, Mike FABIAN wrote:
> From 32ee1eb4037ce5c777322a5043a48b8627610d8d Mon Sep 17 00:00:00 2001
> From: Mike FABIAN <mfabian@redhat.com>
> Date: Tue, 16 Jun 2020 08:29:40 +0200
> Subject: [PATCH] Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to
> UD7FB to 0 [BZ #26120]
OK for master.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> ---
> localedata/charmaps/UTF-8 | 2 ++
> localedata/locales/i18n_ctype | 2 +-
> localedata/locales/tr_TR | 2 +-
> localedata/locales/translit_circle | 2 +-
> localedata/locales/translit_cjk_compat | 2 +-
> localedata/locales/translit_combining | 2 +-
> localedata/locales/translit_compat | 2 +-
> localedata/locales/translit_font | 2 +-
> localedata/locales/translit_fraction | 2 +-
> localedata/unicode-gen/utf8_gen.py | 9 ++++++++-
> 10 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/localedata/charmaps/UTF-8 b/localedata/charmaps/UTF-8
> index 14c5d4fa33..8cce47cd97 100644
> --- a/localedata/charmaps/UTF-8
> +++ b/localedata/charmaps/UTF-8
> @@ -48920,6 +48920,8 @@ WIDTH
> <UABE8> 0
> <UABED> 0
> <UAC00>...<UD7A3> 2
> +<UD7B0>...<UD7C6> 0
> +<UD7CB>...<UD7FB> 0
OK. Expected output by generator.
> <UF900>...<UFA6D> 2
> <UFA70>...<UFAD9> 2
> <UFB1E> 0
> diff --git a/localedata/locales/i18n_ctype b/localedata/locales/i18n_ctype
> index 6f078a101d..c63e0790fc 100644
> --- a/localedata/locales/i18n_ctype
> +++ b/localedata/locales/i18n_ctype
> @@ -26,7 +26,7 @@ fax ""
> language ""
> territory "Earth"
> revision "13.0.0"
> -date "2020-04-14"
> +date "2020-06-25"
OK. No change. Expected.
> category "i18n:2012";LC_CTYPE
> END LC_IDENTIFICATION
>
> diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
> index d5785ceca1..7dbb923228 100644
> --- a/localedata/locales/tr_TR
> +++ b/localedata/locales/tr_TR
> @@ -43,7 +43,7 @@ fax ""
> language "Turkish"
> territory "Turkey"
> revision "1.0"
> -date "2020-04-14"
> +date "2020-06-25"
OK. No change. Expected.
>
> category "i18n:2012";LC_IDENTIFICATION
> category "i18n:2012";LC_CTYPE
> diff --git a/localedata/locales/translit_circle b/localedata/locales/translit_circle
> index 0f1e81541c..5c07b44532 100644
> --- a/localedata/locales/translit_circle
> +++ b/localedata/locales/translit_circle
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of encircled characters.
> -% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2020-04-14 for Unicode 13.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2020-06-25 for Unicode 13.0.0.
OK. No change. Expected.
>
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_cjk_compat b/localedata/locales/translit_cjk_compat
> index 17b74134fc..ee0d7f83c6 100644
> --- a/localedata/locales/translit_cjk_compat
> +++ b/localedata/locales/translit_cjk_compat
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of CJK compatibility characters.
> -% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2020-04-14 for Unicode 13.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2020-06-25 for Unicode 13.0.0.
OK. No change. Expected.
>
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_combining b/localedata/locales/translit_combining
> index d5c8bbfe8f..36128f097a 100644
> --- a/localedata/locales/translit_combining
> +++ b/localedata/locales/translit_combining
> @@ -10,7 +10,7 @@ comment_char %
>
> % Transliterations that remove all combining characters (accents,
> % pronounciation marks, etc.).
> -% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2020-04-14 for Unicode 13.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2020-06-25 for Unicode 13.0.0.
OK. No change. Expected.
>
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_compat b/localedata/locales/translit_compat
> index ff18b02ea3..ac24c4e938 100644
> --- a/localedata/locales/translit_compat
> +++ b/localedata/locales/translit_compat
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of compatibility characters and ligatures.
> -% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2020-04-14 for Unicode 13.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2020-06-25 for Unicode 13.0.0.
OK. No change. Expected.
>
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_font b/localedata/locales/translit_font
> index e79b0d83f5..680c4ed426 100644
> --- a/localedata/locales/translit_font
> +++ b/localedata/locales/translit_font
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of font equivalents.
> -% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2020-04-14 for Unicode 13.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2020-06-25 for Unicode 13.0.0.
OK. No change. Expected.
>
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_fraction b/localedata/locales/translit_fraction
> index 197d57a644..b52244969e 100644
> --- a/localedata/locales/translit_fraction
> +++ b/localedata/locales/translit_fraction
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of fractions.
> -% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2020-04-14 for Unicode 13.0.0.> +% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2020-06-25 for Unicode 13.0.0.
OK. No change. Expected.
> % The replacements have been surrounded with spaces, because fractions are
> % often preceded by a decimal number and followed by a unit or a math symbol.
>
> diff --git a/localedata/unicode-gen/utf8_gen.py b/localedata/unicode-gen/utf8_gen.py
> index 17b99ee88d..11c906b92f 100755
> --- a/localedata/unicode-gen/utf8_gen.py
> +++ b/localedata/unicode-gen/utf8_gen.py
> @@ -258,7 +258,13 @@ def process_width(outfile, ulines, elines, plines):
> if key in width_dict:
> del width_dict[key] # default width is 1
> for key in list(range(0x1160, 0x1200)):
> - width_dict[key] = 0
> + # Hangul jungseong and jongseong:
> + if key in unicode_utils.UNICODE_ATTRIBUTES:
> + width_dict[key] = 0
> + for key in list(range(0xD7B0, 0xD800)):
> + # Hangul jungseong and jongseong:
> + if key in unicode_utils.UNICODE_ATTRIBUTES:
> + width_dict[key] = 0
OK. Expected per bug.
> for key in list(range(0x3248, 0x3250)):
> # These are “A” which means we can decide whether to treat them
> # as “W” or “N” based on context:
> @@ -327,6 +333,7 @@ if __name__ == "__main__":
> help='The Unicode version of the input files used.')
> ARGS = PARSER.parse_args()
>
> + unicode_utils.fill_attributes(ARGS.unicode_data_file)
OK.
> with open(ARGS.unicode_data_file, mode='r') as UNIDATA_FILE:
> UNICODE_DATA_LINES = UNIDATA_FILE.readlines()
> with open(ARGS.east_asian_with_file, mode='r') as EAST_ASIAN_WIDTH_FILE:
> -- 2.26.2
--
Cheers,
Carlos.
More information about the Libc-alpha
mailing list