This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] Unicode 10.0.0 Support: Update character encoding, character type info, and transliteration tables
- From: Carlos O'Donell <carlos at redhat dot com>
- To: Mike FABIAN <mfabian at redhat dot com>, libc-alpha at sourceware dot org
- Date: Wed, 21 Jun 2017 11:22:19 -0400
- References: <s9d60fq25c4.fsf@redhat.com>
On 06/20/2017 08:47 AM, Mike FABIAN wrote:
> Bug 21533: Update to Unicode 10.0.0
>
> * Unicode 10.0.0 Support: Character encoding, character type info, and
> transliteration tables are all updated to Unicode 10.0.0, using
> generator scripts contributed by Mike FABIAN (Red Hat).
Please correct the following and commit:
(a) glibc 2.26 is under development, so please move your NEWS entry under
Version 2.26 and merge it with the Unicode 9.0.0 entry.
See:
https://sourceware.org/glibc/wiki/Release/
https://sourceware.org/glibc/wiki/Glibc%20Timeline
(b) Set the date in localedata/locales/i18n to 2017-06-20 to reflect the
Unicode 10.0.0 release date.
Thank you very much for working on this update!
> -- Mike FABIAN <mfabian@redhat.com>
>
>
> Bug-21533-Update-to-Unicode-10.0.0.patch
>
>
> From ee2358792267631de954443e8ae89aabf975e5ad Mon Sep 17 00:00:00 2001
> From: Mike FABIAN <mfabian@redhat.com>
> Date: Wed, 31 May 2017 11:10:25 +0200
> Subject: [PATCH] Bug 21533: Update to Unicode 10.0.0
>
> * Unicode 10.0.0 Support: Character encoding, character type info, and
> transliteration tables are all updated to Unicode 10.0.0, using
> generator scripts contributed by Mike FABIAN (Red Hat).
> ---
> ChangeLog | 6 +
> NEWS | 16 +
> include/stdc-predef.h | 13 +-
> localedata/charmaps/UTF-8 | 1242 +++++++++++++++++++++-
> localedata/locales/i18n | 1012 +++++++++---------
> localedata/locales/tr_TR | 1012 +++++++++---------
> localedata/locales/translit_circle | 2 +-
> localedata/locales/translit_cjk_compat | 2 +-
> localedata/locales/translit_combining | 142 ++-
> localedata/locales/translit_compat | 2 +-
> localedata/locales/translit_font | 2 +-
> localedata/locales/translit_fraction | 2 +-
> localedata/unicode-gen/DerivedCoreProperties.txt | 328 ++++--
> localedata/unicode-gen/EastAsianWidth.txt | 90 +-
> localedata/unicode-gen/Makefile | 2 +-
> localedata/unicode-gen/UnicodeData.txt | 1028 +++++++++++++++++-
> 16 files changed, 3805 insertions(+), 1096 deletions(-)
>
> diff --git a/ChangeLog b/ChangeLog
> index 2dce96bc1e..8d8f1e2811 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,9 @@
> +2017-06-20 Mike FABIAN <mfabian@redhat.com>
> +
> + [BZ #21533]
> + * include/stdc-predef.h (__STDC_ISO_10646__): Update to
> + 201706L for Unicode 10.0.
OK.
> +
> 2017-06-19 Joseph Myers <joseph@codesourcery.com>
>
> * sysdeps/mips/atomic-machine.h (R10K_BEQZ_INSN): Remove.
> diff --git a/NEWS b/NEWS
> index f81d02f1cb..15c03f10b5 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -5,6 +5,22 @@ See the end for copying conditions.
> Please send GNU C library bug reports via <http://sourceware.org/bugzilla/>
> using `glibc' in the "product" field.
>
> +Version 2.27
> +
> +* Unicode 10.0.0 Support: Character encoding, character type info, and
> + transliteration tables are all updated to Unicode 10.0.0, using
> + generator scripts contributed by Mike FABIAN (Red Hat).
> +
> +Security related changes:
> +
> + [Add security related changes here]
> +
> +The following bugs are resolved with this release:
> +
> + [The release manager will add the list generated by
> + scripts/list-fixed-bugs.py just before the release.]
> +
> +
> Version 2.26
>
> * Unicode 9.0.0 Support: Character encoding, character type info, and
> diff --git a/include/stdc-predef.h b/include/stdc-predef.h
> index a2e148cd2f..74ade90779 100644
> --- a/include/stdc-predef.h
> +++ b/include/stdc-predef.h
> @@ -49,12 +49,13 @@
> # define __STDC_IEC_559_COMPLEX__ 1
> #endif
>
> -/* wchar_t uses Unicode 9.0.0. Version 9.0 of the Unicode Standard is
> - synchronized with ISO/IEC 10646:2014, fourth edition, plus
> - Amd. 1 and Amd. 2 and 273 characters from forthcoming 10646, fifth edition.
> - (Amd. 2 was published 2016-05-01,
> - see https://www.iso.org/obp/ui/#iso:std:iso-iec:10646:ed-4:v1:amd:2:v1:en) */
> -#define __STDC_ISO_10646__ 201605L
> +/* wchar_t uses Unicode 10.0.0. Version 10.0 of the Unicode Standard is
> + synchronized with ISO/IEC 10646:2017, fifth edition, plus
> + the following additions from Amendment 1 to the fifth edition:
> + - 56 emoji characters
> + - 285 hentaigana
> + - 3 additional Zanabazar Square characters */
> +#define __STDC_ISO_10646__ 201706L
OK.
>
> /* We do not support C11 <threads.h>. */
> #define __STDC_NO_THREADS__ 1
> diff --git a/localedata/charmaps/UTF-8 b/localedata/charmaps/UTF-8
> index 8463d58c19..42f7d4eaec 100644
> --- a/localedata/charmaps/UTF-8
> +++ b/localedata/charmaps/UTF-8
> @@ -2081,6 +2081,17 @@ CHARMAP
...
OK.
> diff --git a/localedata/locales/i18n b/localedata/locales/i18n
> index c063838fd9..d580152790 100644
> --- a/localedata/locales/i18n
> +++ b/localedata/locales/i18n
> @@ -19,7 +19,7 @@ fax ""
> language ""
> territory ""
> revision ""
> -date "2016-06-21"
> +date "2017-06-01"
Shouldn't this be 2017-06-20 to reflect the Unicode 10 release date?
> category "i18n:2012";LC_IDENTIFICATION
> category "i18n:2012";LC_CTYPE
> @@ -36,7 +36,7 @@ END LC_IDENTIFICATION
>
> LC_CTYPE
> % The following is the 14652 i18n fdcc-set LC_CTYPE category.
> -% It covers Unicode version 9.0.0.
> +% It covers Unicode version 10.0.0.
OK. Yay! :-)
> % The character classes and mapping tables were automatically
> % generated using the gen_unicode_ctype.py program.
>
> diff --git a/localedata/locales/translit_circle b/localedata/locales/translit_circle
> index e3f61546fd..bcf1e8eeef 100644
> --- a/localedata/locales/translit_circle
> +++ b/localedata/locales/translit_circle
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of encircled characters.
> -% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2016-06-29 for Unicode 9.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2017-06-01 for Unicode 10.0.0.
OK.
>
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_cjk_compat b/localedata/locales/translit_cjk_compat
> index 286f409775..cf8307decb 100644
> --- a/localedata/locales/translit_cjk_compat
> +++ b/localedata/locales/translit_cjk_compat
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of CJK compatibility characters.
> -% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2016-06-29 for Unicode 9.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2017-06-01 for Unicode 10.0.0.
OK.
>
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_combining b/localedata/locales/translit_combining
> index 6c5719752a..bf801abcdd 100644
> --- a/localedata/locales/translit_combining
> +++ b/localedata/locales/translit_combining
> @@ -10,7 +10,7 @@ comment_char %
>
> % Transliterations that remove all combining characters (accents,
> % pronounciation marks, etc.).
> -% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2016-06-29 for Unicode 9.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2017-06-01 for Unicode 10.0.0.
OK.
>
> LC_CTYPE
>
> @@ -670,6 +670,14 @@ translit_start
> <U1DF4> ""
> % COMBINING UP TACK ABOVE
> <U1DF5> ""
> +% COMBINING KAVYKA ABOVE RIGHT
> +<U1DF6> ""
> +% COMBINING KAVYKA ABOVE LEFT
> +<U1DF7> ""
> +% COMBINING DOT ABOVE LEFT
> +<U1DF8> ""
> +% COMBINING WIDE INVERTED BRIDGE BELOW
> +<U1DF9> ""
OK.
> % COMBINING DELETION MARK
> <U1DFB> ""
> % COMBINING DOUBLE INVERTED BREVE BELOW
> @@ -828,6 +836,104 @@ translit_start
> <U00011445> ""
> % NEWA SIGN NUKTA
> <U00011446> ""
> +% ZANABAZAR SQUARE VOWEL SIGN I
> +<U00011A01> ""
> +% ZANABAZAR SQUARE VOWEL SIGN UE
> +<U00011A02> ""
> +% ZANABAZAR SQUARE VOWEL SIGN U
> +<U00011A03> ""
> +% ZANABAZAR SQUARE VOWEL SIGN E
> +<U00011A04> ""
> +% ZANABAZAR SQUARE VOWEL SIGN OE
> +<U00011A05> ""
> +% ZANABAZAR SQUARE VOWEL SIGN O
> +<U00011A06> ""
> +% ZANABAZAR SQUARE VOWEL SIGN AI
> +<U00011A07> ""
> +% ZANABAZAR SQUARE VOWEL SIGN AU
> +<U00011A08> ""
> +% ZANABAZAR SQUARE VOWEL SIGN REVERSED I
> +<U00011A09> ""
> +% ZANABAZAR SQUARE VOWEL LENGTH MARK
> +<U00011A0A> ""
> +% ZANABAZAR SQUARE FINAL CONSONANT MARK
> +<U00011A33> ""
> +% ZANABAZAR SQUARE SIGN VIRAMA
> +<U00011A34> ""
> +% ZANABAZAR SQUARE SIGN CANDRABINDU
> +<U00011A35> ""
> +% ZANABAZAR SQUARE SIGN CANDRABINDU WITH ORNAMENT
> +<U00011A36> ""
> +% ZANABAZAR SQUARE SIGN CANDRA WITH ORNAMENT
> +<U00011A37> ""
> +% ZANABAZAR SQUARE SIGN ANUSVARA
> +<U00011A38> ""
> +% ZANABAZAR SQUARE SIGN VISARGA
> +<U00011A39> ""
> +% ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA
> +<U00011A3B> ""
> +% ZANABAZAR SQUARE CLUSTER-FINAL LETTER RA
> +<U00011A3C> ""
> +% ZANABAZAR SQUARE CLUSTER-FINAL LETTER LA
> +<U00011A3D> ""
> +% ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA
> +<U00011A3E> ""
> +% ZANABAZAR SQUARE SUBJOINER
> +<U00011A47> ""
> +% SOYOMBO VOWEL SIGN I
> +<U00011A51> ""
> +% SOYOMBO VOWEL SIGN UE
> +<U00011A52> ""
> +% SOYOMBO VOWEL SIGN U
> +<U00011A53> ""
> +% SOYOMBO VOWEL SIGN E
> +<U00011A54> ""
> +% SOYOMBO VOWEL SIGN O
> +<U00011A55> ""
> +% SOYOMBO VOWEL SIGN OE
> +<U00011A56> ""
> +% SOYOMBO VOWEL SIGN AI
> +<U00011A57> ""
> +% SOYOMBO VOWEL SIGN AU
> +<U00011A58> ""
> +% SOYOMBO VOWEL SIGN VOCALIC R
> +<U00011A59> ""
> +% SOYOMBO VOWEL SIGN VOCALIC L
> +<U00011A5A> ""
> +% SOYOMBO VOWEL LENGTH MARK
> +<U00011A5B> ""
> +% SOYOMBO FINAL CONSONANT SIGN G
> +<U00011A8A> ""
> +% SOYOMBO FINAL CONSONANT SIGN K
> +<U00011A8B> ""
> +% SOYOMBO FINAL CONSONANT SIGN NG
> +<U00011A8C> ""
> +% SOYOMBO FINAL CONSONANT SIGN D
> +<U00011A8D> ""
> +% SOYOMBO FINAL CONSONANT SIGN N
> +<U00011A8E> ""
> +% SOYOMBO FINAL CONSONANT SIGN B
> +<U00011A8F> ""
> +% SOYOMBO FINAL CONSONANT SIGN M
> +<U00011A90> ""
> +% SOYOMBO FINAL CONSONANT SIGN R
> +<U00011A91> ""
> +% SOYOMBO FINAL CONSONANT SIGN L
> +<U00011A92> ""
> +% SOYOMBO FINAL CONSONANT SIGN SH
> +<U00011A93> ""
> +% SOYOMBO FINAL CONSONANT SIGN S
> +<U00011A94> ""
> +% SOYOMBO FINAL CONSONANT SIGN -A
> +<U00011A95> ""
> +% SOYOMBO SIGN ANUSVARA
> +<U00011A96> ""
> +% SOYOMBO SIGN VISARGA
> +<U00011A97> ""
> +% SOYOMBO GEMINATION MARK
> +<U00011A98> ""
> +% SOYOMBO SUBJOINER
> +<U00011A99> ""
OK.
> % BHAIKSUKI VOWEL SIGN AA
> <U00011C2F> ""
> % BHAIKSUKI VOWEL SIGN I
> @@ -932,6 +1038,40 @@ translit_start
> <U00011CB5> ""
> % MARCHEN SIGN CANDRABINDU
> <U00011CB6> ""
> +% MASARAM GONDI VOWEL SIGN AA
> +<U00011D31> ""
> +% MASARAM GONDI VOWEL SIGN I
> +<U00011D32> ""
> +% MASARAM GONDI VOWEL SIGN II
> +<U00011D33> ""
> +% MASARAM GONDI VOWEL SIGN U
> +<U00011D34> ""
> +% MASARAM GONDI VOWEL SIGN UU
> +<U00011D35> ""
> +% MASARAM GONDI VOWEL SIGN VOCALIC R
> +<U00011D36> ""
> +% MASARAM GONDI VOWEL SIGN E
> +<U00011D3A> ""
> +% MASARAM GONDI VOWEL SIGN AI
> +<U00011D3C> ""
> +% MASARAM GONDI VOWEL SIGN O
> +<U00011D3D> ""
> +% MASARAM GONDI VOWEL SIGN AU
> +<U00011D3F> ""
> +% MASARAM GONDI SIGN ANUSVARA
> +<U00011D40> ""
> +% MASARAM GONDI SIGN VISARGA
> +<U00011D41> ""
> +% MASARAM GONDI SIGN NUKTA
> +<U00011D42> ""
> +% MASARAM GONDI SIGN CANDRA
> +<U00011D43> ""
> +% MASARAM GONDI SIGN HALANTA
> +<U00011D44> ""
> +% MASARAM GONDI VIRAMA
> +<U00011D45> ""
> +% MASARAM GONDI RA-KARA
> +<U00011D47> ""
OK.
> % COMBINING GREEK MUSICAL TRISEME
> <U0001D242> ""
> % COMBINING GREEK MUSICAL TETRASEME
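For anyone checking the new translit_combining entries by hand, here is a minimal sketch of how one entry is laid out. It is not the code in gen_translit_combining.py. The helper names `ucs_symbol` and `translit_remove_entry` are made up for illustration, and the 4-digit versus 8-digit symbol rule is only inferred from the lines quoted above.

```python
def ucs_symbol(code_point):
    """Return the <U....> symbol for a code point.

    Assumption taken from the entries in this patch: BMP code points
    use four hex digits, everything above U+FFFF uses eight.
    """
    if code_point < 0x10000:
        return "<U{:04X}>".format(code_point)
    return "<U{:08X}>".format(code_point)


def translit_remove_entry(code_point, name):
    """Format one entry that transliterates a character to the empty string."""
    return '% {}\n{} ""'.format(name, ucs_symbol(code_point))


if __name__ == "__main__":
    # Code points and names copied from the hunks above.
    samples = [
        (0x1DF6, "COMBINING KAVYKA ABOVE RIGHT"),
        (0x11A01, "ZANABAZAR SQUARE VOWEL SIGN I"),
    ]
    for code_point, name in samples:
        print(translit_remove_entry(code_point, name))
```

Running it prints the two entries in the same two-line form as the patch, which makes it easy to spot a wrong digit count in a symbol.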
> diff --git a/localedata/locales/translit_compat b/localedata/locales/translit_compat
> index 4b5154dd48..a5668168a5 100644
> --- a/localedata/locales/translit_compat
> +++ b/localedata/locales/translit_compat
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of compatibility characters and ligatures.
> -% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2016-06-29 for Unicode 9.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2017-06-01 for Unicode 10.0.0.
>
OK.
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_font b/localedata/locales/translit_font
> index b4bc2879a8..faffd6aaa1 100644
> --- a/localedata/locales/translit_font
> +++ b/localedata/locales/translit_font
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of font equivalents.
> -% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2016-06-29 for Unicode 9.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2017-06-01 for Unicode 10.0.0.
>
OK.
> LC_CTYPE
>
> diff --git a/localedata/locales/translit_fraction b/localedata/locales/translit_fraction
> index de258d18c4..ff6d0e22a7 100644
> --- a/localedata/locales/translit_fraction
> +++ b/localedata/locales/translit_fraction
> @@ -9,7 +9,7 @@ comment_char %
> % otherwise be governed by that license.
>
> % Transliterations of fractions.
> -% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2016-06-29 for Unicode 9.0.0.
> +% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2017-06-01 for Unicode 10.0.0.
OK.
> % The replacements have been surrounded with spaces, because fractions are
> % often preceded by a decimal number and followed by a unit or a math symbol.
>
> diff --git a/localedata/unicode-gen/DerivedCoreProperties.txt b/localedata/unicode-gen/DerivedCoreProperties.txt
> index 0db031db01..16cd9b88bf 100644
> --- a/localedata/unicode-gen/DerivedCoreProperties.txt
> +++ b/localedata/unicode-gen/DerivedCoreProperties.txt
> @@ -1,6 +1,6 @@
> -# DerivedCoreProperties-9.0.0.txt
> -# Date: 2016-06-01, 10:34:24 GMT
> -# © 2016 Unicode®, Inc.
> +# DerivedCoreProperties-10.0.0.txt
> +# Date: 2017-03-19, 00:05:15 GMT
> +# © 2017 Unicode®, Inc.
OK.
> # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
> # For terms of use, see http://www.unicode.org/terms_of_use.html
> #
> @@ -340,6 +340,7 @@ FFE9..FFEC ; Math # Sm [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS A
> 0828 ; Alphabetic # Lm SAMARITAN MODIFIER LETTER I
> 0829..082C ; Alphabetic # Mn [4] SAMARITAN VOWEL SIGN LONG I..SAMARITAN VOWEL SIGN SUKUN
> 0840..0858 ; Alphabetic # Lo [25] MANDAIC LETTER HALQA..MANDAIC LETTER AIN
> +0860..086A ; Alphabetic # Lo [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA
> 08A0..08B4 ; Alphabetic # Lo [21] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER KAF WITH DOT BELOW
> 08B6..08BD ; Alphabetic # Lo [8] ARABIC LETTER BEH WITH SMALL MEEM ABOVE..ARABIC LETTER AFRICAN NOON
> 08D4..08DF ; Alphabetic # Mn [12] ARABIC SMALL HIGH WORD AR-RUB..ARABIC SMALL HIGH WORD WAQFA
> @@ -379,6 +380,7 @@ FFE9..FFEC ; Math # Sm [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS A
> 09DF..09E1 ; Alphabetic # Lo [3] BENGALI LETTER YYA..BENGALI LETTER VOCALIC LL
> 09E2..09E3 ; Alphabetic # Mn [2] BENGALI VOWEL SIGN VOCALIC L..BENGALI VOWEL SIGN VOCALIC LL
> 09F0..09F1 ; Alphabetic # Lo [2] BENGALI LETTER RA WITH MIDDLE DIAGONAL..BENGALI LETTER RA WITH LOWER DIAGONAL
> +09FC ; Alphabetic # Lo BENGALI LETTER VEDIC ANUSVARA
> 0A01..0A02 ; Alphabetic # Mn [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI
> 0A03 ; Alphabetic # Mc GURMUKHI SIGN VISARGA
> 0A05..0A0A ; Alphabetic # Lo [6] GURMUKHI LETTER A..GURMUKHI LETTER UU
> @@ -416,6 +418,7 @@ FFE9..FFEC ; Math # Sm [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS A
> 0AE0..0AE1 ; Alphabetic # Lo [2] GUJARATI LETTER VOCALIC RR..GUJARATI LETTER VOCALIC LL
> 0AE2..0AE3 ; Alphabetic # Mn [2] GUJARATI VOWEL SIGN VOCALIC L..GUJARATI VOWEL SIGN VOCALIC LL
> 0AF9 ; Alphabetic # Lo GUJARATI LETTER ZHA
> +0AFA..0AFC ; Alphabetic # Mn [3] GUJARATI SIGN SUKUN..GUJARATI SIGN MADDAH
> 0B01 ; Alphabetic # Mn ORIYA SIGN CANDRABINDU
> 0B02..0B03 ; Alphabetic # Mc [2] ORIYA SIGN ANUSVARA..ORIYA SIGN VISARGA
> 0B05..0B0C ; Alphabetic # Lo [8] ORIYA LETTER A..ORIYA LETTER VOCALIC L
> @@ -491,7 +494,7 @@ FFE9..FFEC ; Math # Sm [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS A
> 0CE0..0CE1 ; Alphabetic # Lo [2] KANNADA LETTER VOCALIC RR..KANNADA LETTER VOCALIC LL
> 0CE2..0CE3 ; Alphabetic # Mn [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL
> 0CF1..0CF2 ; Alphabetic # Lo [2] KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMANIYA
> -0D01 ; Alphabetic # Mn MALAYALAM SIGN CANDRABINDU
> +0D00..0D01 ; Alphabetic # Mn [2] MALAYALAM SIGN COMBINING ANUSVARA ABOVE..MALAYALAM SIGN CANDRABINDU
> 0D02..0D03 ; Alphabetic # Mc [2] MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA
> 0D05..0D0C ; Alphabetic # Lo [8] MALAYALAM LETTER A..MALAYALAM LETTER VOCALIC L
> 0D0E..0D10 ; Alphabetic # Lo [3] MALAYALAM LETTER E..MALAYALAM LETTER AI
> @@ -792,12 +795,12 @@ FFE9..FFEC ; Math # Sm [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS A
> 30A1..30FA ; Alphabetic # Lo [90] KATAKANA LETTER SMALL A..KATAKANA LETTER VO
> 30FC..30FE ; Alphabetic # Lm [3] KATAKANA-HIRAGANA PROLONGED SOUND MARK..KATAKANA VOICED ITERATION MARK
> 30FF ; Alphabetic # Lo KATAKANA DIGRAPH KOTO
> -3105..312D ; Alphabetic # Lo [41] BOPOMOFO LETTER B..BOPOMOFO LETTER IH
> +3105..312E ; Alphabetic # Lo [42] BOPOMOFO LETTER B..BOPOMOFO LETTER O WITH DOT ABOVE
> 3131..318E ; Alphabetic # Lo [94] HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE
> 31A0..31BA ; Alphabetic # Lo [27] BOPOMOFO LETTER BU..BOPOMOFO LETTER ZY
> 31F0..31FF ; Alphabetic # Lo [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
> 3400..4DB5 ; Alphabetic # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
> -4E00..9FD5 ; Alphabetic # Lo [20950] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FD5
> +4E00..9FEA ; Alphabetic # Lo [20971] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEA
> A000..A014 ; Alphabetic # Lo [21] YI SYLLABLE IT..YI SYLLABLE E
> A015 ; Alphabetic # Lm YI SYLLABLE WU
> A016..A48C ; Alphabetic # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
> @@ -955,7 +958,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
> 10280..1029C ; Alphabetic # Lo [29] LYCIAN LETTER A..LYCIAN LETTER X
> 102A0..102D0 ; Alphabetic # Lo [49] CARIAN LETTER A..CARIAN LETTER UUU3
> 10300..1031F ; Alphabetic # Lo [32] OLD ITALIC LETTER A..OLD ITALIC LETTER ESS
> -10330..10340 ; Alphabetic # Lo [17] GOTHIC LETTER AHSA..GOTHIC LETTER PAIRTHRA
> +1032D..10340 ; Alphabetic # Lo [20] OLD ITALIC LETTER YE..GOTHIC LETTER PAIRTHRA
> 10341 ; Alphabetic # Nl GOTHIC LETTER NINETY
> 10342..10349 ; Alphabetic # Lo [8] GOTHIC LETTER RAIDA..GOTHIC LETTER OTHAL
> 1034A ; Alphabetic # Nl GOTHIC LETTER NINE HUNDRED
> @@ -1115,6 +1118,23 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
> 11727..1172A ; Alphabetic # Mn [4] AHOM VOWEL SIGN AW..AHOM VOWEL SIGN AM
> 118A0..118DF ; Alphabetic # L& [64] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI SMALL LETTER VIYO
> 118FF ; Alphabetic # Lo WARANG CITI OM
> +11A00 ; Alphabetic # Lo ZANABAZAR SQUARE LETTER A
> +11A01..11A06 ; Alphabetic # Mn [6] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL SIGN O
> +11A07..11A08 ; Alphabetic # Mc [2] ZANABAZAR SQUARE VOWEL SIGN AI..ZANABAZAR SQUARE VOWEL SIGN AU
> +11A09..11A0A ; Alphabetic # Mn [2] ZANABAZAR SQUARE VOWEL SIGN REVERSED I..ZANABAZAR SQUARE VOWEL LENGTH MARK
> +11A0B..11A32 ; Alphabetic # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
> +11A35..11A38 ; Alphabetic # Mn [4] ZANABAZAR SQUARE SIGN CANDRABINDU..ZANABAZAR SQUARE SIGN ANUSVARA
> +11A39 ; Alphabetic # Mc ZANABAZAR SQUARE SIGN VISARGA
> +11A3A ; Alphabetic # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
> +11A3B..11A3E ; Alphabetic # Mn [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA
> +11A50 ; Alphabetic # Lo SOYOMBO LETTER A
> +11A51..11A56 ; Alphabetic # Mn [6] SOYOMBO VOWEL SIGN I..SOYOMBO VOWEL SIGN OE
> +11A57..11A58 ; Alphabetic # Mc [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU
> +11A59..11A5B ; Alphabetic # Mn [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK
> +11A5C..11A83 ; Alphabetic # Lo [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA
> +11A86..11A89 ; Alphabetic # Lo [4] SOYOMBO CLUSTER-INITIAL LETTER RA..SOYOMBO CLUSTER-INITIAL LETTER SA
> +11A8A..11A96 ; Alphabetic # Mn [13] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO SIGN ANUSVARA
> +11A97 ; Alphabetic # Mc SOYOMBO SIGN VISARGA
> 11AC0..11AF8 ; Alphabetic # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
> 11C00..11C08 ; Alphabetic # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
> 11C0A..11C2E ; Alphabetic # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
> @@ -1131,6 +1151,16 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
> 11CB2..11CB3 ; Alphabetic # Mn [2] MARCHEN VOWEL SIGN U..MARCHEN VOWEL SIGN E
> 11CB4 ; Alphabetic # Mc MARCHEN VOWEL SIGN O
> 11CB5..11CB6 ; Alphabetic # Mn [2] MARCHEN SIGN ANUSVARA..MARCHEN SIGN CANDRABINDU
> +11D00..11D06 ; Alphabetic # Lo [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E
> +11D08..11D09 ; Alphabetic # Lo [2] MASARAM GONDI LETTER AI..MASARAM GONDI LETTER O
> +11D0B..11D30 ; Alphabetic # Lo [38] MASARAM GONDI LETTER AU..MASARAM GONDI LETTER TRA
> +11D31..11D36 ; Alphabetic # Mn [6] MASARAM GONDI VOWEL SIGN AA..MASARAM GONDI VOWEL SIGN VOCALIC R
> +11D3A ; Alphabetic # Mn MASARAM GONDI VOWEL SIGN E
> +11D3C..11D3D ; Alphabetic # Mn [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O
> +11D3F..11D41 ; Alphabetic # Mn [3] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI SIGN VISARGA
> +11D43 ; Alphabetic # Mn MASARAM GONDI SIGN CANDRA
> +11D46 ; Alphabetic # Lo MASARAM GONDI REPHA
> +11D47 ; Alphabetic # Mn MASARAM GONDI RA-KARA
> 12000..12399 ; Alphabetic # Lo [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U
> 12400..1246E ; Alphabetic # Nl [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM
> 12480..12543 ; Alphabetic # Lo [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
> @@ -1148,10 +1178,11 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
> 16F50 ; Alphabetic # Lo MIAO LETTER NASALIZATION
> 16F51..16F7E ; Alphabetic # Mc [46] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN NG
> 16F93..16F9F ; Alphabetic # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
> -16FE0 ; Alphabetic # Lm TANGUT ITERATION MARK
> +16FE0..16FE1 ; Alphabetic # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
> 17000..187EC ; Alphabetic # Lo [6125] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187EC
> 18800..18AF2 ; Alphabetic # Lo [755] TANGUT COMPONENT-001..TANGUT COMPONENT-755
> -1B000..1B001 ; Alphabetic # Lo [2] KATAKANA LETTER ARCHAIC E..HIRAGANA LETTER ARCHAIC YE
> +1B000..1B11E ; Alphabetic # Lo [287] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER N-MU-MO-2
> +1B170..1B2FB ; Alphabetic # Lo [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
> 1BC00..1BC6A ; Alphabetic # Lo [107] DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M
> 1BC70..1BC7C ; Alphabetic # Lo [13] DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOYAN AFFIX ATTACHED TANGENT HOOK
> 1BC80..1BC88 ; Alphabetic # Lo [9] DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIGH VERTICAL
> @@ -1235,9 +1266,10 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
> 2A700..2B734 ; Alphabetic # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
> 2B740..2B81D ; Alphabetic # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
> 2B820..2CEA1 ; Alphabetic # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
> +2CEB0..2EBE0 ; Alphabetic # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
> 2F800..2FA1D ; Alphabetic # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
>
OK.
> -# Total code points: 118240
> +# Total code points: 126629
>
> # ================================================
>
> @@ -2798,6 +2830,7 @@ FF41..FF5A ; Cased # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
> 0AC7..0AC8 ; Case_Ignorable # Mn [2] GUJARATI VOWEL SIGN E..GUJARATI VOWEL SIGN AI
> 0ACD ; Case_Ignorable # Mn GUJARATI SIGN VIRAMA
> 0AE2..0AE3 ; Case_Ignorable # Mn [2] GUJARATI VOWEL SIGN VOCALIC L..GUJARATI VOWEL SIGN VOCALIC LL
> +0AFA..0AFF ; Case_Ignorable # Mn [6] GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE NUKTA ABOVE
> 0B01 ; Case_Ignorable # Mn ORIYA SIGN CANDRABINDU
> 0B3C ; Case_Ignorable # Mn ORIYA SIGN NUKTA
> 0B3F ; Case_Ignorable # Mn ORIYA VOWEL SIGN I
> @@ -2820,7 +2853,8 @@ FF41..FF5A ; Cased # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
> 0CC6 ; Case_Ignorable # Mn KANNADA VOWEL SIGN E
> 0CCC..0CCD ; Case_Ignorable # Mn [2] KANNADA VOWEL SIGN AU..KANNADA SIGN VIRAMA
> 0CE2..0CE3 ; Case_Ignorable # Mn [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL
> -0D01 ; Case_Ignorable # Mn MALAYALAM SIGN CANDRABINDU
> +0D00..0D01 ; Case_Ignorable # Mn [2] MALAYALAM SIGN COMBINING ANUSVARA ABOVE..MALAYALAM SIGN CANDRABINDU
> +0D3B..0D3C ; Case_Ignorable # Mn [2] MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM SIGN CIRCULAR VIRAMA
> 0D41..0D44 ; Case_Ignorable # Mn [4] MALAYALAM VOWEL SIGN U..MALAYALAM VOWEL SIGN VOCALIC RR
> 0D4D ; Case_Ignorable # Mn MALAYALAM SIGN VIRAMA
> 0D62..0D63 ; Case_Ignorable # Mn [2] MALAYALAM VOWEL SIGN VOCALIC L..MALAYALAM VOWEL SIGN VOCALIC LL
> @@ -2916,7 +2950,7 @@ FF41..FF5A ; Cased # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
> 1D2C..1D6A ; Case_Ignorable # Lm [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI
> 1D78 ; Case_Ignorable # Lm MODIFIER LETTER CYRILLIC EN
> 1D9B..1DBF ; Case_Ignorable # Lm [37] MODIFIER LETTER SMALL TURNED ALPHA..MODIFIER LETTER SMALL THETA
> -1DC0..1DF5 ; Case_Ignorable # Mn [54] COMBINING DOTTED GRAVE ACCENT..COMBINING UP TACK ABOVE
> +1DC0..1DF9 ; Case_Ignorable # Mn [58] COMBINING DOTTED GRAVE ACCENT..COMBINING WIDE INVERTED BRIDGE BELOW
> 1DFB..1DFF ; Case_Ignorable # Mn [5] COMBINING DELETION MARK..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
> 1FBD ; Case_Ignorable # Sk GREEK KORONIS
> 1FBF..1FC1 ; Case_Ignorable # Sk [3] GREEK PSILI..GREEK DIALYTIKA AND PERISPOMENI
> @@ -3078,6 +3112,15 @@ FFF9..FFFB ; Case_Ignorable # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLI
> 1171D..1171F ; Case_Ignorable # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
> 11722..11725 ; Case_Ignorable # Mn [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU
> 11727..1172B ; Case_Ignorable # Mn [5] AHOM VOWEL SIGN AW..AHOM SIGN KILLER
> +11A01..11A06 ; Case_Ignorable # Mn [6] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL SIGN O
> +11A09..11A0A ; Case_Ignorable # Mn [2] ZANABAZAR SQUARE VOWEL SIGN REVERSED I..ZANABAZAR SQUARE VOWEL LENGTH MARK
> +11A33..11A38 ; Case_Ignorable # Mn [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA
> +11A3B..11A3E ; Case_Ignorable # Mn [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA
> +11A47 ; Case_Ignorable # Mn ZANABAZAR SQUARE SUBJOINER
> +11A51..11A56 ; Case_Ignorable # Mn [6] SOYOMBO VOWEL SIGN I..SOYOMBO VOWEL SIGN OE
> +11A59..11A5B ; Case_Ignorable # Mn [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK
> +11A8A..11A96 ; Case_Ignorable # Mn [13] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO SIGN ANUSVARA
> +11A98..11A99 ; Case_Ignorable # Mn [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER
> 11C30..11C36 ; Case_Ignorable # Mn [7] BHAIKSUKI VOWEL SIGN I..BHAIKSUKI VOWEL SIGN VOCALIC L
> 11C38..11C3D ; Case_Ignorable # Mn [6] BHAIKSUKI VOWEL SIGN E..BHAIKSUKI SIGN ANUSVARA
> 11C3F ; Case_Ignorable # Mn BHAIKSUKI SIGN VIRAMA
> @@ -3085,12 +3128,17 @@ FFF9..FFFB ; Case_Ignorable # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLI
> 11CAA..11CB0 ; Case_Ignorable # Mn [7] MARCHEN SUBJOINED LETTER RA..MARCHEN VOWEL SIGN AA
> 11CB2..11CB3 ; Case_Ignorable # Mn [2] MARCHEN VOWEL SIGN U..MARCHEN VOWEL SIGN E
> 11CB5..11CB6 ; Case_Ignorable # Mn [2] MARCHEN SIGN ANUSVARA..MARCHEN SIGN CANDRABINDU
> +11D31..11D36 ; Case_Ignorable # Mn [6] MASARAM GONDI VOWEL SIGN AA..MASARAM GONDI VOWEL SIGN VOCALIC R
> +11D3A ; Case_Ignorable # Mn MASARAM GONDI VOWEL SIGN E
> +11D3C..11D3D ; Case_Ignorable # Mn [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O
> +11D3F..11D45 ; Case_Ignorable # Mn [7] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI VIRAMA
> +11D47 ; Case_Ignorable # Mn MASARAM GONDI RA-KARA
> 16AF0..16AF4 ; Case_Ignorable # Mn [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE
> 16B30..16B36 ; Case_Ignorable # Mn [7] PAHAWH HMONG MARK CIM TUB..PAHAWH HMONG MARK CIM TAUM
> 16B40..16B43 ; Case_Ignorable # Lm [4] PAHAWH HMONG SIGN VOS SEEV..PAHAWH HMONG SIGN IB YAM
> 16F8F..16F92 ; Case_Ignorable # Mn [4] MIAO TONE RIGHT..MIAO TONE BELOW
> 16F93..16F9F ; Case_Ignorable # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
> -16FE0 ; Case_Ignorable # Lm TANGUT ITERATION MARK
> +16FE0..16FE1 ; Case_Ignorable # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
> 1BC9D..1BC9E ; Case_Ignorable # Mn [2] DUPLOYAN THICK LETTER SELECTOR..DUPLOYAN DOUBLE MARK
> 1BCA0..1BCA3 ; Case_Ignorable # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
> 1D167..1D169 ; Case_Ignorable # Mn [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3
> @@ -3117,7 +3165,7 @@ E0001 ; Case_Ignorable # Cf LANGUAGE TAG
> E0020..E007F ; Case_Ignorable # Cf [96] TAG SPACE..CANCEL TAG
> E0100..E01EF ; Case_Ignorable # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
>
> -# Total code points: 2240
> +# Total code points: 2314
OK.
>
> # ================================================
>
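The bracketed counts in DerivedCoreProperties.txt can be cross-checked against the range bounds. The following is a rough sketch of such a check, assuming the line layout seen in the hunks above (code point or range, property, general category, optional count). The regular expression and the function names are mine, not part of the glibc scripts.

```python
import re

# first[..last] ; Property # Category [count] NAME...
LINE_RE = re.compile(
    r"^(?P<first>[0-9A-F]{4,6})(?:\.\.(?P<last>[0-9A-F]{4,6}))?"
    r"\s*;\s*(?P<prop>\w+)\s*#\s*(?P<cat>\S+)\s+(?:\[(?P<count>\d+)\]\s+)?"
)


def parse_line(line):
    """Return (first, last, property, category, count), or None for other lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    first = int(match.group("first"), 16)
    last = int(match.group("last"), 16) if match.group("last") else first
    count = int(match.group("count")) if match.group("count") else None
    return first, last, match.group("prop"), match.group("cat"), count


def range_is_consistent(line):
    """Check that the bracketed count equals the size of the range.

    A line without a bracketed count is expected to cover one code point.
    """
    parsed = parse_line(line)
    if parsed is None:
        raise ValueError("not a property line: {!r}".format(line))
    first, last, _prop, _cat, count = parsed
    expected = count if count is not None else 1
    return last - first + 1 == expected


if __name__ == "__main__":
    # Line copied from the Alphabetic hunk above.
    line = ("4E00..9FEA ; Alphabetic # Lo [20971] "
            "CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEA")
    print(parse_line(line), range_is_consistent(line))
```

Comment lines such as the "Total code points" summary do not match the pattern, so `parse_line` returns None for them and they can be skipped when iterating over the file.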
> @@ -5763,6 +5811,7 @@ FF41..FF5A ; Changes_When_Casemapped # L& [26] FULLWIDTH LATIN SMALL LETTER
> 0824 ; ID_Start # Lm SAMARITAN MODIFIER LETTER SHORT A
> 0828 ; ID_Start # Lm SAMARITAN MODIFIER LETTER I
> 0840..0858 ; ID_Start # Lo [25] MANDAIC LETTER HALQA..MANDAIC LETTER AIN
> +0860..086A ; ID_Start # Lo [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA
> 08A0..08B4 ; ID_Start # Lo [21] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER KAF WITH DOT BELOW
> 08B6..08BD ; ID_Start # Lo [8] ARABIC LETTER BEH WITH SMALL MEEM ABOVE..ARABIC LETTER AFRICAN NOON
> 0904..0939 ; ID_Start # Lo [54] DEVANAGARI LETTER SHORT A..DEVANAGARI LETTER HA
> @@ -5782,6 +5831,7 @@ FF41..FF5A ; Changes_When_Casemapped # L& [26] FULLWIDTH LATIN SMALL LETTER
> 09DC..09DD ; ID_Start # Lo [2] BENGALI LETTER RRA..BENGALI LETTER RHA
> 09DF..09E1 ; ID_Start # Lo [3] BENGALI LETTER YYA..BENGALI LETTER VOCALIC LL
> 09F0..09F1 ; ID_Start # Lo [2] BENGALI LETTER RA WITH MIDDLE DIAGONAL..BENGALI LETTER RA WITH LOWER DIAGONAL
> +09FC ; ID_Start # Lo BENGALI LETTER VEDIC ANUSVARA
> 0A05..0A0A ; ID_Start # Lo [6] GURMUKHI LETTER A..GURMUKHI LETTER UU
> 0A0F..0A10 ; ID_Start # Lo [2] GURMUKHI LETTER EE..GURMUKHI LETTER AI
> 0A13..0A28 ; ID_Start # Lo [22] GURMUKHI LETTER OO..GURMUKHI LETTER NA
> @@ -6039,12 +6089,12 @@ FF41..FF5A ; Changes_When_Casemapped # L& [26] FULLWIDTH LATIN SMALL LETTER
> 30A1..30FA ; ID_Start # Lo [90] KATAKANA LETTER SMALL A..KATAKANA LETTER VO
> 30FC..30FE ; ID_Start # Lm [3] KATAKANA-HIRAGANA PROLONGED SOUND MARK..KATAKANA VOICED ITERATION MARK
> 30FF ; ID_Start # Lo KATAKANA DIGRAPH KOTO
> -3105..312D ; ID_Start # Lo [41] BOPOMOFO LETTER B..BOPOMOFO LETTER IH
> +3105..312E ; ID_Start # Lo [42] BOPOMOFO LETTER B..BOPOMOFO LETTER O WITH DOT ABOVE
> 3131..318E ; ID_Start # Lo [94] HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE
> 31A0..31BA ; ID_Start # Lo [27] BOPOMOFO LETTER BU..BOPOMOFO LETTER ZY
> 31F0..31FF ; ID_Start # Lo [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
> 3400..4DB5 ; ID_Start # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
> -4E00..9FD5 ; ID_Start # Lo [20950] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FD5
> +4E00..9FEA ; ID_Start # Lo [20971] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEA
> A000..A014 ; ID_Start # Lo [21] YI SYLLABLE IT..YI SYLLABLE E
> A015 ; ID_Start # Lm YI SYLLABLE WU
> A016..A48C ; ID_Start # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
> @@ -6162,7 +6212,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 10280..1029C ; ID_Start # Lo [29] LYCIAN LETTER A..LYCIAN LETTER X
> 102A0..102D0 ; ID_Start # Lo [49] CARIAN LETTER A..CARIAN LETTER UUU3
> 10300..1031F ; ID_Start # Lo [32] OLD ITALIC LETTER A..OLD ITALIC LETTER ESS
> -10330..10340 ; ID_Start # Lo [17] GOTHIC LETTER AHSA..GOTHIC LETTER PAIRTHRA
> +1032D..10340 ; ID_Start # Lo [20] OLD ITALIC LETTER YE..GOTHIC LETTER PAIRTHRA
> 10341 ; ID_Start # Nl GOTHIC LETTER NINETY
> 10342..10349 ; ID_Start # Lo [8] GOTHIC LETTER RAIDA..GOTHIC LETTER OTHAL
> 1034A ; ID_Start # Nl GOTHIC LETTER NINE HUNDRED
> @@ -6249,11 +6299,21 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 11700..11719 ; ID_Start # Lo [26] AHOM LETTER KA..AHOM LETTER JHA
> 118A0..118DF ; ID_Start # L& [64] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI SMALL LETTER VIYO
> 118FF ; ID_Start # Lo WARANG CITI OM
> +11A00 ; ID_Start # Lo ZANABAZAR SQUARE LETTER A
> +11A0B..11A32 ; ID_Start # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
> +11A3A ; ID_Start # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
> +11A50 ; ID_Start # Lo SOYOMBO LETTER A
> +11A5C..11A83 ; ID_Start # Lo [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA
> +11A86..11A89 ; ID_Start # Lo [4] SOYOMBO CLUSTER-INITIAL LETTER RA..SOYOMBO CLUSTER-INITIAL LETTER SA
> 11AC0..11AF8 ; ID_Start # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
> 11C00..11C08 ; ID_Start # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
> 11C0A..11C2E ; ID_Start # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
> 11C40 ; ID_Start # Lo BHAIKSUKI SIGN AVAGRAHA
> 11C72..11C8F ; ID_Start # Lo [30] MARCHEN LETTER KA..MARCHEN LETTER A
> +11D00..11D06 ; ID_Start # Lo [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E
> +11D08..11D09 ; ID_Start # Lo [2] MASARAM GONDI LETTER AI..MASARAM GONDI LETTER O
> +11D0B..11D30 ; ID_Start # Lo [38] MASARAM GONDI LETTER AU..MASARAM GONDI LETTER TRA
> +11D46 ; ID_Start # Lo MASARAM GONDI REPHA
> 12000..12399 ; ID_Start # Lo [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U
> 12400..1246E ; ID_Start # Nl [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM
> 12480..12543 ; ID_Start # Lo [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
> @@ -6269,10 +6329,11 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 16F00..16F44 ; ID_Start # Lo [69] MIAO LETTER PA..MIAO LETTER HHA
> 16F50 ; ID_Start # Lo MIAO LETTER NASALIZATION
> 16F93..16F9F ; ID_Start # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
> -16FE0 ; ID_Start # Lm TANGUT ITERATION MARK
> +16FE0..16FE1 ; ID_Start # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
> 17000..187EC ; ID_Start # Lo [6125] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187EC
> 18800..18AF2 ; ID_Start # Lo [755] TANGUT COMPONENT-001..TANGUT COMPONENT-755
> -1B000..1B001 ; ID_Start # Lo [2] KATAKANA LETTER ARCHAIC E..HIRAGANA LETTER ARCHAIC YE
> +1B000..1B11E ; ID_Start # Lo [287] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER N-MU-MO-2
> +1B170..1B2FB ; ID_Start # Lo [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
> 1BC00..1BC6A ; ID_Start # Lo [107] DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M
> 1BC70..1BC7C ; ID_Start # Lo [13] DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOYAN AFFIX ATTACHED TANGENT HOOK
> 1BC80..1BC88 ; ID_Start # Lo [9] DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIGH VERTICAL
> @@ -6346,9 +6407,10 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 2A700..2B734 ; ID_Start # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
> 2B740..2B81D ; ID_Start # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
> 2B820..2CEA1 ; ID_Start # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
> +2CEB0..2EBE0 ; ID_Start # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
> 2F800..2FA1D ; ID_Start # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
>
> -# Total code points: 117007
> +# Total code points: 125334
OK.
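
The bracketed counts in the range lines above can be cross-checked mechanically. The following is a minimal sketch and not part of the patch or of the glibc generator scripts. It assumes each input line is a raw property line with the quote and diff markers ("> ", "+", "-") already removed. The names LINE_RE, parse_line and count_mismatches are made up for illustration.

```python
import re

# Matches range lines such as
#   "3105..312E ; ID_Start # Lo [42] BOPOMOFO LETTER B..BOPOMOFO LETTER O WITH DOT ABOVE"
# and single code point lines such as
#   "09FC ; ID_Start # Lo BENGALI LETTER VEDIC ANUSVARA"
LINE_RE = re.compile(
    r"^(?P<first>[0-9A-F]{4,6})(?:\.\.(?P<last>[0-9A-F]{4,6}))?"
    r"\s*;\s*(?P<prop>\w+)\s*#\s*(?P<gc>\S+)\s+(?:\[(?P<count>\d+)\]\s+)?"
)


def parse_line(line):
    """Return (first, last, property, declared_count) or None for non-data lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    first = int(m.group("first"), 16)
    # A line without ".." describes exactly one code point.
    last = int(m.group("last") or m.group("first"), 16)
    declared = int(m.group("count")) if m.group("count") else 1
    return first, last, m.group("prop"), declared


def count_mismatches(lines):
    """Return the lines whose bracketed count differs from the range size."""
    bad = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # hunk headers, comments, blank lines
        first, last, _prop, declared = parsed
        if last - first + 1 != declared:
            bad.append(line)
    return bad
```

Fed the added ID_Start lines quoted above (for example 4E00..9FEA with [20971] and 2CEB0..2EBE0 with [7473]), count_mismatches returns an empty list. It does not verify the "Total code points" figures, since only part of the file is visible in the diff context.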
>
> # ================================================
>
> @@ -6451,6 +6513,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 0829..082D ; ID_Continue # Mn [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
> 0840..0858 ; ID_Continue # Lo [25] MANDAIC LETTER HALQA..MANDAIC LETTER AIN
> 0859..085B ; ID_Continue # Mn [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK
> +0860..086A ; ID_Continue # Lo [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA
> 08A0..08B4 ; ID_Continue # Lo [21] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER KAF WITH DOT BELOW
> 08B6..08BD ; ID_Continue # Lo [8] ARABIC LETTER BEH WITH SMALL MEEM ABOVE..ARABIC LETTER AFRICAN NOON
> 08D4..08E1 ; ID_Continue # Mn [14] ARABIC SMALL HIGH WORD AR-RUB..ARABIC SMALL HIGH SIGN SAFHA
> @@ -6495,6 +6558,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 09E2..09E3 ; ID_Continue # Mn [2] BENGALI VOWEL SIGN VOCALIC L..BENGALI VOWEL SIGN VOCALIC LL
> 09E6..09EF ; ID_Continue # Nd [10] BENGALI DIGIT ZERO..BENGALI DIGIT NINE
> 09F0..09F1 ; ID_Continue # Lo [2] BENGALI LETTER RA WITH MIDDLE DIAGONAL..BENGALI LETTER RA WITH LOWER DIAGONAL
> +09FC ; ID_Continue # Lo BENGALI LETTER VEDIC ANUSVARA
> 0A01..0A02 ; ID_Continue # Mn [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI
> 0A03 ; ID_Continue # Mc GURMUKHI SIGN VISARGA
> 0A05..0A0A ; ID_Continue # Lo [6] GURMUKHI LETTER A..GURMUKHI LETTER UU
> @@ -6537,6 +6601,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 0AE2..0AE3 ; ID_Continue # Mn [2] GUJARATI VOWEL SIGN VOCALIC L..GUJARATI VOWEL SIGN VOCALIC LL
> 0AE6..0AEF ; ID_Continue # Nd [10] GUJARATI DIGIT ZERO..GUJARATI DIGIT NINE
> 0AF9 ; ID_Continue # Lo GUJARATI LETTER ZHA
> +0AFA..0AFF ; ID_Continue # Mn [6] GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE NUKTA ABOVE
> 0B01 ; ID_Continue # Mn ORIYA SIGN CANDRABINDU
> 0B02..0B03 ; ID_Continue # Mc [2] ORIYA SIGN ANUSVARA..ORIYA SIGN VISARGA
> 0B05..0B0C ; ID_Continue # Lo [8] ORIYA LETTER A..ORIYA LETTER VOCALIC L
> @@ -6620,11 +6685,12 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 0CE2..0CE3 ; ID_Continue # Mn [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL
> 0CE6..0CEF ; ID_Continue # Nd [10] KANNADA DIGIT ZERO..KANNADA DIGIT NINE
> 0CF1..0CF2 ; ID_Continue # Lo [2] KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMANIYA
> -0D01 ; ID_Continue # Mn MALAYALAM SIGN CANDRABINDU
> +0D00..0D01 ; ID_Continue # Mn [2] MALAYALAM SIGN COMBINING ANUSVARA ABOVE..MALAYALAM SIGN CANDRABINDU
> 0D02..0D03 ; ID_Continue # Mc [2] MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA
> 0D05..0D0C ; ID_Continue # Lo [8] MALAYALAM LETTER A..MALAYALAM LETTER VOCALIC L
> 0D0E..0D10 ; ID_Continue # Lo [3] MALAYALAM LETTER E..MALAYALAM LETTER AI
> 0D12..0D3A ; ID_Continue # Lo [41] MALAYALAM LETTER O..MALAYALAM LETTER TTTA
> +0D3B..0D3C ; ID_Continue # Mn [2] MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM SIGN CIRCULAR VIRAMA
> 0D3D ; ID_Continue # Lo MALAYALAM SIGN AVAGRAHA
> 0D3E..0D40 ; ID_Continue # Mc [3] MALAYALAM VOWEL SIGN AA..MALAYALAM VOWEL SIGN II
> 0D41..0D44 ; ID_Continue # Mn [4] MALAYALAM VOWEL SIGN U..MALAYALAM VOWEL SIGN VOCALIC RR
> @@ -6888,6 +6954,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 1CF2..1CF3 ; ID_Continue # Mc [2] VEDIC SIGN ARDHAVISARGA..VEDIC SIGN ROTATED ARDHAVISARGA
> 1CF4 ; ID_Continue # Mn VEDIC TONE CANDRA ABOVE
> 1CF5..1CF6 ; ID_Continue # Lo [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA
> +1CF7 ; ID_Continue # Mc VEDIC SIGN ATIKRAMA
> 1CF8..1CF9 ; ID_Continue # Mn [2] VEDIC TONE RING ABOVE..VEDIC TONE DOUBLE RING ABOVE
> 1D00..1D2B ; ID_Continue # L& [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL
> 1D2C..1D6A ; ID_Continue # Lm [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI
> @@ -6895,7 +6962,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 1D78 ; ID_Continue # Lm MODIFIER LETTER CYRILLIC EN
> 1D79..1D9A ; ID_Continue # L& [34] LATIN SMALL LETTER INSULAR G..LATIN SMALL LETTER EZH WITH RETROFLEX HOOK
> 1D9B..1DBF ; ID_Continue # Lm [37] MODIFIER LETTER SMALL TURNED ALPHA..MODIFIER LETTER SMALL THETA
> -1DC0..1DF5 ; ID_Continue # Mn [54] COMBINING DOTTED GRAVE ACCENT..COMBINING UP TACK ABOVE
> +1DC0..1DF9 ; ID_Continue # Mn [58] COMBINING DOTTED GRAVE ACCENT..COMBINING WIDE INVERTED BRIDGE BELOW
> 1DFB..1DFF ; ID_Continue # Mn [5] COMBINING DELETION MARK..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
> 1E00..1F15 ; ID_Continue # L& [278] LATIN CAPITAL LETTER A WITH RING BELOW..GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA
> 1F18..1F1D ; ID_Continue # L& [6] GREEK CAPITAL LETTER EPSILON WITH PSILI..GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA
> @@ -6986,12 +7053,12 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
> 30A1..30FA ; ID_Continue # Lo [90] KATAKANA LETTER SMALL A..KATAKANA LETTER VO
> 30FC..30FE ; ID_Continue # Lm [3] KATAKANA-HIRAGANA PROLONGED SOUND MARK..KATAKANA VOICED ITERATION MARK
> 30FF ; ID_Continue # Lo KATAKANA DIGRAPH KOTO
> -3105..312D ; ID_Continue # Lo [41] BOPOMOFO LETTER B..BOPOMOFO LETTER IH
> +3105..312E ; ID_Continue # Lo [42] BOPOMOFO LETTER B..BOPOMOFO LETTER O WITH DOT ABOVE
> 3131..318E ; ID_Continue # Lo [94] HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE
> 31A0..31BA ; ID_Continue # Lo [27] BOPOMOFO LETTER BU..BOPOMOFO LETTER ZY
> 31F0..31FF ; ID_Continue # Lo [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
> 3400..4DB5 ; ID_Continue # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
> -4E00..9FD5 ; ID_Continue # Lo [20950] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FD5
> +4E00..9FEA ; ID_Continue # Lo [20971] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEA
> A000..A014 ; ID_Continue # Lo [21] YI SYLLABLE IT..YI SYLLABLE E
> A015 ; ID_Continue # Lm YI SYLLABLE WU
> A016..A48C ; ID_Continue # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
> @@ -7179,7 +7246,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
> 102A0..102D0 ; ID_Continue # Lo [49] CARIAN LETTER A..CARIAN LETTER UUU3
> 102E0 ; ID_Continue # Mn COPTIC EPACT THOUSANDS MARK
> 10300..1031F ; ID_Continue # Lo [32] OLD ITALIC LETTER A..OLD ITALIC LETTER ESS
> -10330..10340 ; ID_Continue # Lo [17] GOTHIC LETTER AHSA..GOTHIC LETTER PAIRTHRA
> +1032D..10340 ; ID_Continue # Lo [20] OLD ITALIC LETTER YE..GOTHIC LETTER PAIRTHRA
> 10341 ; ID_Continue # Nl GOTHIC LETTER NINETY
> 10342..10349 ; ID_Continue # Lo [8] GOTHIC LETTER RAIDA..GOTHIC LETTER OTHAL
> 1034A ; ID_Continue # Nl GOTHIC LETTER NINE HUNDRED
> @@ -7367,6 +7434,25 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
> 118A0..118DF ; ID_Continue # L& [64] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI SMALL LETTER VIYO
> 118E0..118E9 ; ID_Continue # Nd [10] WARANG CITI DIGIT ZERO..WARANG CITI DIGIT NINE
> 118FF ; ID_Continue # Lo WARANG CITI OM
> +11A00 ; ID_Continue # Lo ZANABAZAR SQUARE LETTER A
> +11A01..11A06 ; ID_Continue # Mn [6] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL SIGN O
> +11A07..11A08 ; ID_Continue # Mc [2] ZANABAZAR SQUARE VOWEL SIGN AI..ZANABAZAR SQUARE VOWEL SIGN AU
> +11A09..11A0A ; ID_Continue # Mn [2] ZANABAZAR SQUARE VOWEL SIGN REVERSED I..ZANABAZAR SQUARE VOWEL LENGTH MARK
> +11A0B..11A32 ; ID_Continue # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
> +11A33..11A38 ; ID_Continue # Mn [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA
> +11A39 ; ID_Continue # Mc ZANABAZAR SQUARE SIGN VISARGA
> +11A3A ; ID_Continue # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
> +11A3B..11A3E ; ID_Continue # Mn [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA
> +11A47 ; ID_Continue # Mn ZANABAZAR SQUARE SUBJOINER
> +11A50 ; ID_Continue # Lo SOYOMBO LETTER A
> +11A51..11A56 ; ID_Continue # Mn [6] SOYOMBO VOWEL SIGN I..SOYOMBO VOWEL SIGN OE
> +11A57..11A58 ; ID_Continue # Mc [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU
> +11A59..11A5B ; ID_Continue # Mn [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK
> +11A5C..11A83 ; ID_Continue # Lo [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA
> +11A86..11A89 ; ID_Continue # Lo [4] SOYOMBO CLUSTER-INITIAL LETTER RA..SOYOMBO CLUSTER-INITIAL LETTER SA
> +11A8A..11A96 ; ID_Continue # Mn [13] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO SIGN ANUSVARA
> +11A97 ; ID_Continue # Mc SOYOMBO SIGN VISARGA
> +11A98..11A99 ; ID_Continue # Mn [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER
> 11AC0..11AF8 ; ID_Continue # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
> 11C00..11C08 ; ID_Continue # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
> 11C0A..11C2E ; ID_Continue # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
> @@ -7385,6 +7471,16 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
> 11CB2..11CB3 ; ID_Continue # Mn [2] MARCHEN VOWEL SIGN U..MARCHEN VOWEL SIGN E
> 11CB4 ; ID_Continue # Mc MARCHEN VOWEL SIGN O
> 11CB5..11CB6 ; ID_Continue # Mn [2] MARCHEN SIGN ANUSVARA..MARCHEN SIGN CANDRABINDU
> +11D00..11D06 ; ID_Continue # Lo [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E
> +11D08..11D09 ; ID_Continue # Lo [2] MASARAM GONDI LETTER AI..MASARAM GONDI LETTER O
> +11D0B..11D30 ; ID_Continue # Lo [38] MASARAM GONDI LETTER AU..MASARAM GONDI LETTER TRA
> +11D31..11D36 ; ID_Continue # Mn [6] MASARAM GONDI VOWEL SIGN AA..MASARAM GONDI VOWEL SIGN VOCALIC R
> +11D3A ; ID_Continue # Mn MASARAM GONDI VOWEL SIGN E
> +11D3C..11D3D ; ID_Continue # Mn [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O
> +11D3F..11D45 ; ID_Continue # Mn [7] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI VIRAMA
> +11D46 ; ID_Continue # Lo MASARAM GONDI REPHA
> +11D47 ; ID_Continue # Mn MASARAM GONDI RA-KARA
> +11D50..11D59 ; ID_Continue # Nd [10] MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT NINE
> 12000..12399 ; ID_Continue # Lo [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U
> 12400..1246E ; ID_Continue # Nl [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM
> 12480..12543 ; ID_Continue # Lo [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
> @@ -7406,10 +7502,11 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
> 16F51..16F7E ; ID_Continue # Mc [46] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN NG
> 16F8F..16F92 ; ID_Continue # Mn [4] MIAO TONE RIGHT..MIAO TONE BELOW
> 16F93..16F9F ; ID_Continue # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
> -16FE0 ; ID_Continue # Lm TANGUT ITERATION MARK
> +16FE0..16FE1 ; ID_Continue # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
> 17000..187EC ; ID_Continue # Lo [6125] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187EC
> 18800..18AF2 ; ID_Continue # Lo [755] TANGUT COMPONENT-001..TANGUT COMPONENT-755
> -1B000..1B001 ; ID_Continue # Lo [2] KATAKANA LETTER ARCHAIC E..HIRAGANA LETTER ARCHAIC YE
> +1B000..1B11E ; ID_Continue # Lo [287] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER N-MU-MO-2
> +1B170..1B2FB ; ID_Continue # Lo [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
> 1BC00..1BC6A ; ID_Continue # Lo [107] DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M
> 1BC70..1BC7C ; ID_Continue # Lo [13] DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOYAN AFFIX ATTACHED TANGENT HOOK
> 1BC80..1BC88 ; ID_Continue # Lo [9] DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIGH VERTICAL
> @@ -7506,10 +7603,11 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
> 2A700..2B734 ; ID_Continue # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
> 2B740..2B81D ; ID_Continue # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
> 2B820..2CEA1 ; ID_Continue # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
> +2CEB0..2EBE0 ; ID_Continue # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
> 2F800..2FA1D ; ID_Continue # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
> E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
>
> -# Total code points: 119691
> +# Total code points: 128108
OK.
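
One more sanity check that applies to these two sections: every ID_Start code point should also be listed under ID_Continue, because ID_Continue is derived from ID_Start plus marks, digits and connector punctuation. Below is a minimal sketch of that containment test, not part of the patch. It assumes the ranges have already been extracted as (first, last) integer pairs. The helper names merge and uncovered are hypothetical.

```python
def merge(ranges):
    """Merge overlapping or directly adjacent (first, last) ranges."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            # Extends or touches the previous range: grow it.
            merged[-1][1] = max(merged[-1][1], last)
        else:
            merged.append([first, last])
    return merged


def uncovered(start_ranges, continue_ranges):
    """Return the ID_Start ranges not fully covered by the ID_Continue ranges."""
    cover = merge(continue_ranges)
    missing = []
    for first, last in start_ranges:
        if not any(lo <= first and last <= hi for lo, hi in cover):
            missing.append((first, last))
    return missing
```

Merging first matters because an ID_Start range may span several adjacent ID_Continue lines with different general categories. For the Zanabazar Square additions quoted above, the ID_Continue lines from 11A00 through 11A3E merge into one range that covers all three ID_Start entries (11A00, 11A0B..11A32, 11A3A).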
...
> diff --git a/localedata/unicode-gen/Makefile b/localedata/unicode-gen/Makefile
> index e38c624f3f..d62603ed3d 100644
> --- a/localedata/unicode-gen/Makefile
> +++ b/localedata/unicode-gen/Makefile
> @@ -35,7 +35,7 @@
> # files for making modifications.
>
>
> -UNICODE_VERSION = 9.0.0
> +UNICODE_VERSION = 10.0.0
OK.
>
> PYTHON3 = python3
> WGET = wget
> diff --git a/localedata/unicode-gen/UnicodeData.txt b/localedata/unicode-gen/UnicodeData.txt
> index a756976461..d89c64f526 100644
> --- a/localedata/unicode-gen/UnicodeData.txt
> +++ b/localedata/unicode-gen/UnicodeData.txt
> @@ -2072,6 +2072,17 @@
OK.
--
Cheers,
Carlos.