Cygwin 3.6 cannot handle Unicode characters outside BMP
Brian Inglis
Brian.Inglis@SystematicSW.ab.ca
Fri Dec 27 15:56:04 GMT 2024
On 2024-12-26 17:48, Takeshi Nishimura via Cygwin wrote:
> Is it a known problem that Cygwin (tested 3.6) cannot handle Unicode
> characters outside the BMP, e.g. Unicode character points above 65535?
For many use cases, UTF-8 SMP characters (like emoji) pass through properly; see
the example output below.
So in general, no, but there appear to be some issues with some Windows
interfaces handling SMP code points as UTF-16 surrogate pairs converted from
UTF-8.
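To make the conversion concrete, here is a minimal sketch of how one SMP code point maps to its four UTF-8 bytes and to a UTF-16 surrogate pair. The function names utf16_surrogates and utf8_bytes are made up for illustration and are not part of Cygwin or any Windows interface; the arithmetic is the standard Unicode encoding rule, and utf8_bytes only handles code points in the range U+10000..U+10FFFF.

```shell
#!/bin/sh
# Illustrative helpers only (hypothetical names, not Cygwin code).

# Print the UTF-16 high and low surrogates for a code point above U+FFFF.
utf16_surrogates() {
    cp=$(($1))
    v=$((cp - 0x10000))              # 20-bit offset past the BMP
    hi=$((0xD800 + (v >> 10)))       # top 10 bits -> high surrogate
    lo=$((0xDC00 + (v & 0x3FF)))     # bottom 10 bits -> low surrogate
    printf '%04X %04X\n' "$hi" "$lo"
}

# Print the four UTF-8 bytes for a code point in U+10000..U+10FFFF.
utf8_bytes() {
    cp=$(($1))
    printf '%02X %02X %02X %02X\n' \
        $((0xF0 | (cp >> 18))) \
        $((0x80 | ((cp >> 12) & 0x3F))) \
        $((0x80 | ((cp >> 6) & 0x3F))) \
        $((0x80 | (cp & 0x3F)))
}

utf8_bytes 0x1FA90          # U+1FA90 RINGED PLANET, prints: F0 9F AA 90
utf16_surrogates 0x1FA90    # prints: D83E DE90
```

Any interface that treats the two 16-bit units as separate characters, or that only accepts code points up to 65535, would mishandle such a pair, which is the kind of issue a test case should isolate.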
It is more useful if you provide a Simple Test Case that demonstrates your
issue, and allows it to be reproduced, diagnosed, and/or explained.
--
Take care. Thanks, Brian Inglis Calgary, Alberta, Canada
La perfection est atteinte Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut
-- Antoine de Saint-Exupéry
$ grep -a 'U+1F[A-F][9A-F][0-9A-F]' unicode-symbols.txt
. U+1FA70..U+1FAFF: Symbols and Pictographs Extended-A
🪐 U+1FA90 RINGED PLANET
🪑 U+1FA91 CHAIR
🪒 U+1FA92 RAZOR
🪓 U+1FA93 AXE
🪔 U+1FA94 DIYA LAMP
🪕 U+1FA95 BANJO
🪖 U+1FA96 MILITARY HELMET
🪗 U+1FA97 ACCORDION
🪘 U+1FA98 LONG DRUM
🪙 U+1FA99 COIN
🪚 U+1FA9A CARPENTRY SAW
🪛 U+1FA9B SCREWDRIVER
🪜 U+1FA9C LADDER
🪝 U+1FA9D HOOK
🪞 U+1FA9E MIRROR
🪟 U+1FA9F WINDOW
🪠 U+1FAA0 PLUNGER
🪡 U+1FAA1 SEWING NEEDLE
🪢 U+1FAA2 KNOT
🪣 U+1FAA3 BUCKET
🪤 U+1FAA4 MOUSE TRAP
🪥 U+1FAA5 TOOTHBRUSH
🪦 U+1FAA6 HEADSTONE
🪧 U+1FAA7 PLACARD
🪨 U+1FAA8 ROCK
🪩 U+1FAA9 MIRROR BALL
🪪 U+1FAAA IDENTIFICATION CARD
🪫 U+1FAAB LOW BATTERY
🪬 U+1FAAC HAMSA
🪭 U+1FAAD FOLDING HAND FAN
🪮 U+1FAAE HAIR PICK
🪯 U+1FAAF KHANDA
🪰 U+1FAB0 FLY
🪱 U+1FAB1 WORM
🪲 U+1FAB2 BEETLE
🪳 U+1FAB3 COCKROACH
🪴 U+1FAB4 POTTED PLANT
🪵 U+1FAB5 WOOD
🪶 U+1FAB6 FEATHER
🪷 U+1FAB7 LOTUS
🪸 U+1FAB8 CORAL
🪹 U+1FAB9 EMPTY NEST
🪺 U+1FABA NEST WITH EGGS
🪻 U+1FABB HYACINTH
🪼 U+1FABC JELLYFISH
🪽 U+1FABD WING
🪿 U+1FABF GOOSE
🫀 U+1FAC0 ANATOMICAL HEART
🫁 U+1FAC1 LUNGS
🫂 U+1FAC2 PEOPLE HUGGING
🫃 U+1FAC3 MAN WITH SWOLLEN BELLY
🫄 U+1FAC4 PERSON WITH SWOLLEN BELLY
🫅 U+1FAC5 PERSON WITH CROWN
🫎 U+1FACE MOOSE
🫏 U+1FACF DONKEY
🫐 U+1FAD0 BLUEBERRIES
🫑 U+1FAD1 BELL PEPPER
🫒 U+1FAD2 OLIVE
🫓 U+1FAD3 FLATBREAD
🫔 U+1FAD4 TAMALE
🫕 U+1FAD5 FONDUE
🫖 U+1FAD6 TEAPOT
🫗 U+1FAD7 POURING LIQUID
🫘 U+1FAD8 BEANS
🫙 U+1FAD9 JAR
🫚 U+1FADA GINGER ROOT
🫛 U+1FADB PEA POD
🫠 U+1FAE0 MELTING FACE
🫡 U+1FAE1 SALUTING FACE
🫢 U+1FAE2 FACE WITH OPEN EYES AND HAND OVER MOUTH
🫣 U+1FAE3 FACE WITH PEEKING EYE
🫤 U+1FAE4 FACE WITH DIAGONAL MOUTH
🫥 U+1FAE5 DOTTED LINE FACE
🫦 U+1FAE6 BITING LIP
🫧 U+1FAE7 BUBBLES
🫨 U+1FAE8 SHAKING FACE
🫰 U+1FAF0 HAND WITH INDEX FINGER AND THUMB CROSSED
🫱 U+1FAF1 RIGHTWARD BACKHAND
🫲 U+1FAF2 LEFTWARD HAND
🫳 U+1FAF3 PALM DOWN HAND
🫴 U+1FAF4 PALM UP HAND
🫵 U+1FAF5 INDEX POINTING AT THE VIEWER
🫶 U+1FAF6 HEART HANDS
🫷 U+1FAF7 LEFTWARDS PUSHING HAND
🫸 U+1FAF8 RIGHTWARDS PUSHING HAND
. U+1FB00..U+1FBFF: Symbols for Legacy Computing
🮐 U+1FB90 INVERSE MEDIUM SHADE
🮑 U+1FB91 UPPER HALF BLOCK AND LOWER HALF INVERSE MEDIUM SHADE
🮒 U+1FB92 UPPER HALF INVERSE MEDIUM SHADE AND LOWER HALF BLOCK
🮔 U+1FB94 LEFT HALF INVERSE MEDIUM SHADE AND RIGHT HALF BLOCK
🮕 U+1FB95 CHECKER BOARD FILL
🮖 U+1FB96 INVERSE CHECKER BOARD FILL
🮗 U+1FB97 HEAVY HORIZONTAL FILL
🮘 U+1FB98 UPPER LEFT TO LOWER RIGHT FILL
🮙 U+1FB99 UPPER RIGHT TO LOWER LEFT FILL
🮚 U+1FB9A UPPER AND LOWER TRIANGULAR HALF BLOCK
🮛 U+1FB9B LEFT AND RIGHT TRIANGULAR HALF BLOCK
🮜 U+1FB9C UPPER LEFT TRIANGULAR MEDIUM SHADE
🮝 U+1FB9D UPPER RIGHT TRIANGULAR MEDIUM SHADE
🮞 U+1FB9E LOWER RIGHT TRIANGULAR MEDIUM SHADE
🮟 U+1FB9F LOWER LEFT TRIANGULAR MEDIUM SHADE
🮠 U+1FBA0 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT
🮡 U+1FBA1 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT
🮢 U+1FBA2 BOX DRAWINGS LIGHT DIAGONAL MIDDLE LEFT TO LOWER CENTRE
🮣 U+1FBA3 BOX DRAWINGS LIGHT DIAGONAL MIDDLE RIGHT TO LOWER CENTRE
🮤 U+1FBA4 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT TO LOWER CENTRE
🮥 U+1FBA5 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT TO LOWER CENTRE
🮦 U+1FBA6 BOX DRAWINGS LIGHT DIAGONAL MIDDLE LEFT TO LOWER CENTRE TO MIDDLE RIGHT
🮧 U+1FBA7 BOX DRAWINGS LIGHT DIAGONAL MIDDLE LEFT TO UPPER CENTRE TO MIDDLE RIGHT
🮨 U+1FBA8 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT AND MIDDLE RIGHT TO LOWER CENTRE
🮩 U+1FBA9 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT AND MIDDLE LEFT TO LOWER CENTRE
🮪 U+1FBAA BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT TO LOWER CENTRE TO MIDDLE LEFT
🮫 U+1FBAB BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT TO LOWER CENTRE TO MIDDLE RIGHT
🮬 U+1FBAC BOX DRAWINGS LIGHT DIAGONAL MIDDLE LEFT TO UPPER CENTRE TO MIDDLE RIGHT TO LOWER CENTRE
🮭 U+1FBAD BOX DRAWINGS LIGHT DIAGONAL MIDDLE RIGHT TO UPPER CENTRE TO MIDDLE LEFT TO LOWER CENTRE
🮮 U+1FBAE BOX DRAWINGS LIGHT DIAGONAL DIAMOND
🮯 U+1FBAF BOX DRAWINGS LIGHT HORIZONTAL WITH VERTICAL STROKE
🮰 U+1FBB0 ARROWHEAD-SHAPED POINTER
🮱 U+1FBB1 INVERSE CHECK MARK
🮲 U+1FBB2 LEFT HALF RUNNING MAN
🮳 U+1FBB3 RIGHT HALF RUNNING MAN
🮴 U+1FBB4 INVERSE DOWNWARDS ARROW WITH TIP LEFTWARDS
🮵 U+1FBB5 LEFTWARDS ARROW AND UPPER AND LOWER ONE EIGHTH BLOCK
🮶 U+1FBB6 RIGHTWARDS ARROW AND UPPER AND LOWER ONE EIGHTH BLOCK
🮷 U+1FBB7 DOWNWARDS ARROW AND RIGHT ONE EIGHTH BLOCK
🮸 U+1FBB8 UPWARDS ARROW AND RIGHT ONE EIGHTH BLOCK
🮹 U+1FBB9 LEFT HALF FOLDER
🮺 U+1FBBA RIGHT HALF FOLDER
🮻 U+1FBBB VOIDED GREEK CROSS
🮼 U+1FBBC RIGHT OPEN SQUARED DOT
🮽 U+1FBBD NEGATIVE DIAGONAL CROSS
🮾 U+1FBBE NEGATIVE DIAGONAL MIDDLE RIGHT TO LOWER CENTRE
🮿 U+1FBBF NEGATIVE DIAGONAL DIAMOND
🯀 U+1FBC0 WHITE HEAVY SALTIRE WITH ROUNDED CORNERS
🯁 U+1FBC1 LEFT THIRD WHITE RIGHT POINTING INDEX
🯂 U+1FBC2 MIDDLE THIRD WHITE RIGHT POINTING INDEX
🯃 U+1FBC3 RIGHT THIRD WHITE RIGHT POINTING INDEX
🯄 U+1FBC4 NEGATIVE SQUARED QUESTION MARK
🯅 U+1FBC5 STICK FIGURE
🯆 U+1FBC6 STICK FIGURE WITH ARMS RAISED
🯇 U+1FBC7 STICK FIGURE LEANING LEFT
🯈 U+1FBC8 STICK FIGURE LEANING RIGHT
🯉 U+1FBC9 STICK FIGURE WITH DRESS
🯊 U+1FBCA WHITE UP-POINTING CHEVRON
🯰 U+1FBF0 SEGMENTED DIGIT ZERO
🯱 U+1FBF1 SEGMENTED DIGIT ONE
🯲 U+1FBF2 SEGMENTED DIGIT TWO
🯳 U+1FBF3 SEGMENTED DIGIT THREE
🯴 U+1FBF4 SEGMENTED DIGIT FOUR
🯵 U+1FBF5 SEGMENTED DIGIT FIVE
🯶 U+1FBF6 SEGMENTED DIGIT SIX
🯷 U+1FBF7 SEGMENTED DIGIT SEVEN
🯸 U+1FBF8 SEGMENTED DIGIT EIGHT
🯹 U+1FBF9 SEGMENTED DIGIT NINE