Cygwin 3.6 cannot handle Unicode characters outside BMP

Brian Inglis Brian.Inglis@SystematicSW.ab.ca
Fri Dec 27 15:56:04 GMT 2024


On 2024-12-26 17:48, Takeshi Nishimura via Cygwin wrote:
> Is it a known problem that Cygwin (tested 3.6) cannot handle Unicode
> characters outside the BMP, e.g. Unicode character points above 65535?

For many use cases, UTF-8 SMP characters (like emoji) pass through properly:
see below for an example.

So in general, no, but there appear to be issues with some Windows interfaces
handling SMP code points as UTF-16 surrogate pairs converted from UTF-8.
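
As a rough illustration of the conversion involved (a sketch in Python, not
Cygwin's or Windows' actual code; the helper name to_surrogate_pair is made up
for this example), a code point above U+FFFF takes four bytes in UTF-8 but two
16-bit code units, a surrogate pair, in UTF-16:

```python
def to_surrogate_pair(code_point):
    """Split a supplementary-plane code point into a UTF-16 (high, low) pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> low surrogate
    return high, low


ringed_planet = "\U0001FA90"            # U+1FA90 RINGED PLANET
utf8_bytes = ringed_planet.encode("utf-8")
utf16_bytes = ringed_planet.encode("utf-16-be")
high, low = to_surrogate_pair(ord(ringed_planet))

print(utf8_bytes.hex())                 # f09faa90
print(utf16_bytes.hex())                # d83ede90
print(f"{high:04X} {low:04X}")          # D83E DE90
```

Any interface that treats each 16-bit unit as a complete character will see
two values here rather than one, which is where such issues tend to show up.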

It would be more useful if you provided a simple test case that demonstrates
your issue and allows it to be reproduced, diagnosed, and/or explained.
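
A test case can be very small. The original report does not say which
interface fails, so the following Python sketch simply assumes, purely as an
example, that the question is whether a file name containing an SMP character
survives a create-and-list round trip; the function name
filename_round_trips is made up for this illustration:

```python
import os
import tempfile


def filename_round_trips(name):
    """Create a file called `name` in a scratch directory and report whether
    the directory listing returns exactly the same UTF-8 bytes.

    Hypothetical check for illustration; it uses byte paths so the result
    does not depend on the current locale.
    """
    encoded = name.encode("utf-8")
    with tempfile.TemporaryDirectory() as scratch:
        scratch_bytes = os.fsencode(scratch)
        path = os.path.join(scratch_bytes, encoded)
        with open(path, "wb"):
            pass                        # an empty file is enough
        return os.listdir(scratch_bytes) == [encoded]


# U+1FA90 RINGED PLANET lies outside the BMP.
print(filename_round_trips("planet-\U0001FA90.txt"))
```

Reporting the exact command or script, the output seen, and the output
expected is usually enough for others to reproduce the problem.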

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                 -- Antoine de Saint-Exupéry


$ grep -a 'U+1F[A-F][9A-F][0-9A-F]' unicode-symbols.txt
.  U+1FA70..U+1FAFF:	Symbols and Pictographs Extended-A
🪐 U+1FA90  RINGED PLANET
🪑 U+1FA91  CHAIR
🪒 U+1FA92  RAZOR
🪓 U+1FA93  AXE
🪔 U+1FA94  DIYA LAMP
🪕 U+1FA95  BANJO
🪖 U+1FA96  MILITARY HELMET
🪗 U+1FA97  ACCORDION
🪘 U+1FA98  LONG DRUM
🪙 U+1FA99  COIN
🪚 U+1FA9A  CARPENTRY SAW
🪛 U+1FA9B  SCREWDRIVER
🪜 U+1FA9C  LADDER
🪝 U+1FA9D  HOOK
🪞 U+1FA9E  MIRROR
🪟 U+1FA9F  WINDOW
🪠 U+1FAA0  PLUNGER
🪡 U+1FAA1  SEWING NEEDLE
🪢 U+1FAA2  KNOT
🪣 U+1FAA3  BUCKET
🪤 U+1FAA4  MOUSE TRAP
🪥 U+1FAA5  TOOTHBRUSH
🪦 U+1FAA6  HEADSTONE
🪧 U+1FAA7  PLACARD
🪨 U+1FAA8  ROCK
🪩  U+1FAA9  MIRROR BALL
🪪  U+1FAAA  IDENTIFICATION CARD
🪫  U+1FAAB  LOW BATTERY
🪬  U+1FAAC  HAMSA
🪭  U+1FAAD  FOLDING HAND FAN
🪮  U+1FAAE  HAIR PICK
🪯  U+1FAAF  KHANDA
🪰 U+1FAB0  FLY
🪱 U+1FAB1  WORM
🪲 U+1FAB2  BEETLE
🪳 U+1FAB3  COCKROACH
🪴 U+1FAB4  POTTED PLANT
🪵 U+1FAB5  WOOD
🪶 U+1FAB6  FEATHER
🪷  U+1FAB7  LOTUS
🪸  U+1FAB8  CORAL
🪹  U+1FAB9  EMPTY NEST
🪺  U+1FABA  NEST WITH EGGS
🪻  U+1FABB  HYACINTH
🪼  U+1FABC  JELLYFISH
🪽  U+1FABD  WING

🪿  U+1FABF  GOOSE
🫀  U+1FAC0  ANATOMICAL HEART
🫁  U+1FAC1  LUNGS
🫂  U+1FAC2  PEOPLE HUGGING
🫃  U+1FAC3  MAN WITH SWOLLEN BELLY
🫄  U+1FAC4  PERSON WITH SWOLLEN BELLY
🫅  U+1FAC5  PERSON WITH CROWN

🫎  U+1FACE  MOOSE
🫏  U+1FACF  DONKEY
🫐 U+1FAD0  BLUEBERRIES
🫑 U+1FAD1  BELL PEPPER
🫒 U+1FAD2  OLIVE
🫓 U+1FAD3  FLATBREAD
🫔 U+1FAD4  TAMALE
🫕 U+1FAD5  FONDUE
🫖 U+1FAD6  TEAPOT
🫗  U+1FAD7  POURING LIQUID
🫘  U+1FAD8  BEANS
🫙  U+1FAD9  JAR
🫚  U+1FADA  GINGER ROOT
🫛  U+1FADB  PEA POD

🫠  U+1FAE0  MELTING FACE
🫡  U+1FAE1  SALUTING FACE
🫢  U+1FAE2  FACE WITH OPEN EYES AND HAND OVER MOUTH
🫣  U+1FAE3  FACE WITH PEEKING EYE
🫤  U+1FAE4  FACE WITH DIAGONAL MOUTH
🫥  U+1FAE5  DOTTED LINE FACE
🫦  U+1FAE6  BITING LIP
🫧  U+1FAE7  BUBBLES
🫨  U+1FAE8  SHAKING FACE

🫰  U+1FAF0  HAND WITH INDEX FINGER AND THUMB CROSSED
🫱  U+1FAF1  RIGHTWARD BACKHAND
🫲  U+1FAF2  LEFTWARD HAND
🫳  U+1FAF3  PALM DOWN HAND
🫴  U+1FAF4  PALM UP HAND
🫵  U+1FAF5  INDEX POINTING AT THE VIEWER
🫶  U+1FAF6  HEART HANDS
🫷  U+1FAF7  LEFTWARDS PUSHING HAND
🫸  U+1FAF8  RIGHTWARDS PUSHING HAND
.  U+1FB00..U+1FBFF:	Symbols for Legacy Computing
🮐  U+1FB90  INVERSE MEDIUM SHADE
🮑  U+1FB91  UPPER HALF BLOCK AND LOWER HALF INVERSE MEDIUM SHADE
🮒  U+1FB92  UPPER HALF INVERSE MEDIUM SHADE AND LOWER HALF BLOCK
🮔  U+1FB94  LEFT HALF INVERSE MEDIUM SHADE AND RIGHT HALF BLOCK
🮕  U+1FB95  CHECKER BOARD FILL
🮖  U+1FB96  INVERSE CHECKER BOARD FILL
🮗  U+1FB97  HEAVY HORIZONTAL FILL
🮘  U+1FB98  UPPER LEFT TO LOWER RIGHT FILL
🮙  U+1FB99  UPPER RIGHT TO LOWER LEFT FILL
🮚  U+1FB9A  UPPER AND LOWER TRIANGULAR HALF BLOCK
🮛  U+1FB9B  LEFT AND RIGHT TRIANGULAR HALF BLOCK
🮜  U+1FB9C  UPPER LEFT TRIANGULAR MEDIUM SHADE
🮝  U+1FB9D  UPPER RIGHT TRIANGULAR MEDIUM SHADE
🮞  U+1FB9E  LOWER RIGHT TRIANGULAR MEDIUM SHADE
🮟  U+1FB9F  LOWER LEFT TRIANGULAR MEDIUM SHADE
🮠  U+1FBA0  BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT
🮡  U+1FBA1  BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT
🮢  U+1FBA2  BOX DRAWINGS LIGHT DIAGONAL MIDDLE LEFT TO LOWER CENTRE
🮣  U+1FBA3  BOX DRAWINGS LIGHT DIAGONAL MIDDLE RIGHT TO LOWER CENTRE
🮤  U+1FBA4  BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT TO LOWER CENTRE
🮥  U+1FBA5  BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT TO LOWER CENTRE
🮦  U+1FBA6  BOX DRAWINGS LIGHT DIAGONAL MIDDLE LEFT TO LOWER CENTRE TO MIDDLE RIGHT
🮧  U+1FBA7  BOX DRAWINGS LIGHT DIAGONAL MIDDLE LEFT TO UPPER CENTRE TO MIDDLE RIGHT
🮨  U+1FBA8  BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT AND MIDDLE RIGHT TO LOWER CENTRE
🮩  U+1FBA9  BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT AND MIDDLE LEFT TO LOWER CENTRE
🮪  U+1FBAA  BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT TO LOWER CENTRE TO MIDDLE LEFT
🮫  U+1FBAB  BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT TO LOWER CENTRE TO MIDDLE RIGHT
🮬  U+1FBAC  BOX DRAWINGS LIGHT DIAGONAL MIDDLE LEFT TO UPPER CENTRE TO MIDDLE RIGHT TO LOWER CENTRE
🮭  U+1FBAD  BOX DRAWINGS LIGHT DIAGONAL MIDDLE RIGHT TO UPPER CENTRE TO MIDDLE LEFT TO LOWER CENTRE
🮮  U+1FBAE  BOX DRAWINGS LIGHT DIAGONAL DIAMOND
🮯  U+1FBAF  BOX DRAWINGS LIGHT HORIZONTAL WITH VERTICAL STROKE
🮰  U+1FBB0  ARROWHEAD-SHAPED POINTER
🮱  U+1FBB1  INVERSE CHECK MARK
🮲  U+1FBB2  LEFT HALF RUNNING MAN
🮳  U+1FBB3  RIGHT HALF RUNNING MAN
🮴  U+1FBB4  INVERSE DOWNWARDS ARROW WITH TIP LEFTWARDS
🮵  U+1FBB5  LEFTWARDS ARROW AND UPPER AND LOWER ONE EIGHTH BLOCK
🮶  U+1FBB6  RIGHTWARDS ARROW AND UPPER AND LOWER ONE EIGHTH BLOCK
🮷  U+1FBB7  DOWNWARDS ARROW AND RIGHT ONE EIGHTH BLOCK
🮸  U+1FBB8  UPWARDS ARROW AND RIGHT ONE EIGHTH BLOCK
🮹  U+1FBB9  LEFT HALF FOLDER
🮺  U+1FBBA  RIGHT HALF FOLDER
🮻  U+1FBBB  VOIDED GREEK CROSS
🮼  U+1FBBC  RIGHT OPEN SQUARED DOT
🮽  U+1FBBD  NEGATIVE DIAGONAL CROSS
🮾  U+1FBBE  NEGATIVE DIAGONAL MIDDLE RIGHT TO LOWER CENTRE
🮿  U+1FBBF  NEGATIVE DIAGONAL DIAMOND
🯀  U+1FBC0  WHITE HEAVY SALTIRE WITH ROUNDED CORNERS
🯁  U+1FBC1  LEFT THIRD WHITE RIGHT POINTING INDEX
🯂  U+1FBC2  MIDDLE THIRD WHITE RIGHT POINTING INDEX
🯃  U+1FBC3  RIGHT THIRD WHITE RIGHT POINTING INDEX
🯄  U+1FBC4  NEGATIVE SQUARED QUESTION MARK
🯅  U+1FBC5  STICK FIGURE
🯆  U+1FBC6  STICK FIGURE WITH ARMS RAISED
🯇  U+1FBC7  STICK FIGURE LEANING LEFT
🯈  U+1FBC8  STICK FIGURE LEANING RIGHT
🯉  U+1FBC9  STICK FIGURE WITH DRESS
🯊  U+1FBCA  WHITE UP-POINTING CHEVRON

🯰  U+1FBF0  SEGMENTED DIGIT ZERO
🯱  U+1FBF1  SEGMENTED DIGIT ONE
🯲  U+1FBF2  SEGMENTED DIGIT TWO
🯳  U+1FBF3  SEGMENTED DIGIT THREE
🯴  U+1FBF4  SEGMENTED DIGIT FOUR
🯵  U+1FBF5  SEGMENTED DIGIT FIVE
🯶  U+1FBF6  SEGMENTED DIGIT SIX
🯷  U+1FBF7  SEGMENTED DIGIT SEVEN
🯸  U+1FBF8  SEGMENTED DIGIT EIGHT
🯹  U+1FBF9  SEGMENTED DIGIT NINE


