readdir() returns inaccessible name if file was created with invalid UTF-8

Andrew Schulman andrex.e.schulman@gmail.com
Sat Jun 28 16:13:13 GMT 2025


>> Testcase (attached):
>
>Thanks for the testcase!
>
>I found the problem in the newlib core function creating wchar_t from
>UTF-8 input.  In case of 4 byte UTF-8 sequences, the code created the
>low surrogate already after reading byte 3, without checking if byte 4
>of the UTF-8 sequence is a valid byte. Hilarity ensues.
>
>Fortunately this bug has only been introduced very recently, to wit, on
>2009-03-24, a mere 16 years ago.  And it is my bug and mine alone :}
>
>I'm just prep'ing a fix which I'll push in a minute or two.

Gold star awarded! https://cygwin.com/goldstars/#CV
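
For anyone following along, here is a minimal sketch of the kind of check
the quoted message describes: validate every continuation byte of a 4-byte
UTF-8 sequence before emitting any surrogate. This is not the newlib code or
the actual fix. The function name utf8_4byte_to_utf16 and the helper is_cont
are hypothetical, and a 16-bit output unit is assumed for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if b is a UTF-8 continuation byte (10xxxxxx). */
static int is_cont(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical sketch, not newlib's implementation: decode one 4-byte
   UTF-8 sequence into a UTF-16 surrogate pair.
   Returns 0 on success, -1 on invalid input. out[] is written only
   after the whole sequence has been validated. */
int utf8_4byte_to_utf16(const unsigned char *s, size_t len, uint16_t out[2])
{
    uint32_t cp;

    if (len < 4)
        return -1;

    /* Lead byte of a 4-byte sequence is 0xF0..0xF4. */
    if (s[0] < 0xF0 || s[0] > 0xF4)
        return -1;

    /* Check bytes 2, 3 AND 4 before producing any output. */
    if (!is_cont(s[1]) || !is_cont(s[2]) || !is_cont(s[3]))
        return -1;

    cp = ((uint32_t)(s[0] & 0x07) << 18)
       | ((uint32_t)(s[1] & 0x3F) << 12)
       | ((uint32_t)(s[2] & 0x3F) << 6)
       |  (uint32_t)(s[3] & 0x3F);

    /* Reject overlong encodings and values beyond U+10FFFF. */
    if (cp < 0x10000 || cp > 0x10FFFF)
        return -1;

    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate  */
    return 0;
}
```

With this shape, a sequence such as F0 9F 98 41 (byte 4 is not a
continuation byte) is rejected as a whole and no half of a surrogate pair
is produced from it.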

