Created attachment 6358: Test case showing the bug

Per POSIX:

"If iconv() encounters a character in the input buffer that is valid, but for which an identical character does not exist in the target codeset, iconv() shall perform an implementation-defined conversion on this character."

And:

"The iconv() function shall update the variables pointed to by the arguments to reflect the extent of the conversion and return the number of non-identical conversions performed. If the entire string in the input buffer is converted, the value pointed to by inbytesleft shall be 0. If the input conversion is stopped due to any conditions mentioned above, the value pointed to by inbytesleft shall be non-zero and errno shall be set to indicate the condition. If an error occurs, iconv() shall return (size_t)-1 and set errno to indicate the error."

However, glibc's iconv is buggy and returns (size_t)-1 when a character from the input character set does not exist in the output character set.

I am attaching a simple test program that shows the issue, based on incorrect test code I found in glib: https://bugzilla.gnome.org/show_bug.cgi?id=674540
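
For readers without access to the attachment, here is a minimal sketch of the kind of check described above. It is not the attached test case: the helper name probe_unrepresentable, the struct probe_result, the charset names "UTF-8" and "ASCII", and the input string are all assumptions chosen for illustration.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of what a single iconv() call reported. */
struct probe_result {
    int    opened;    /* 1 if iconv_open() succeeded, 0 otherwise */
    size_t rc;        /* value returned by iconv() */
    int    err;       /* errno if iconv() returned (size_t)-1, else 0 */
    size_t in_left;   /* input bytes left unconverted */
    size_t out_used;  /* output bytes written */
};

/* Convert "caf" followed by U+00E9 (valid UTF-8, but not representable
 * in ASCII) and record how the implementation reacts. */
static struct probe_result probe_unrepresentable(void)
{
    struct probe_result r = {0, 0, 0, 0, 0};
    char in[] = "caf\xc3\xa9";
    char out[16];
    char *inp = in;
    char *outp = out;
    size_t inleft = strlen(in);
    size_t outleft = sizeof out;

    iconv_t cd = iconv_open("ASCII", "UTF-8");  /* (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return r;
    r.opened = 1;

    errno = 0;
    r.rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    r.err = (r.rc == (size_t)-1) ? errno : 0;
    r.in_left = inleft;
    r.out_used = sizeof out - outleft;

    iconv_close(cd);
    return r;
}
```

Going by the POSIX text quoted above, one would expect rc to be the count of non-identical conversions and in_left to be 0. With the glibc behavior reported here, rc is (size_t)-1 and in_left stays non-zero.
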
This is indeed highly annoying. At the very least, the documentation should be updated to state which error code signals that a sequence cannot be represented in the target encoding (it is EILSEQ). It should also mention that glibc does not comply with POSIX on that point.

The documentation is also unclear when it says "If all input from the input buffer is successfully converted and stored in the output buffer, the function returns the number of non-reversible conversions performed." [1] Since sequences that cannot be represented in the target encoding are said to trigger an error, they can never contribute to that return value. FWIW, POSIX says "non-identical conversions" instead of "non-reversible".

Finally, the part saying "future versions will provide better ones, but they are not yet finished" [1] could also be removed, as I guess backward compatibility will be preserved, won't it?

[1] http://www.gnu.org/software/libc/manual/html_node/Generic-Conversion-Interface.html
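
To illustrate why the error code deserves a mention in the documentation, here is a rough sketch that reports the errno value iconv() sets for a given input. The helper name iconv_errno_for and the charset names "UTF-8" and "ASCII" are assumptions made for this example, not something taken from the glibc manual.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert `input` from UTF-8 to ASCII in one call.
 * Returns the errno set by iconv() on failure, 0 if iconv() did not
 * fail, or -1 if the input is too long for the local buffer or the
 * converter could not be opened. */
static int iconv_errno_for(const char *input)
{
    char inbuf[32];
    char outbuf[32];
    size_t len = strlen(input);

    if (len >= sizeof inbuf)
        return -1;
    memcpy(inbuf, input, len + 1);  /* iconv() wants a non-const buffer */

    iconv_t cd = iconv_open("ASCII", "UTF-8");  /* (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = inbuf;
    char *outp = outbuf;
    size_t inleft = len;
    size_t outleft = sizeof outbuf;

    errno = 0;
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    int err = (rc == (size_t)-1) ? errno : 0;

    iconv_close(cd);
    return err;
}
```

Calling iconv_errno_for("\xff") exercises the invalid-input case, for which POSIX specifies EILSEQ. Calling iconv_errno_for("\xc3\xa9") exercises the valid-but-unrepresentable case discussed in this report. If glibc reports EILSEQ for that one as well, as stated above, a caller cannot tell the two situations apart from errno alone.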