[PATCH 1/3]: C++20 P0482R6 and C2X N2653: Fix for bug 25744, mbrtowc with Big5-HKSCS

Tom Honermann tom@honermann.net
Mon Jun 7 02:07:55 GMT 2021


This patch for bug 25744 [1] updates the Big5-HKSCS converter to 
properly maintain the lowest 3 bits of the mbstate_t __count data 
member.  This change is necessary to ensure that state is correctly 
preserved when the converter encounters an incomplete multibyte 
character.  More details are available in bug 25744 [1].

The code changes are styled to match how these bits are maintained by 
converters such as iso-2022-jp.c, ibm930.c, and others.
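
For readers unfamiliar with the idiom, the following is a minimal,
self-contained sketch of the bit handling described above.  It is not
the code from the attached patch.  The struct, the function names, and
the PENDING_MASK macro are hypothetical stand-ins for illustration.  The
sketch assumes only what this message states: the lowest 3 bits of
__count must survive when the converter stores its own state in the
remaining bits.  An unsigned field is used here so that the shifts are
well defined.

```c
/* Hypothetical stand-in for the __count member of mbstate_t.  */
struct demo_state
{
  unsigned int count;
};

/* Assumption for this sketch: the lowest 3 bits are owned by the
   generic conversion code and the higher bits by the converter.  */
#define PENDING_MASK 7u

/* Problematic pattern: assigning the whole word discards whatever
   was stored in the lowest 3 bits.  */
static void
set_converter_bits_clobber (struct demo_state *s, unsigned int value)
{
  s->count = value << 3;
}

/* Preserving pattern: keep the lowest 3 bits and replace only the
   converter-owned bits above them.  */
static void
set_converter_bits_preserve (struct demo_state *s, unsigned int value)
{
  s->count = (s->count & PENDING_MASK) | (value << 3);
}

/* Read back the converter-owned bits.  */
static unsigned int
converter_bits (const struct demo_state *s)
{
  return s->count >> 3;
}

/* Read back the lowest 3 bits.  */
static unsigned int
low_bits (const struct demo_state *s)
{
  return s->count & PENDING_MASK;
}
```

Starting from a state whose lowest 3 bits hold 1, the preserving
variant leaves low_bits() at 1 after the converter bits are stored,
whereas the clobbering variant resets it to 0.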

Running 'grep __count' in the 'iconvdata' directory suggests that a
number of other converters, euc-jisx0213.c for example, also fail to
preserve these bits in some cases, though those converters may not
show any observable misbehavior as a result.  This patch does not
attempt to address such issues in other converters.

This fix was previously posted to this mailing list on April 7th, 2020
[2], but received no follow-up.

Tested on Linux x86_64.

Tom.

[1]: Bug 25744
      "mbrtowc with Big5-HKSCS returns 2 instead of 1 when consuming the
      second byte of certain double byte characters"
      https://sourceware.org/bugzilla/show_bug.cgi?id=25744

[2]: "[PATCH] Correct the Big5-HKSCS converter to preserve low order
      state bits (bug 25744)"
      https://sourceware.org/pipermail/libc-alpha/2020-April/112595.html

-------------- next part --------------
A non-text attachment was scrubbed...
Name: n2653-1.patch
Type: text/x-patch
Size: 5667 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/libc-alpha/attachments/20210606/a05b5c47/attachment.bin>

