Bug 21092 - mbsnrtowcs: *src is not left pointing to the invalid multibyte sequence when dest is NULL
Status: UNCONFIRMED
Product: glibc
Classification: Unclassified
Component: locale
Version: 2.24
Importance: P2 normal
Assignee: Not yet assigned to anyone
Reported: 2017-01-30 01:53 UTC by Igor Liferenko
Modified: 2018-01-24 16:28 UTC

Description Igor Liferenko 2017-01-30 01:53:02 UTC
The following examples show that mbsnrtowcs() updates *src differently
depending on whether dest is NULL.

According to this bug report [1], there are two possible cases:

1) the incomplete multibyte character is at the end of the input buffer (*)
2) the incomplete multibyte character is not at the end of the input buffer (*)

(*) The end of the buffer is determined by the nms argument.

Examples 1.1 and 2.1 show that when dest is NULL, *src is left
unchanged.
Examples 1.2 and 2.2 show that when dest is not NULL, *src is
correctly changed: in 1.2 it is advanced past the consumed input, and in
2.2 it is left pointing to the invalid sequence.

In the examples the following UTF-8 sequences are used:

\320     = incomplete (lead byte of a two-byte sequence, no continuation byte)
\321\215 = U+044D (CYRILLIC SMALL LETTER E)
\321\216 = U+044E (CYRILLIC SMALL LETTER YU)
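
As a cross-check of the byte sequences listed above, the code points can be recomputed directly from the bytes. This is a minimal sketch and not part of the test programs; decode_utf8_2byte is a hypothetical helper name, and it deliberately handles only two-byte UTF-8 sequences.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a libc function): decode one two-byte UTF-8
   sequence from the first n bytes at p.  Returns the code point, or -1
   if the input is incomplete or is not a valid two-byte sequence. */
long decode_utf8_2byte(const unsigned char *p, size_t n)
{
  assert(p != NULL);
  if (n < 2)
    return -1;                        /* incomplete: lead byte only */
  if ((p[0] & 0xE0) != 0xC0)
    return -1;                        /* not a two-byte lead byte (110xxxxx) */
  if ((p[1] & 0xC0) != 0x80)
    return -1;                        /* not a continuation byte (10xxxxxx) */
  long cp = ((long)(p[0] & 0x1F) << 6) | (long)(p[1] & 0x3F);
  return cp >= 0x80 ? cp : -1;        /* reject overlong encodings */
}
```

With this helper, "\321\215" decodes to 0x044D and "\321\216" to 0x044E, while "\320" followed by "\321" is rejected because \321 is a lead byte, not a continuation byte.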

Example 1.1 (at the end of buffer, dest is NULL):

    #include <locale.h>
    #include <wchar.h>
    #include <stdio.h>
    int main(void)
    {
      setlocale(LC_CTYPE, "en_US.UTF-8");
      char *s = "\321\216\320";
      const char *x = s;
      /* mbsnrtowcs returns size_t; cast so that (size_t)-1 prints as -1 */
      printf("status: %ld\n", (long)mbsnrtowcs(NULL,&x,3,0,NULL));
      perror(NULL);
      printf("ori=%p\nnew=%p\n",(void *)s,(void *)x);
      return 0;
    }

Output 1.1:

    status: 1
    Success
    ori=0x56337c86d910
    new=0x56337c86d910


Example 2.1 (not at the end of buffer, dest is NULL):

    #include <locale.h>
    #include <wchar.h>
    #include <stdio.h>
    int main(void)
    {
      setlocale(LC_CTYPE, "en_US.UTF-8");
      char *s = "\321\216\320\321\215";
      const char *x = s;
      /* mbsnrtowcs returns size_t; cast so that (size_t)-1 prints as -1 */
      printf("status: %ld\n", (long)mbsnrtowcs(NULL,&x,5,0,NULL));
      perror(NULL);
      printf("ori=%p\nnew=%p\n",(void *)s,(void *)x);
      return 0;
    }

Output 2.1:

    status: -1
    Invalid or incomplete multibyte or wide character
    ori=0x55ad82792910
    new=0x55ad82792910

Example 1.2 (at the end of buffer, dest is not NULL):

    #include <locale.h>
    #include <wchar.h>
    #include <stdio.h>
    int main(void)
    {
      setlocale(LC_CTYPE, "en_US.UTF-8");
      char *s = "\321\216\320";
      const char *x = s;
      wchar_t wcs[3];
      /* mbsnrtowcs returns size_t; cast so that (size_t)-1 prints as -1 */
      printf("status: %ld\n", (long)mbsnrtowcs(wcs,&x,3,3,NULL));
      perror(NULL);
      printf("ori=%p\nnew=%p\n",(void *)s,(void *)x);
      return 0;
    }

Output 1.2:

    status: 1
    Success
    ori=0x556497c29980
    new=0x556497c29983


Example 2.2 (not at the end of buffer, dest is not NULL):

    #include <locale.h>
    #include <wchar.h>
    #include <stdio.h>
    int main(void)
    {
      setlocale(LC_CTYPE, "en_US.UTF-8");
      char *s = "\321\216\320\321\215";
      const char *x = s;
      wchar_t wcs[5];
      /* mbsnrtowcs returns size_t; cast so that (size_t)-1 prints as -1 */
      printf("status: %ld\n", (long)mbsnrtowcs(wcs,&x,5,5,NULL));
      perror(NULL);
      printf("ori=%p\nnew=%p\n",(void *)s,(void *)x);
      return 0;
    }

Output 2.2:

    status: -1
    Invalid or incomplete multibyte or wide character
    ori=0x55bb1aa98980
    new=0x55bb1aa98982
Comment 1 Igor Liferenko 2017-01-30 01:54:16 UTC
[1]: https://sourceware.org/bugzilla/show_bug.cgi?id=20860
Comment 2 Igor Liferenko 2017-02-17 07:55:19 UTC
It may not be a bug that mbsnrtowcs() does not change *src when dest is NULL.
To be sure, we need to find out whether the standard specifies this behavior...
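
Whichever way that question is resolved, a caller that needs the position of the offending sequence without supplying a destination buffer could locate it by stepping through the input with mbrtowc(). The following is only a sketch of that idea, not a proposed fix for glibc; first_unconvertible_offset is a hypothetical helper name, and the result depends on the current LC_CTYPE locale.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return the byte offset of the first sequence in
   the first n bytes of s that mbrtowc() cannot convert (invalid, or
   incomplete at the end of the buffer), or n if everything converts. */
size_t first_unconvertible_offset(const char *s, size_t n)
{
  assert(s != NULL);
  mbstate_t st;
  memset(&st, 0, sizeof st);          /* start in the initial shift state */
  size_t off = 0;
  while (off < n) {
    /* pwc is NULL: nothing is stored, only the byte length is returned */
    size_t r = mbrtowc(NULL, s + off, n - off, &st);
    if (r == (size_t)-1 || r == (size_t)-2)
      break;                          /* invalid or incomplete at off */
    off += (r == 0) ? 1 : r;          /* a null byte still occupies one byte */
  }
  return off;
}
```

In a UTF-8 locale, this would be expected to return 2 for both test strings used in the description, which is where *src points in Output 2.2.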