Bug 20568 - Segfault with wide characters and setlocale/fgetwc/UTF-8
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: locale
Version: 2.24
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2016-09-07 20:09 UTC by Tobias Stoeckmann
Modified: 2018-12-20 07:04 UTC

Last reconfirmed: 2016-09-28 00:00:00


Description Tobias Stoeckmann 2016-09-07 20:09:49 UTC
I have spotted a bug that looks rather obscure to me. The following C code is a minimal reproducer:

---
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int
main(void)
{
        setlocale(LC_ALL, "");
        fgetwc(stdin);
        return 0;
}
---

$ gcc -o poc poc.c
$ python -c 'print 13*"\t"' | LC_CTYPE=en_US.UTF-8 ./poc
Segmentation fault
$ python -c 'print 13*"\t"' | LC_CTYPE=POSIX ./poc
$ _

I need roughly 13 tab characters to trigger the issue; adding a few more does no harm. I was able to reproduce this on other distributions that ship glibc 2.24, so I don't think it is specific to one of them.

Also, this issue only happens when LC_CTYPE is set to a UTF-8 locale. I have tested en_US and de_DE, which both trigger it. With POSIX or C, the segmentation fault is not triggered.
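
Since the crash depends on the LC_CTYPE codeset, it can help to confirm which codeset a test run actually selected. A minimal sketch, assuming a POSIX system that provides nl_langinfo(); the helper name ctype_is_utf8 is made up for illustration and is not part of glibc:

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the codeset of the currently
   active LC_CTYPE locale is reported as "UTF-8", and 0 otherwise.
   Call it after setlocale(), as the reproducer does. */
static int ctype_is_utf8(void)
{
    const char *codeset = nl_langinfo(CODESET);
    return strcmp(codeset, "UTF-8") == 0;
}
```

Called right after setlocale(LC_ALL, ""), this should return 1 when LC_CTYPE is en_US.UTF-8 or de_DE.UTF-8 and 0 for POSIX or C, assuming those locales are installed on the test system.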

I hope this helps you to track down this bug, as I was unable to figure out the flush mechanisms in glibc in a reasonable time. :)


The stack trace on my system with glibc 2.24 looks like this:

(gdb) bt
#0  __GI__IO_wfile_sync (fp=0xb77295a0 <_IO_2_1_stdin_>) at wfileops.c:534
#1  0xb75e2bc6 in _IO_default_setbuf (fp=0xb77295a0 <_IO_2_1_stdin_>, p=0x0, len=0) at genops.c:523
#2  0xb75df2e2 in _IO_new_file_setbuf (fp=0xb77295a0 <_IO_2_1_stdin_>, p=0x0, len=0) at fileops.c:459
#3  0xb75e3516 in _IO_unbuffer_all () at genops.c:921
#4  _IO_cleanup () at genops.c:966
#5  0xb75a5632 in __run_exit_handlers (status=0, listp=0xb77293dc <__exit_funcs>, run_list_atexit=true, run_dtors=true) at exit.c:96
#6  0xb75a56f1 in __GI_exit (status=0) at exit.c:105
#7  0xb758f1b2 in __libc_start_main (main=0x804846b <main>, argc=1, argv=0xbfef4004, init=0x80484b0 <__libc_csu_init>, fini=0x8048510 <__libc_csu_fini>, 
    rtld_fini=0xb774d7a0 <_dl_fini>, stack_end=0xbfef3ffc) at ../csu/libc-start.c:323
#8  0x08048391 in _start () at ../sysdeps/i386/start.S:115
Comment 1 Florian Weimer 2016-09-28 16:44:12 UTC
This starts happening with this commit:

commit 18d26750dd8fd328a78cf639fd0ec2494680a2a4
Author: Paul Pluzhnikov <ppluzhnikov@google.com>
Date:   Sun Mar 8 09:46:53 2015 -0700

    Cleanup: in preparation for fixing BZ #16734, fix memory leaks exposed by
    switching fopen()ed streams from mmap to malloc.
Comment 2 Florian Weimer 2016-09-28 16:59:05 UTC
Related discussion: https://lists.debian.org/debian-glibc/2016/09/msg00173.html
Comment 3 Fiodor 2018-12-15 18:48:58 UTC
I can confirm this bug.

cat 1.c && gcc 1.c && ./a.out
#include <locale.h>
#include <wchar.h>
#include <stdio.h>

int main()
{
    setlocale(LC_ALL, "ru_RU.UTF-8");
    getwc(stdin);
    return 0;
}
11111111111111111111
*** stack smashing detected ***: <unknown> terminated
Aborted (core dumped)
[faust@archlinux РАзная всячина]$ cat 1.c && clang 1.c && ./a.out
#include <locale.h>
#include <wchar.h>
#include <stdio.h>

int main()
{
    setlocale(LC_ALL, "ru_RU.UTF-8");
    getwc(stdin);
    return 0;
}
222222222222222222
*** stack smashing detected ***: <unknown> terminated
Aborted (core dumped)
[faust@archlinux РАзная всячина]$ gdb ./a.out
GNU gdb (GDB) 8.2
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) r
Starting program: /home/faust/Проекты/C/РАзная всячина/a.out 
22222222222222222222222
*** stack smashing detected ***: <unknown> terminated

Program received signal SIGABRT, Aborted.
0x00007ffff7de7d7f in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff7de7d7f in raise () from /usr/lib/libc.so.6
#1  0x00007ffff7dd2672 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff7e2a878 in __libc_message () from /usr/lib/libc.so.6
#3  0x00007ffff7ebd415 in __fortify_fail_abort () from /usr/lib/libc.so.6
#4  0x00007ffff7ebd3c6 in __stack_chk_fail () from /usr/lib/libc.so.6
#5  0x00007ffff7e282dc in do_length () from /usr/lib/libc.so.6
#6  0x00007ffff7e27ca5 in _IO_wfile_sync () from /usr/lib/libc.so.6
#7  0x00007ffff7e2ef26 in _IO_default_setbuf () from /usr/lib/libc.so.6
#8  0x00007ffff7e2babe in __GI__IO_file_setbuf () from /usr/lib/libc.so.6
#9  0x00007ffff7e2f9a1 in _IO_cleanup () from /usr/lib/libc.so.6
#10 0x00007ffff7dea552 in __run_exit_handlers () from /usr/lib/libc.so.6
#11 0x00007ffff7dea58e in exit () from /usr/lib/libc.so.6
#12 0x00007ffff7dd422a in __libc_start_main () from /usr/lib/libc.so.6
#13 0x000055555555507e in _start ()
(gdb) q
A debugging session is active.

        Inferior 1 [process 2703] will be killed.

Quit anyway? (y or n) y
[faust@archlinux РАзная всячина]$ /lib64/libc.so.6 -v
GNU C Library (GNU libc) stable release version 2.28.
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 8.2.1 20180831.
libc ABIs: UNIQUE IFUNC ABSOLUTE
For bug reporting instructions, please see:
<https://bugs.archlinux.org/>.
[faust@archlinux РАзная всячина]$ uname -ar
Linux archlinux 4.19.8-arch1-1-ARCH #1 SMP PREEMPT Sat Dec 8 13:49:11 UTC 2018 x86_64 GNU/Linux
[faust@archlinux РАзная всячина]$
Comment 4 Igor Liferenko 2018-12-20 06:18:58 UTC
Hi,

I have just been bitten by this issue.

Tested on version 2.11.2: the bug is not present.
The next version that I tested is 2.24: the bug is present.

The bug starts to show when 9 characters are input.

This bug does not show if the "setlocale" call is commented out or if "fclose(stdin);" is
added before "return 0".
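
The fclose() variant can be sketched as below. This is only a sketch, assuming that closing the stream before exit keeps the exit-time stdio cleanup from touching it; that matches the observation above but is not a confirmed analysis. The helper name read_wide_and_close is hypothetical:

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: read one wide character from fp, then close fp
   explicitly instead of leaving it open for exit-time stdio cleanup.
   Returns the character read, or WEOF on end of file or error. */
static wint_t read_wide_and_close(FILE *fp)
{
    wint_t wc = fgetwc(fp);
    fclose(fp);
    return wc;
}
```

In the reproducer this would replace the bare call on stdin: main() calls setlocale() as before and then read_wide_and_close(stdin) before returning 0.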

Hope this helps.

Regards,
Igor
Comment 5 Igor Liferenko 2018-12-20 07:04:01 UTC
I have done additional testing on the number of input bytes.
In the report below, each range is a number of input bytes and the
text after '=' is the result of executing the following command:

    printf '%0.s1' $(seq N) | ./a.out

where N is the desired number of input bytes.

0-9 = terminated normally

10-61 = *** stack smashing detected ***: <unknown> terminated Aborted

62-105 = Segmentation fault

106-121 = *** stack smashing detected ***: <unknown> terminated Aborted

122-137 = Segmentation fault

138-??? = terminated normally

etc...