Bug 13518 - iconv program doesn't handle //IGNORE flag correctly
Status: RESOLVED INVALID
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: 2.14
Importance: P2 normal
Target Milestone: ---
Assignee: Ulrich Drepper
Reported: 2011-12-18 22:33 UTC by Edward Z. Yang
Modified: 2014-06-27 11:27 UTC (History)
fweimer: security-


Attachments
Alpha, followed by a lot of x's (39 bytes, text/plain)
2011-12-18 22:55 UTC, Edward Z. Yang
Description Edward Z. Yang 2011-12-18 22:33:59 UTC
iconv seems to truncate input at around 8157 bytes when it contains characters that are invalid in the target character set, even if //IGNORE is specified.

Steps to reproduce:
1. Download iconv.html
ezyang@javelin:~$ wget http://www.oppcharts.com/iconv.html
2. Attempt to convert UTF-8 to iso-8859-1//IGNORE

Expected behavior (from libiconv-1.14):
ezyang@javelin:~/Dev/glibc/build$ ~/Desktop/libiconv-1.14/src/iconv_no_i18n -f utf-8 -t iso-8859-1//IGNORE ~/iconv.html | wc -c
15312

Actual behavior (from latest Git glibc-2.14-567-ga4647e7):
ezyang@javelin:~/Dev/glibc/build$ ./testrun.sh iconv/iconv_prog -f utf-8 -t iso-8859-1//IGNORE ~/iconv.html | wc -c
iconv/iconv_prog: illegal input sequence at position 8168
8157
Comment 1 Edward Z. Yang 2011-12-18 22:55:50 UTC
Created attachment 6117
Alpha, followed by a lot of x's

Here's a better, more minimal test-case.

ezyang@javelin:~$ Dev/glibc/build/testrun.sh Dev/glibc/build/iconv/iconv_prog -f utf-8 -t ascii//IGNORE < test.txt | wc -c
Dev/glibc/build/iconv/iconv_prog: illegal input sequence at position 8161
8159

ezyang@javelin:~$ Desktop/libiconv-1.14/src/iconv_no_i18n -f utf-8 -t ascii//IGNORE < test.txt | wc -c
11059
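
For anyone without the attachment, a file of the same shape can probably be rebuilt by hand. This is only a sketch: the attachment is described as an alpha followed by a lot of x's, and the count of x's below is an assumption chosen to match the libiconv output size above, not something read from the attachment itself.

```shell
# Hypothetical reconstruction of test.txt: one non-ASCII character
# followed by a long run of ASCII 'x'.
n=11059   # assumed number of x's; adjust as needed
{
  printf '\316\261'                      # U+03B1 GREEK SMALL LETTER ALPHA, 2 bytes in UTF-8
  head -c "$n" /dev/zero | tr '\0' 'x'   # n copies of 'x'
} > test.txt
wc -c < test.txt                         # total size in bytes (n + 2)
```

The point of the shape is that the only unconvertible character sits at position 0, while the input is longer than the roughly 8 KB at which the output above stops.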
Comment 2 Ulrich Drepper 2011-12-22 23:42:30 UTC
The iconv program cannot be used with the magic //IGNORE suffix.  You have to use the -c parameter.
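
As a minimal sketch of the -c form suggested here (the sample string is made up for illustration and is not from this report):

```shell
# 'caf\303\251' is "café" encoded as UTF-8; the last character has no
# ASCII equivalent. -c asks iconv to omit characters it cannot convert
# instead of stopping at them.
# The exit status is deliberately ignored: it may be non-zero on some
# implementations or versions even when -c is given.
out=$(printf 'caf\303\251\n' | iconv -c -f UTF-8 -t ASCII) || true
printf '%s\n' "$out"
```

The same option applies to the commands in this report, e.g. replacing `-t ascii//IGNORE` with `-c -t ascii`.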
Comment 3 Edward Z. Yang 2011-12-23 00:43:16 UTC
I think there is still a bug here. If //IGNORE is not supported by iconv_prog, then -t with //IGNORE should behave the same as -t without it. However, this is not the case:

ezyang@javelin:~$ Dev/glibc/build/testrun.sh Dev/glibc/build/iconv/iconv_prog -f utf-8 -t ascii//IGNORE < test.txt | wc -c
Dev/glibc/build/iconv/iconv_prog: illegal input sequence at position 8161
8159

ezyang@javelin:~$ Dev/glibc/build/testrun.sh Dev/glibc/build/iconv/iconv_prog -f utf-8 -t ascii < test.txt | wc -c
Dev/glibc/build/iconv/iconv_prog: illegal input sequence at position 0
0

For reference, here is iconv running with an invalid extra flag:

ezyang@javelin:~$ Dev/glibc/build/testrun.sh Dev/glibc/build/iconv/iconv_prog -f utf-8 -t ascii//FOOBAR < test.txt | wc -c
Dev/glibc/build/iconv/iconv_prog: illegal input sequence at position 0
0
Comment 4 Edward Z. Yang 2011-12-23 01:02:05 UTC
OK, I think I understand the underlying issue better. I'll file a new bug.