24973 – (CVE-2019-25013) iconv encounters segmentation fault when converting 0x00 0xfe in EUC-KR to UTF-8 (CVE-2019-25013)

Status:	RESOLVED FIXED

Alias:	CVE-2019-25013

Product:	glibc
Classification:	Unclassified
Component:	locale
Version:	2.30

Importance:	P2 normal
Target Milestone:	2.33
Assignee:	Not yet assigned to anyone

Reported:	2019-09-06 10:47 UTC by Arjun Shankar
Modified:	2021-10-01 02:03 UTC
CC List:	4 users

Flags:	fweimer: security+

Description Arjun Shankar 2019-09-06 10:47:28 UTC

The following equivalent iconv invocations lead to a SIGSEGV:

$ echo -en "\x00\xfe" | iconv -f EUC-KR -t "UTF-8//IGNORE"

$ echo -en "\x00\xfe" | iconv -c -f EUC-KR -t "UTF-8"
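
For scripted regression checks, the two invocations above can be wrapped in a small harness that reports whether iconv was killed by a signal. This is only a sketch: the helper names `describe_exit` and `run_iconv` are made up for illustration, and it assumes an `iconv` binary is on PATH.

```python
import shutil
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short verdict string."""
    if returncode < 0:
        # subprocess reports death by signal N as return code -N.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_iconv(data, args):
    """Feed `data` to iconv on stdin and return (returncode, stdout)."""
    proc = subprocess.run(["iconv", *args], input=data, capture_output=True)
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    if shutil.which("iconv") is None:
        print("iconv not found on PATH")
    else:
        # The same two option sets as in the report above.
        for args in (["-f", "EUC-KR", "-t", "UTF-8//IGNORE"],
                     ["-c", "-f", "EUC-KR", "-t", "UTF-8"]):
            code, _ = run_iconv(b"\x00\xfe", args)
            print(" ".join(args), "->", describe_exit(code))
```

On an affected glibc the verdict should read "killed by SIGSEGV"; on a build that contains the fix, iconv is expected to exit normally.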

Comment 1 Siddhesh Poyarekar 2020-12-21 03:37:02 UTC

Fixed in master:

https://sourceware.org/git/?p=glibc.git;a=commit;h=ee7a3144c9922808181009b7b3e50e852fb4999b

Author: Andreas Schwab <schwab@suse.de>
Date:   Mon Dec 21 08:56:43 2020 +0530

    Fix buffer overrun in EUC-KR conversion module (bz #24973)
    
    The byte 0xfe as input to the EUC-KR conversion denotes a user-defined
    area and is not allowed.  The from_euc_kr function used to skip two bytes
    when told to skip over the unknown designation, potentially running over
    the buffer end.
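
The overrun described in the commit message can be pictured with a toy model. The sketch below is not the glibc code: `scan` is a hypothetical stand-in for a decoder loop, and the only thing it models is how far the read position advances when a rejected 0xfe byte is skipped as one byte versus two.

```python
def scan(buf, skip_on_error):
    """Toy decoder loop: return the read position after the last step.

    Any 0xfe byte is treated as unconvertible and skipped by
    `skip_on_error` bytes; every other byte is consumed one at a time.
    """
    pos = 0
    while pos < len(buf):
        if buf[pos] == 0xFE:
            pos += skip_on_error
        else:
            pos += 1
    return pos


data = bytes([0x00, 0xFE])   # the two input bytes from the report
print(scan(data, 1))         # 2: stops exactly at the end of the buffer
print(scan(data, 2))         # 3: one byte past the end of the buffer
```

Python merely reports the overshoot. In C, assuming the conversion loop terminates on an equality test against the end pointer, a position that jumps past the end would not stop the loop and the next read would fall outside the buffer.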

Comment 2 soko246 2021-09-30 17:45:15 UTC

iconv produces corrupted output when the "-c" flag is used on input that mixes characters that *can* and *cannot* be converted.
The issue only manifests for rather large inputs (presumably > 32K).

Run in bash:
>export LANG=C
>perl -E 'say "\x58\xe2\x58\xc3\x92\x58\xe2\x58\x58\xe2\x58\xc3\x92\x58\xe2\x58\n" x 15000' | iconv -c -f ISO-8859-3 -t UTF-8 | sort | uniq -c

Expected output:
>15000 XâX�XâXXâX�XâX

Actual output:
> 1
> 2 XXâX�XâX
> 2 XâX�XXâX
> 2 XâX�XâX
> 1 XâX�XâXX
> 2 XâX�XâXXâX�X�XâXXâX�XâX
> 14917 XâX�XâXXâX�XâX

As can be seen, many lines simply disappear (14917+2+1+2+2+2+1 does not add up to 15000).
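
The shortfall can be worked out directly from the counts in the `uniq -c` output above; the variable names are only for illustration.

```python
# Counts copied from the "Actual output" listing, in the order shown.
counts = [1, 2, 2, 2, 1, 2, 14917]

total = sum(counts)
missing = 15000 - total

print(total, missing)  # prints: 14927 73
```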

The specific input does not matter, as long as it mixes convertible and non-convertible characters.
Reducing the number of input lines (e.g. to 1000) makes everything work as expected:
>1000 XâX�XâXXâX�XâX

I tried this for ISO-8859-3 and ISO-8859-8 (same input) with similar (wrong) results.

Using piconv (Perl variant of iconv) instead of iconv produces correct results.
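
A reference result can also be computed without iconv or piconv. The sketch below uses Python's own `iso8859_3` codec with `errors="ignore"` as a rough stand-in for `iconv -c`; treating the two as equivalent is an assumption, and the helper name `convert_dropping_invalid` is made up for illustration.

```python
# One input line, byte for byte as in the perl one-liner above.
LINE = b"\x58\xe2\x58\xc3\x92\x58\xe2\x58\x58\xe2\x58\xc3\x92\x58\xe2\x58\n"


def convert_dropping_invalid(data, source="iso8859_3"):
    """Decode `data`, silently dropping unconvertible bytes, and re-encode as UTF-8."""
    return data.decode(source, errors="ignore").encode("utf-8")


converted = convert_dropping_invalid(LINE * 15000)
lines = converted.split(b"\n")[:-1]   # drop the empty piece after the final newline
distinct = set(lines)

print(len(lines), len(distinct))
```

If every line is converted the same way, the set of distinct lines has exactly one member, which is what the expected `uniq -c` output above shows.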

Comment 3 Siddhesh Poyarekar 2021-10-01 02:03:48 UTC

Please file a separate bug for it.