Bug 28524 - Conversion from ISO-2022-JP-3 with iconv may emit spurious NUL character on state reset
Summary: Conversion from ISO-2022-JP-3 with iconv may emit spurious NUL character on s...
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: 2.35
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-11-01 17:48 UTC by Nikita Popov
Modified: 2023-05-22 19:04 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Proof of concept (519 bytes, text/x-csrc)
2021-11-01 17:48 UTC, Nikita Popov
Details
Proposed patch (2.43 KB, patch)
2021-11-04 05:18 UTC, Nikita Popov
Details | Diff
Mistake (244 bytes, text/plain)
2023-05-22 19:01 UTC, Bruno Haible
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nikita Popov 2021-11-01 17:48:27 UTC
Created attachment 13751 [details]
Proof of concept

Hello, while investigating CVE-2021-3326 patch [here: https://sourceware.org/git/?p=glibc.git;a=commit;h=7d88c6142c6efc160c0ee5e4f85cde382c072888] I've found that it is possible to force iconv() to emit spurious NUL character by converting from ISO-2022-JP-3 encoding and resetting internal state (inbuf = NULL). This is possible because code sequence in iconvdata/iso-2022-jp-3.c which is responsible for resetting internal state doesn't ensure that any character has actually been saved. The main condition [data->__statep->__count != ASCII_set] also picks a cases where current set is different from ASCII_set. I've written small proof-of-concept which is attached.
In a research environment, proof-of-concept program was compiled with following command sequence:
$ gcc iconv-issue.c -c -Wall -o /tmp/iconv-issue.o
$ gcc -o /tmp/iconv-issue -nostdlib -nostartfiles /root/glibc-build/csu/Scrt1.o /root/glibc-build/csu/crti.o `gcc  --print-file-name=crtbeginS.o` /tmp/iconv-issue.o -Wl,-rpath-link=/root/glibc-build:/root/glibc-build/math:/root/glibc-build/elf:/root/glibc-build/dlfcn:/root/glibc-build/nss:/root/glibc-build/nis:/root/glibc-build/rt:/root/glibc-build/resolv:/root/glibc-build/mathvec:/root/glibc-build/support:/root/glibc-build/crypt:/root/glibc-build/nptl /root/glibc-build/libc.so.6 `gcc  --print-file-name=crtendS.o` /root/glibc-build/csu/crtn.o
$ env GCONV_PATH=/root/glibc-build/iconvdata LOCPATH=/root/glibc-build/localedata LC_ALL=C /root/glibc-build/elf/ld-linux-x86-64.so.2 --library-path /root/glibc-build:/root/glibc-build/math:/root/glibc-build/elf:/root/glibc-build/dlfcn:/root/glibc-build/nss:/root/glibc-build/nis:/root/glibc-build/rt:/root/glibc-build/resolv:/root/glibc-build/mathvec:/root/glibc-build/support:/root/glibc-build/crypt:/root/glibc-build/nptl /tmp/iconv-issue
Here /root/glibc-build is a build root for up-to-date version of glibc.

The last command results in a following output:
Step#1
 Input characters consumed = 3
Output characters produced = 0
Step#2
Output characters produced = 1

With glibc lacking CVE-2021-3326 fix output is normal:
Step#1
 Input characters consumed = 3
Output characters produced = 0
Step#2
Output characters produced = 0

I assess that this behavior may affect data integrity in certain use patterns of iconv where extra data ('\0') is added.
Such NUL character may confuse programs expecting NUL as the data terminator.
Comment 1 Nikita Popov 2021-11-04 05:18:51 UTC
Created attachment 13756 [details]
Proposed patch

Submitted proposed patch to address this issue.
Comment 2 Florian Weimer 2021-11-04 19:32:50 UTC
Fixed for glibc 2.35 via:

commit ff012870b2c02a62598c04daa1e54632e020fd7d
Author: Nikita Popov <npv1310@gmail.com>
Date:   Tue Nov 2 13:21:42 2021 +0500

    gconv: Do not emit spurious NUL character in ISO-2022-JP-3 (bug 28524)
    
    Bugfix 27256 has introduced another issue:
    In conversion from ISO-2022-JP-3 encoding, it is possible
    to force iconv to emit extra NUL character on internal state reset.
    To do this, it is sufficient to feed iconv with escape sequence
    which switches active character set.
    The simplified check 'data->__statep->__count != ASCII_set'
    introduced by the aforementioned bugfix picks that case and
    behaves as if '\0' character has been queued thus emitting it.
    
    To eliminate this issue, these steps are taken:
    * Restore original condition
    '(data->__statep->__count & ~7) != ASCII_set'.
    It is necessary since bits 0-2 may contain
    number of buffered input characters.
    * Check that queued character is not NUL.
    Similar step is taken for main conversion loop.
    
    Bundled test case follows following logic:
    * Try to convert ISO-2022-JP-3 escape sequence
    switching active character set
    * Reset internal state by providing NULL as input buffer
    * Ensure that nothing has been converted.
    
    Signed-off-by: Nikita Popov <npv1310@gmail.com>
Comment 3 Siddhesh Poyarekar 2021-11-08 16:30:10 UTC
AFAICT the bug cannot be invoked through user input and requires iconv to be invoked with NULL inbuf, which ought to require a separate application bug to do so unintentionally.  Hence there's no security impact to the bug.
Comment 4 Nikita Popov 2021-11-08 17:24:26 UTC
"the bug cannot be invoked through user input and requires iconv to be invoked with NULL inbuf"
I never claimed opposite. I mentioned "certain use patterns" where reset operation on iconv state should ensue. But, considering the importance of the GLIBC project, I believe the issue in question is worth fixing.
Comment 5 Siddhesh Poyarekar 2021-11-08 17:37:00 UTC
(In reply to Nikita Popov from comment #4)
> operation on iconv state should ensue. But, considering the importance of
> the GLIBC project, I believe the issue in question is worth fixing.

Agreed, and thank you for the fix!
Comment 6 Florian Weimer 2021-11-08 17:54:16 UTC
(In reply to Nikita Popov from comment #4)
> "the bug cannot be invoked through user input and requires iconv to be
> invoked with NULL inbuf"
> I never claimed opposite. I mentioned "certain use patterns" where reset
> operation on iconv state should ensue. But, considering the importance of
> the GLIBC project, I believe the issue in question is worth fixing.

Just to provide some context: We are trying to explain here why this isn't a *security* bug (it has been flagged as a security issue elsewhere, presumably by accident because it was a regression introduced by a security fix). Of course it's a bug, and thank you for reporting and fixing it!

Without concrete evidence of application impact, I think this bug is just glibc computing an incorrect result. Any bug could theoretically introduce an application vulnerability, but we have to draw a line somewhere because otherwise the distinction between security and non-security bugs becomes meaningless.

As far as I understand it, this issue can only occur if the input sequence does not return to the initial shift state, which is already partially corrupted.  Otherwise there isn't any work left to do for the do_flush case in iconv/skeleton.c, and the bug does not materialize at all.
Comment 7 Bruno Haible 2023-05-22 19:01:12 UTC
Created attachment 14901 [details]
Mistake
Comment 8 Bruno Haible 2023-05-22 19:04:31 UTC
Comment on attachment 14901 [details]
Mistake

Forget the added attachment; it does not belong here.