Bug 29605 - Regression in NSCD backend of getaddrinfo
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: nscd
Version: 2.36
Importance: P2 critical
Target Milestone: 2.37
Assignee: Siddhesh Poyarekar
Reported: 2022-09-23 11:19 UTC by Jörg Sonnenberger
Modified: 2024-01-08 11:11 UTC

Last reconfirmed: 2022-09-23 00:00:00


Attachments
Restore correct index. (302 bytes, patch)
2022-09-23 11:19 UTC, Jörg Sonnenberger

Description Jörg Sonnenberger 2022-09-23 11:19:58 UTC
Created attachment 14354
Restore correct index.

When using getaddrinfo with address family hints while nscd is running, corrupted entries may be returned. In my case, "ssh -6 dual-stack-host" normally sees the A record first and the AAAA record second. The entry returned by getaddrinfo contains an address family of 0, which is nonsense.

Debugging points to getaddrinfo.c line 543 (refactored in e7e5315b7fa065a9c8bf525ca9a32f46fa4837e5). That use of count is highly suspicious and doesn't make sense to me. The attached patch shows what I mean, but I can't test it easily.
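
A minimal sketch of how the symptom described above could be checked from a test program. The helper name count_family_mismatches is hypothetical and not part of glibc; it only uses the standard getaddrinfo/freeaddrinfo calls. It resolves a host with an address family hint and counts returned entries whose ai_family differs from the hint. On an affected system with nscd's hosts cache enabled, a dual-stack host name queried with AF_INET6 would be expected to report at least one mismatch.

```c
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of glibc): resolve HOST with an address
   family hint and count the entries whose ai_family differs from the
   hint.  Returns -1 if getaddrinfo itself fails.  */
int
count_family_mismatches (const char *host, int family)
{
  struct addrinfo hints;
  struct addrinfo *res = NULL;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = family;        /* e.g. AF_INET6, as with "ssh -6" */
  hints.ai_socktype = SOCK_STREAM;

  if (getaddrinfo (host, NULL, &hints, &res) != 0)
    return -1;

  int mismatches = 0;
  for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next)
    if (ai->ai_family != family)   /* a family of 0 would be counted here */
      ++mismatches;

  freeaddrinfo (res);
  return mismatches;
}
```

Calling count_family_mismatches ("dual-stack-host", AF_INET6) from a small main and printing the result should be enough to compare behaviour with and without nscd running; a healthy resolver is expected to return 0.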
Comment 1 Siddhesh Poyarekar 2022-09-23 17:55:59 UTC
Thanks for catching that, it was indeed a typo and the fix looks correct.  I'll post it on list (with you as author) and get it into the main branch.  Can you please confirm that you're the author of that patch and own copyright to it?
Comment 2 Jörg Sonnenberger 2022-09-23 20:22:24 UTC
I'm the author and copyright holder of that patch. Thanks!
Comment 3 Holger Hoffstätte 2022-09-25 11:05:21 UTC
I'm not sure this fix is correct - or maybe it is and the real problem is somewhere else. I've been trying to figure out #29607, and with both this patch and the NULL check to strlen I still see messages like:

ping: unknown protocol family: 227

or

mtr: Packet type unsupported: Invalid argument

when trying to resolve multi-homed addresses.
Comment 4 Siddhesh Poyarekar 2022-09-26 13:35:54 UTC
Let me see if there are any other gotchas that I missed during the consolidation.
Comment 5 Siddhesh Poyarekar 2022-09-26 17:57:57 UTC
Holger, do you have a vm where I can login and debug this? I'm unable to reproduce this locally on Fedora with a built nscd.

Alternatively, please provide detailed steps for setup and reproduction so that I may try and replicate what you're doing.
Comment 6 Holger Hoffstätte 2022-09-26 18:13:40 UTC
(In reply to Siddhesh Poyarekar from comment #5)
> Holger, do you have a vm where I can login and debug this? I'm unable to
> reproduce this locally on Fedora with a built nscd.

No, I cannot provide a VM unfortunately. I was afraid this would happen as Gentoo's glibc has assorted patches post-release (see #29607 for a link) and I don't know why/how they were chosen; I'm just a (contributing) user. It might well be a screwup with those patches.

> Alternatively, please provide detailed steps for setup and reproduction so
> that I may try and replicate what you're doing.

I'm afraid that other than trying with glibc master and enabling the hosts cache (setting "enable-cache hosts yes" in nscd.conf), there is really nothing else I can suggest. I will report this back to the Gentoo glibc maintainers.
Best ignore me - thanks for trying!
Comment 7 Sam James 2022-09-26 18:15:24 UTC
(In reply to Holger Hoffstätte from comment #6)
> (In reply to Siddhesh Poyarekar from comment #5)
> > Holger, do you have a vm where I can login and debug this? I'm unable to
> > reproduce this locally on Fedora with a built nscd.
> 
> No, I cannot provide a VM unfortunately. I was afraid this would happen as
> Gentoo's glibc has assorted patches post-release (see #29607 for a link) and
> I don't know why/how they were chosen; I'm just a (contributing) user. It
> might well be a screwup with those patches.
> 

We don't really apply anything notable in Gentoo. You can try USE=vanilla if in doubt.
Comment 8 Holger Hoffstätte 2022-09-26 19:18:27 UTC
So my crashes/garbage data in packets indeed turned out to be caused by post-2.36 release patches from the backport branch, presumably the resolver rewrite. Building a completely vanilla 2.36 made everything work again, and nscd runs just fine with enabled host cache.
Cheers :)
Comment 9 Siddhesh Poyarekar 2022-09-26 19:21:54 UTC
(In reply to Holger Hoffstätte from comment #8)
> So my crashes/garbage data in packets indeed turned out to be caused by
> post-2.36 release patches from the backport branch, presumably the resolver
> rewrite. Building a completely vanilla 2.36 made everything work again, and
> nscd runs just fine with enabled host cache.
> Cheers :)

Thanks for confirming.  I'll close this one when I push Jörg's patch.  Perhaps you want to close out bug 29607 then.
Comment 10 Sourceware Commits 2022-09-28 16:47:44 UTC
The master branch has been updated by Siddhesh Poyarekar <siddhesh@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c9226c03da0276593a0918eaa9a14835183343e8

commit c9226c03da0276593a0918eaa9a14835183343e8
Author: Jörg Sonnenberger <joerg@bec.de>
Date:   Mon Sep 26 13:59:16 2022 -0400

    get_nscd_addresses: Fix subscript typos [BZ #29605]
    
    Fix the subscript on air->family, which was accidentally set to COUNT
    when it should have remained as I.
    
    Resolves: BZ #29605
    
    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
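
The class of typo named in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not the glibc code: filter_families and its parameters are hypothetical names. It copies matching entries from an input array into an output array, and the buggy variant indexes the input with the output counter (COUNT) instead of the loop variable (I).

```c
#include <stddef.h>
#include <sys/socket.h>

/* Toy model, not glibc code: copy the families that match WANTED from
   FAMILY[0..N-1] into OUT and return how many were kept.  When BUGGY is
   nonzero the source array is indexed with the output counter instead of
   the loop variable, mimicking the kind of typo the commit describes.  */
size_t
filter_families (const int *family, size_t n, int wanted, int *out,
                 int buggy)
{
  size_t count = 0;
  for (size_t i = 0; i < n; ++i)
    {
      if (family[i] != wanted)
        continue;   /* Skipped entries advance I but not COUNT.  */
      out[count] = buggy ? family[count] : family[i];
      ++count;
    }
  return count;
}
```

With the input { AF_INET, AF_INET6 } and AF_INET6 requested, the correct variant stores AF_INET6 in out[0], while the buggy variant stores AF_INET because COUNT is still 0 when I is 1. The two indices only diverge once an entry has been skipped, which matches the report that the problem shows up with address family hints. The exact wrong value seen in practice (a family of 0 in the original report) depends on the real code and data, which this model does not reproduce.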
Comment 11 Siddhesh Poyarekar 2022-09-28 16:48:35 UTC
Fixed on mainline.
Comment 12 Sourceware Commits 2022-09-28 16:49:22 UTC
The release/2.36/master branch has been updated by Siddhesh Poyarekar <siddhesh@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=227c9035872fc9e9e2cf56ec8f89219747ee19bc

commit 227c9035872fc9e9e2cf56ec8f89219747ee19bc
Author: Jörg Sonnenberger <joerg@bec.de>
Date:   Mon Sep 26 13:59:16 2022 -0400

    get_nscd_addresses: Fix subscript typos [BZ #29605]
    
    Fix the subscript on air->family, which was accidentally set to COUNT
    when it should have remained as I.
    
    Resolves: BZ #29605
    
    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
    (cherry picked from commit c9226c03da0276593a0918eaa9a14835183343e8)
Comment 13 Camila Camargo de Matos 2024-01-08 11:11:25 UTC
Hello,

When recently trying to patch CVE-2023-4806 in glibc for Ubuntu 22.04 LTS, the Ubuntu Security Team came across a possible regression in version 2.35 that seems to be related to this bug.

This is the link to the bug report containing more information on the issue that users came across in Ubuntu 22.04 LTS:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/2047155


When patching Ubuntu 22.04's version of glibc (2.35) for CVE-2023-4806 (as well as CVE-2023-4813 and CVE-2023-5156), several of the refactoring commits in branch release/2.35/master were also added, to avoid possible issues and simplify applying the CVE patch (these are the refactoring commits made to sysdeps/posix/getaddrinfo.c in 2023-09). This group included commit ce64e72b, which is cherry-picked from e7e5315b, the commit identified here as the cause of the nscd issue through a typo in the refactoring.

Analysis of the release/2.35/master branch seems to indicate that the fix for this typo was not applied to glibc 2.35, and the report in the Ubuntu Launchpad bug shows version 2.35 of glibc (more specifically, nscd) being affected by a regression when the previously mentioned refactoring commits are added.

A new version of the Ubuntu 22.04 glibc package will be released. It contains the fix provided in this sourceware bug (commit 227c9035) as well as three other refactoring commits, also backported from the release/2.36/master branch: bc0d18d8, 06890c7b and d3f2c2c8. Adding these changes to the 22.04 glibc 2.35 package seems to have resolved the issue reported in the Ubuntu Launchpad bug.

I mention this here in case 2.35 is still being supported, so that the fix to this issue can be included in that branch as well.

Regards,
Camila Camargo de Matos.