Created attachment 5855 [details]
Patch to fix the issue
Commit 4769ae77fc6c8dacea6476addb015c8797848cdd introduced a regression in the resolver code, which triggers an assertion under some conditions:
firefox-bin: res_query.c:251: __libc_res_nquery: Assertion `hp != hp2' failed. Aborting.
When the first answer is a SERVFAIL, NOTIMP or REFUSED, resplen is now set to 0, while recvresp1 or recvresp2 is set to 1:
/* No data from the first reply. */
resplen = 0;
When the second answer arrives, its buffer is placed at *ansp + resplen, which means that in this case *ansp and *ansp2 are equal:
*anssizp2 = orig_anssizp - resplen;
*ansp2 = *ansp + resplen;
Since a second answer has still been provided, hp2 is assigned *answerp2, which is the same as *answer (see above), so hp == hp2:
HEADER *hp2 = answerp2 ? (HEADER *) *answerp2 : hp;
This is enough to trigger the assertion. In other words, the checks on the answer buffers don't match the checks on the response lengths.
One way to fix that is to rewrite this part of the code so that all the checks are done on the response lengths. This is what the attached patch does.
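To make the aliasing concrete, here is a minimal standalone sketch of the buffer layout step and of a length-based check. It is not the glibc code and not the attached patch: struct answer_slots, place_second_answer, headers_alias and replies_are_distinct are hypothetical names used only for illustration, and the lengths are simplified to size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two answer slots; not the glibc types. */
struct answer_slots {
    unsigned char *ans;   /* start of the first answer */
    unsigned char *ans2;  /* start of the second answer */
    size_t anssiz2;       /* space left for the second answer */
};

/* Mirrors the layout step quoted above: the second answer starts
   resplen bytes into the shared buffer. */
static struct answer_slots
place_second_answer(unsigned char *buf, size_t orig_anssiz, size_t resplen)
{
    struct answer_slots s;

    assert(resplen <= orig_anssiz);
    s.ans = buf;
    s.ans2 = buf + resplen;
    s.anssiz2 = orig_anssiz - resplen;
    return s;
}

/* Returns 1 when both header pointers would be the same address,
   which is the condition that trips `hp != hp2'. */
static int
headers_alias(const struct answer_slots *s)
{
    return s->ans == s->ans2;
}

/* Length-based check (assumption: a reply counts only if its length
   is non-zero). Two distinct replies exist only if both have data. */
static int
replies_are_distinct(size_t resplen, size_t resplen2)
{
    return resplen > 0 && resplen2 > 0;
}
```

With resplen == 0 the two slots start at the same address, so any check that compares pointers sees a single buffer even though a second reply was received. A check based on the lengths does not depend on where the buffers happen to start.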
Created attachment 6118 [details]
testcase for this bug
This testcase triggers the bug using these nameservers:
(In reply to comment #1)
> Created attachment 6118 [details]
> testcase for this bug
> This testcase triggers the bug using these nameservers:
> nameserver 22.214.171.124
> nameserver 126.96.36.199
No, it doesn't. I don't see any assert.
I see this too (as does everyone else who uses a non-patched glibc). Hope to see this fixed in 2.15.1
I was lazy and edited an scp command in my shell history into an ssh command, but forgot to remove the ':', e.g.:
scp file host:/big/long/path
ssh host: command /big/long/path
That ssh command resulted in the assert noted here, but only on the first try. I'm guessing the query result is cached, so subsequent attempts take a different code path and give the more expected output:
ssh: Could not resolve hostname host:: Name or service not known
Tim, what are the contents of your resolv.conf?
This issue is highly dependent on the nameservers you use and unfortunately nobody's been able to trigger it on public nameservers. I worked with Fernando on analysis in the past, but wasn't able to wrap up before having to switch to another issue.
(In reply to comment #5)
> Tim, what are the contents of your resolv.conf?
I'm using connman, so my /etc/resolv.conf has simply:
# Generated by Connection Manager
connman then uses whatever company-internal DNS server(s) the local corporate DHCP server told it to use...
It's quite easy for me to reproduce, with the exception of not having figured out what's being seemingly cached or where. If there's anything you'd suggest for added instrumentation in glibc, to look for in the corefile, look for in a tcpdump, etc., let me know and I will dig a layer deeper.
Unfortunately, we're obviously not going to be able to access the DNS server running on your local host.
It's certainly relatively easy to reproduce once you've got a nameserver which triggers the problem -- but that's been the incredibly frustrating problem here. Every such name server has been behind a firewall.
Given there's a potential patch attached to this BZ, what's really needed is for someone with better knowledge of this code to review that patch. I don't feel qualified to do that review.
Fixed in glibc-2.17
*** Bug 260998 has been marked as a duplicate of this bug. ***
I think this can be triggered by (untrusted) authoritative servers with some resolvers, so this is a denial-of-service vulnerability.
Commit 4769ae77fc6c8dacea6476addb015c8797848cdd (see comment #1) went into glibc 2.14, so versions from 2.14 to 2.16 are affected by this.
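As a rough aid for checking a version string against that range, here is a small sketch. glibc_version_affected is a hypothetical helper, not part of glibc, and it assumes an upstream-style "major.minor" version string; distribution packages may carry the fix as a backport, so treat the result as a hint only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the version string falls in the
   2.14 to 2.16 range named above, 0 otherwise (including strings
   that cannot be parsed). Trailing parts such as ".1" are ignored. */
static int
glibc_version_affected(const char *version)
{
    int major = 0, minor = 0;

    assert(version != NULL);
    if (sscanf(version, "%d.%d", &major, &minor) != 2)
        return 0;
    return major == 2 && minor >= 14 && minor <= 16;
}
```

On a glibc system the string could come from gnu_get_libc_version(), declared in <gnu/libc-version.h>.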
Just for reference since I did not see it in this bug, the commit was done as: