Sources Bugzilla – Full Text Bug Listing
Summary: assertion error in res_query.c
Product: glibc
Component: network
Severity: normal
Reporter: Aurelien Jarno <aurelien>
Assignee: Ulrich Drepper <drepper.fsp>
CC: allan, bugzilla, davem, law, mmarek, nick.jones, pluto, timothy.c.pepper, toolchain
Attachments:
Patch to fix the issue (attachment 5855)
testcase for this bug (attachment 6118)
Description Aurelien Jarno 2011-07-21 16:46:14 UTC
Created attachment 5855 [details]
Patch to fix the issue

Commit 4769ae77fc6c8dacea6476addb015c8797848cdd introduced a regression in the resolver code, which triggers an assertion under some conditions:

  firefox-bin: res_query.c:251: __libc_res_nquery: Assertion `hp != hp2' failed.
  Aborting.

When the first answer is a SERVFAIL, NOTIMP or REFUSED, resplen is now set to 0, while recvresp1 or recvresp2 is set to 1:

  /* No data from the first reply. */
  resplen = 0;

When the second answer arrives, its buffer is allocated at *ansp + resplen, which in that case means *ansp and *ansp2 are equal:

  *anssizp2 = orig_anssizp - resplen;
  *ansp2 = *ansp + resplen;

Because a second answer has still been provided, hp2 is assigned *answerp2, which is the same as *answer (see above), so hp == hp2:

  HEADER *hp2 = answerp2 ? (HEADER *) *answerp2 : hp;

This is enough to trigger the assertion: the checks on the answer buffers do not match the checks on the response lengths. One way to fix this is to rewrite this part of the code so that all the checks are done on the response lengths. This is what the attached patch does.
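The aliasing described above can be sketched in isolation. This is a simplified illustration, not the glibc source: struct answers, layout_second_answer() and headers_alias() are hypothetical names, and only the two pointer assignments quoted in the description are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two-answer bookkeeping; not glibc code. */
struct answers {
    unsigned char *ans;   /* start of the first answer  (*ansp)  */
    unsigned char *ans2;  /* start of the second answer (*ansp2) */
    size_t anssiz2;       /* room left for the second answer     */
};

/* Place the second answer directly after the first one, mirroring
     *anssizp2 = orig_anssizp - resplen;
     *ansp2 = *ansp + resplen;                                     */
static struct answers
layout_second_answer(unsigned char *buf, size_t orig_anssiz, size_t resplen)
{
    struct answers a;

    assert(resplen <= orig_anssiz);
    a.ans = buf;
    a.ans2 = buf + resplen;
    a.anssiz2 = orig_anssiz - resplen;
    return a;
}

/* Returns 1 when both header pointers refer to the same address,
   i.e. when a check like assert(hp != hp2) would fail. */
static int
headers_alias(const struct answers *a)
{
    const void *hp = a->ans;
    const void *hp2 = a->ans2;

    return hp == hp2;
}
```

With a non-zero resplen the two answers occupy distinct addresses. With resplen == 0 (the SERVFAIL, NOTIMP or REFUSED case) headers_alias() reports that both pointers are the same, which is the condition the assertion rejects.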
Comment 1 Fernando Herrera 2011-12-19 03:22:03 UTC
Created attachment 6118 [details]
testcase for this bug

This testcase triggers the bug using these nameservers:

nameserver 188.8.131.52
nameserver 184.108.40.206
Comment 2 Ulrich Drepper 2011-12-22 00:04:28 UTC
(In reply to comment #1)
> Created attachment 6118 [details]
> testcase for this bug
>
> This testcase triggers the bug using these nameservers:
>
> nameserver 220.127.116.11
> nameserver 18.104.22.168

No, it doesn't. I don't see any assert.
Comment 3 bugzilla 2012-05-17 18:38:26 UTC
I see this too (as does everyone else who uses a non-patched glibc). Hope to see this fixed in 2.15.1
Comment 4 Tim Pepper 2012-10-09 21:33:47 UTC
I was lazy and edited an scp command in my shell history to be an ssh command, but forgot to remove the ':', e.g.:

  scp file host:/big/long/path
  ssh host: command /big/long/path

That ssh command resulted in the assert noted here, but only on the first try. I'm guessing the query/result is cached, so later attempts traverse different code and give a more expected output like:

  ssh: Could not resolve hostname host:: Name or service not known
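A lookup like the one ssh performs can be tried without ssh. The sketch below is a hypothetical helper (try_resolve() is not part of any library) that calls getaddrinfo() with AF_UNSPEC. It assumes that on a typical glibc system this requests both IPv4 and IPv6 addresses and so exercises the two-answer resolver path. Whether it actually hits the assertion depends on the nameserver, as noted elsewhere in this report.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve NAME for both address families.
   Returns 0 on success or the EAI_* error code from getaddrinfo(). */
static int
try_resolve(const char *name)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(name != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* ask for IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    return rc;
}
```

Calling try_resolve("host:") from a small main() and passing a non-zero result to gai_strerror() should produce a message comparable to the ssh output quoted above on an unaffected system.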
Comment 5 law 2012-10-10 00:31:40 UTC
Tim, what are the contents of your resolv.conf? This issue is highly dependent on the nameservers you use and unfortunately nobody's been able to trigger on public nameservers. I worked with Fernando on analysis in the past, but wasn't able to wrap up before having to switch to another issue.
Comment 6 Tim Pepper 2012-10-10 17:28:47 UTC
(In reply to comment #5)
> Tim, what are the contents of your resolv.conf?

I'm using connman, so my /etc/resolv.conf has simply:

  # Generated by Connection Manager
  nameserver 127.0.0.1

connman's then using whatever company internal dns server(s) the local corporate dhcp server told it to use...

It's quite easy for me to reproduce, with the exception of not having figured out what's being seemingly cached or where. If there's anything you'd suggest for added instrumentation in glibc, to look for in the corefile, look for in a tcpdump, etc., let me know and I will dig a layer deeper.
Comment 7 law 2012-10-11 18:35:48 UTC
Unfortunately, we're obviously not going to be able to access the DNS server running on your local host. It's certainly relatively easy to reproduce once you've got a nameserver which triggers the problem -- but that's been the incredibly frustrating problem here. Every such name server has been behind a firewall. Given there's a potential patch attached to this BZ, what's really needed is for someone with better knowledge of this code to review that patch. I don't feel qualified to do that review.
Comment 8 David S. Miller 2012-11-30 20:05:56 UTC
Fixed in glibc-2.17