Bug 13013 - assertion error in res_query.c
Summary: assertion error in res_query.c
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: network
Version: 2.16
Importance: P2 normal
Target Milestone: 2.17
Assignee: Ulrich Drepper
Reported: 2011-07-21 16:46 UTC by Aurelien Jarno
Modified: 2015-05-14 23:51 UTC
fweimer: security+


Attachments
Patch to fix the issue (762 bytes, patch)
2011-07-21 16:46 UTC, Aurelien Jarno
testcase for this bug (217 bytes, text/plain)
2011-12-19 03:22 UTC, Fernando Herrera

Description Aurelien Jarno 2011-07-21 16:46:14 UTC
Created attachment 5855
Patch to fix the issue

Commit 4769ae77fc6c8dacea6476addb015c8797848cdd introduced a regression in the resolver code, which triggers an assertion under some conditions:

firefox-bin: res_query.c:251: __libc_res_nquery: Assertion `hp != hp2' failed. Aborting.

When the first answer is a SERVFAIL, NOTIMP or REFUSED, resplen is now set to 0, while recvresp1 or recvresp2 is set to 1:

  /* No data from the first reply.  */
  resplen = 0;

When the second answer arrives, its buffer is allocated at *ansp + resplen, which means that in this case *ansp and *ansp2 are equal:

  *anssizp2 = orig_anssizp - resplen;
  *ansp2 = *ansp + resplen;

Since a second answer has nevertheless been provided, hp2 is assigned *answerp2, which is the same as *answer (see above), so hp == hp2.

  HEADER *hp2 = answerp2 ? (HEADER *) *answerp2 : hp;

This is enough to trigger the assertion; that is, the checks on the answer buffers do not match the checks on the response lengths.

One way to fix that is to rewrite this part of the code to do all the checks on the response lengths. This is what the attached patch does.
Comment 1 Fernando Herrera 2011-12-19 03:22:03 UTC
Created attachment 6118
testcase for this bug

This testcase triggers the bug using these nameservers:

nameserver 87.216.1.65
nameserver 87.216.1.66
Comment 2 Ulrich Drepper 2011-12-22 00:04:28 UTC
(In reply to comment #1)
> Created attachment 6118
> testcase for this bug
> 
> This testcase triggers the bug using these nameservers:
> 
> nameserver 87.216.1.65
> nameserver 87.216.1.66

No, it doesn't.  I don't see any assert.
Comment 3 bugzilla 2012-05-17 18:38:26 UTC
I see this too (as does everyone else who uses a non-patched glibc). Hope to see this fixed in 2.15.1
Comment 4 Tim Pepper 2012-10-09 21:33:47 UTC
I was lazy and edited an scp command in my shell history to be an ssh command, but forgot to remove the ':', e.g.:

   scp file host:/big/long/path
   ssh host: command /big/long/path

That ssh command resulted in the assert noted here, but only on a first try.  I'm guessing the query/result is cached and traverses different code subsequently and gives a more expected output like:

   ssh: Could not resolve hostname host:: Name or service not known
Comment 5 law 2012-10-10 00:31:40 UTC
Tim, what are the contents of your resolv.conf?  

This issue is highly dependent on the nameservers you use, and unfortunately nobody has been able to trigger it on public nameservers.  I worked with Fernando on the analysis in the past, but wasn't able to wrap it up before having to switch to another issue.
Comment 6 Tim Pepper 2012-10-10 17:28:47 UTC
(In reply to comment #5)
> Tim, what are the contents of your resolv.conf?  

I'm using connman, so my /etc/resolv.conf has simply:
    # Generated by Connection Manager
    nameserver 127.0.0.1

connman's then using whatever company internal dns server(s) the local corporate dhcp server told it to use...

It's quite easy for me to reproduce, with the exception of not having figured out what's being seemingly cached or where.  If there's anything you'd suggest for added instrumentation in glibc, to look for in the corefile, look for in a tcpdump, etc., let me know and I will dig a layer deeper.
Comment 7 law 2012-10-11 18:35:48 UTC
Unfortunately, we're obviously not going to be able to access the DNS server running on your local host.

It's certainly relatively easy to reproduce once you've got a nameserver which triggers the problem -- but that's been the incredibly frustrating problem here.  Every such name server has been behind a firewall.

Given there's a potential patch attached to this BZ, what's really needed is for someone with better knowledge of this code to review that patch.  I don't feel qualified to do that review.
Comment 8 David S. Miller 2012-11-30 20:05:56 UTC
Fixed in glibc-2.17
Comment 10 Florian Weimer 2014-06-27 12:52:54 UTC
I think this can be triggered by (untrusted) authoritative servers with some resolvers, so this is a denial-of-service vulnerability.

Commit 4769ae77fc6c8dacea6476addb015c8797848cdd (see comment #1) went into glibc 2.14, so versions from 2.14 to 2.16 are affected by this.
Comment 11 Andrew Pinski 2015-05-14 23:51:11 UTC
Just for reference since I did not see it in this bug, the commit was done as:
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=cc8bb21c8ad619148c022af6e39ca8a5086a6a88