Bug 12994 - getaddrinfo fails if response records returned in wrong order and one of them is server failure
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: network
Version: 2.14
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Duplicates: 13651
Depends on:
Blocks:
 
Reported: 2011-07-13 04:22 UTC by jik@kamens.us
Modified: 2014-06-27 12:55 UTC
CC List: 9 users

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
tcpdump capture from getaddrinfo en.wikipedia.org (5.49 KB, application/x-pcap)
2011-07-13 04:22 UTC, jik@kamens.us
test program (448 bytes, text/plain)
2011-07-13 04:22 UTC, jik@kamens.us
another test (930 bytes, text/x-csrc)
2012-11-04 10:53 UTC, karme

Description jik@kamens.us 2011-07-13 04:22:14 UTC
Created attachment 5848
tcpdump capture from getaddrinfo en.wikipedia.org

A program calls getaddrinfo.

Deep within the bowels of the resolver library, __libc_res_nquery in res_query.c creates two queries, an A query and an AAAA query.

Deeper within the bowels of the resolver library, send_dg in res_send.c sends both queries and waits for responses. My name server sends the response to the *second* query *first*, and it's a server failure. I'm pretty sure that if the responses were sent in the reverse order, the problem would not occur.

At this point things get all screwed up. I'm not sure whether the problem is in send_dg, __libc_res_nsend, or __libc_res_nquery. I've spent hours poring over the code trying to figure out who is at fault. I can't, because this is some of the most poorly written code I've looked at in a very long time. It's completely incomprehensible and most of its "cleverness" is inadequately documented.

Anyway, by the time status results bubble back up to getaddrinfo, the code has decided that it was unable to resolve the host name to an address, even though one of the two responses that came back from the DNS server had a valid A record in it.

Test case? Run getaddrinfo on en.wikipedia.org immediately after restarting your name server. I'm using BIND 9.8.0-7.P4.fc15.x86_64; I don't know how universal this behavior is. I am attaching a wireshark dump from the virtual interface that captures both my loopback interface (on which my client is making its queries) and the queries my DNS server is making to try to satisfy the local queries. And here's what my test program (which I will also attach) prints as output:

Wed Jul 13 00:14:18 2011: getaddrinfo: Name or service not known

Note that if you run the exact same getaddrinfo call a second time immediately afterwards it works, because the previous successful query response, which is a CNAME, is cached and gets returned in response to both the A and AAAA queries.
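
For readers without access to the attachment, here is a minimal sketch of a test along these lines. It is not the attached program: the function name `try_resolve` and the output format are assumptions chosen to resemble the line printed above. The relevant detail is the `AF_UNSPEC` hint, which is what makes glibc send the A and AAAA queries together.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper (not the attached test program): resolve HOST
   with AF_UNSPEC, so that both A and AAAA records are requested, and
   print the outcome.  Returns the getaddrinfo result code, which is 0
   on success.  */
int
try_resolve (const char *host)
{
  struct addrinfo hints;
  struct addrinfo *res = NULL;
  char stamp[64] = "";
  time_t now = time (NULL);
  struct tm *tm = localtime (&now);
  int rc;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;

  /* Timestamp in roughly the style of the output quoted in the report.  */
  if (tm != NULL)
    strftime (stamp, sizeof stamp, "%a %b %d %H:%M:%S %Y", tm);

  rc = getaddrinfo (host, NULL, &hints, &res);
  if (rc != 0)
    printf ("%s: getaddrinfo: %s\n", stamp, gai_strerror (rc));
  else
    {
      printf ("%s: getaddrinfo: success\n", stamp);
      freeaddrinfo (res);
    }
  return rc;
}
```

Calling `try_resolve ("en.wikipedia.org")` twice in a row from `main`, right after restarting the name server, would correspond to the two runs described here.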

Since this bug causes DNS queries that should succeed to fail in a very user-visible way, I'm tempted to set it to critical, but I suppose since there's no permanent loss of data it isn't actually. I don't know, tough call.
Comment 1 jik@kamens.us 2011-07-13 04:22:57 UTC
Created attachment 5849
test program
Comment 2 jik@kamens.us 2011-07-13 04:24:43 UTC
By the way, a workaround for the problem is putting "options single-request" in /etc/resolv.conf.
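
For a program that cannot edit /etc/resolv.conf, something similar can possibly be done per process. The sketch below assumes a glibc whose `<resolv.h>` defines `RES_SNGLKUP`, the flag that the "single-request" option sets; the helper name `force_single_request` is hypothetical, and whether getaddrinfo honours a flag set this way should be verified on the glibc version in use.

```c
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Hypothetical helper: ask the calling thread's resolver state to send
   the A and AAAA queries one after the other instead of in parallel.
   Returns 0 on success, -1 if the resolver could not be initialised or
   if this libc does not provide RES_SNGLKUP.  */
int
force_single_request (void)
{
#ifdef RES_SNGLKUP
  if (res_init () != 0)
    return -1;
  _res.options |= RES_SNGLKUP;
  return 0;
#else
  return -1;
#endif
}
```

The helper would be called once, before the first getaddrinfo call in each thread that does lookups.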
Comment 3 karme 2012-11-04 10:49:15 UTC
First: I didn't test with the latest glibc because I failed to compile it.
But I am quite sure the bug is still present and quite severe. It happens not only if the order is wrong; it also happens if there is no answer to the A record request (the request or response is lost, the DNS server of a typical home router is too slow, ...).

I will attach a test with some comments.
Comment 4 karme 2012-11-04 10:53:17 UTC
Created attachment 6714
another test
Comment 5 karme 2012-11-06 11:49:19 UTC
I now believe packet reordering is not enough to reproduce the problem. I have written a small DNS proxy for better testing. The simplest scenario to reproduce the problem is to drop all A record requests and answer only the AAAA request.

Answer for getaddrinfo with hints.ai_family = AF_UNSPEC is then error: r=-2 Name or service not known.

Traffic is like:

12.786202    127.0.0.1 -> 127.0.0.1    DNS 68 Standard query 0x9f5f  A karme.de
12.786962    127.0.0.1 -> 127.0.0.1    DNS 68 Standard query 0x77c8  AAAA karme.de
14.896700    127.0.0.1 -> 127.0.0.1    DNS 119 Standard query response 0x77c8 
17.788941    127.0.0.1 -> 127.0.0.1    DNS 68 Standard query 0x9f5f  A karme.de
22.794223    127.0.0.1 -> 127.0.0.1    DNS 68 Standard query 0x9f5f  A karme.de
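
In the trace, the only response carries ID 0x77c8 (the AAAA query), while the A query 0x9f5f is retransmitted and never answered. As a rough sketch of how a response is tied to its query, the helpers below read the ID, the QR bit and the RCODE from the fixed 12-byte DNS header. The function names are hypothetical and this is not code from the proxy or from glibc.

```c
#include <stddef.h>

/* Hypothetical helpers for reading the fixed 12-byte DNS header.  */

/* Return the 16-bit query ID (stored big-endian on the wire), or -1 if
   the buffer is too short to hold a DNS header.  */
int
dns_msg_id (const unsigned char *buf, size_t len)
{
  if (len < 12)
    return -1;
  return (buf[0] << 8) | buf[1];
}

/* Return the 4-bit RCODE (0 = NOERROR, 2 = SERVFAIL, 3 = NXDOMAIN), or
   -1 if the buffer is too short.  */
int
dns_msg_rcode (const unsigned char *buf, size_t len)
{
  if (len < 12)
    return -1;
  return buf[3] & 0x0f;
}

/* Return 1 if the message is a response (QR bit set) whose ID equals
   ID, and 0 otherwise.  */
int
dns_is_response_to (const unsigned char *buf, size_t len, int id)
{
  return len >= 12 && (buf[2] & 0x80) != 0 && dns_msg_id (buf, len) == id;
}
```

With these, the single response above would match query 0x77c8 and not 0x9f5f, which is the situation in which the resolver has one usable answer and one outstanding query.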
Comment 6 law 2012-11-06 13:22:26 UTC
Just one question (because I believe this bug is ultimately a duplicate of another existing issue), what is the failure mode you're seeing?  ie, do you hit an assert, abort, segfault, error code, whatever.
Comment 7 karme 2012-11-07 17:32:42 UTC
(In reply to comment #6)
> Just one question (because I believe this bug is ultimately a duplicate of
> another existing issue), what is the failure mode you're seeing?  ie, do you
> hit an assert, abort, segfault, error code, whatever.

getaddrinfo returns error code EAI_NONAME when it should return EAI_AGAIN
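
The distinction matters to callers: EAI_AGAIN marks a temporary failure that is worth retrying, while EAI_NONAME says the name does not exist. A minimal sketch of a caller that relies on this follows; the names `lookup_should_retry` and `resolve_with_retry` are hypothetical, as is the retry policy.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical policy: only EAI_AGAIN signals a temporary failure.  */
int
lookup_should_retry (int gai_rc)
{
  return gai_rc == EAI_AGAIN;
}

/* Hypothetical wrapper: retry an AF_UNSPEC lookup up to MAX_TRIES times
   while the failure is reported as temporary.  Returns the last
   getaddrinfo result code.  On success (0), *RES must be released with
   freeaddrinfo.  */
int
resolve_with_retry (const char *host, int max_tries, struct addrinfo **res)
{
  struct addrinfo hints;
  int rc = EAI_FAIL;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;

  for (int i = 0; i < max_tries; i++)
    {
      rc = getaddrinfo (host, NULL, &hints, res);
      if (!lookup_should_retry (rc))
        break;
    }
  return rc;
}
```

A caller written this way gives up immediately on EAI_NONAME, so a lost A response reported as EAI_NONAME is never retried.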
Comment 8 karme 2012-11-27 12:02:30 UTC
(In reply to comment #6)
> Just one question (because I believe this bug is ultimately a duplicate of
> another existing issue), what is the failure mode you're seeing?  ie, do you
> hit an assert, abort, segfault, error code, whatever.

Which is the bug number you think this is a duplicate of?
Comment 9 law 2012-11-27 16:51:41 UTC
I thought it might be a duplicate of 13013, 13651 or another (# escapes me) in the Red Hat bugzilla database.   Based on the information you provided in c#7 I believe this is a separate issue.
Comment 10 Pavel Šimerda 2012-12-16 15:54:57 UTC
Splitting this bug report so that it only refers to literal address translation.

For /etc/hosts resolutions, see:

http://sourceware.org/bugzilla/show_bug.cgi?id=14966

For other name resolution problems, search for the bug report and file a new one if you don't find it.

That means that if Tore's patch works, this bug can be closed and the other one would be tracked separately.
Comment 11 Siddhesh Poyarekar 2014-04-16 07:34:37 UTC
I've posted a patch[1] that fixes bug 14308, which I think should fix this bug too.  I have tested using some convoluted packet mangling, so I would appreciate it if someone does an actual test with their nameservers.

[1] https://sourceware.org/ml/libc-alpha/2014-04/msg00302.html
Comment 12 Siddhesh Poyarekar 2014-04-16 09:11:35 UTC
*** Bug 13651 has been marked as a duplicate of this bug. ***
Comment 13 Siddhesh Poyarekar 2014-04-30 06:38:20 UTC
Fixed in master:

 Do not fail if one of the two responses to AF_UNSPEC fails (BZ #14308)
    
    [Fixes BZ #14308, #12994, #13651]
    
    AF_UNSPEC results in sending two queries in parallel, one for the A
    record and the other for the AAAA record.  If one of these is a
    referral, then the query fails, which is wrong.  It should return at
    least the one successful response.
    
    The fix has two parts.  The first part makes the referral fall back to
    the SERVFAIL path, which results in using the successful response.
    There is a bug in that path however, due to which the second part is
    necessary.  The bug here is that if the first response is a failure
    and the second succeeds, __libc_res_nsearch does not detect that and
    assumes a failure.  The case where the first response is a success and
    the second fails, works correctly.
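
The intended outcome can be illustrated with a small sketch. This is not the code in resolv/res_query.c or resolv/res_send.c; the enum and the function name `combine_unspec` are hypothetical and only express the rule that the order of arrival must not matter.

```c
/* Hypothetical per-query outcome; these are not glibc types.  */
enum query_result { QUERY_OK, QUERY_FAILED };

/* An AF_UNSPEC lookup should succeed if either the A or the AAAA query
   produced a usable answer, whichever response arrived first.  */
enum query_result
combine_unspec (enum query_result first, enum query_result second)
{
  if (first == QUERY_OK || second == QUERY_OK)
    return QUERY_OK;
  return QUERY_FAILED;
}
```

The case described above as broken corresponds to `combine_unspec (QUERY_FAILED, QUERY_OK)`, which has to yield `QUERY_OK` just as the reverse order does.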
    
    This condition is produced by buggy routers, so here's a crude
    interposable library that can simulate such a condition.  The library
    overrides the recvfrom syscall and modifies the header of the packet
    received to reproduce this scenario.  It has two key variables:
    mod_packet and first_error.
    
    The mod_packet variable when set to 0, results in odd packets being
    modified to be a referral.  When set to 1, even packets are modified
    to be a referral.
    
    The first_error variable causes the first response to be a failure so
    that a domain-appended search is performed to test the second part of
    the __libc_res_nsearch fix.
    
    The driver for this fix is a simple getaddrinfo program that does an
    AF_UNSPEC query.  I have omitted this since it should be easy to
    implement.
    
    I have tested this on x86_64.
    
    The interceptor library source:
    
    /* Override recvfrom and modify the header of the first DNS response to make it
       a referral and reproduce bz #845218.  We have to resort to this ugly hack
       because we cannot make bind return the buggy response of a referral for the
       AAAA record and an authoritative response for the A record.  */
    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <stdio.h>
    #include <stdbool.h>
    #include <endian.h>
    #include <dlfcn.h>
    #include <stdlib.h>
    
    /* Lifted from resolv/arpa/nameser_compat.h.  */
    typedef struct {
        unsigned        id :16;         /*%< query identification number */
    #if BYTE_ORDER == BIG_ENDIAN
        /* fields in third byte */
        unsigned        qr :1;          /*%< response flag */
        unsigned        opcode :4;      /*%< purpose of message */
        unsigned        aa :1;          /*%< authoritative answer */
        unsigned        tc :1;          /*%< truncated message */
        unsigned        rd :1;          /*%< recursion desired */
        /* fields in fourth byte */
        unsigned        ra :1;          /*%< recursion available */
        unsigned        unused :1;      /*%< unused bits (MBZ as of 4.9.3a3) */
        unsigned        ad :1;          /*%< authentic data from named */
        unsigned        cd :1;          /*%< checking disabled by resolver */
        unsigned        rcode :4;       /*%< response code */
    #endif
    #if BYTE_ORDER == LITTLE_ENDIAN || BYTE_ORDER == PDP_ENDIAN
        /* fields in third byte */
        unsigned        rd :1;          /*%< recursion desired */
        unsigned        tc :1;          /*%< truncated message */
        unsigned        aa :1;          /*%< authoritative answer */
        unsigned        opcode :4;      /*%< purpose of message */
        unsigned        qr :1;          /*%< response flag */
        /* fields in fourth byte */
        unsigned        rcode :4;       /*%< response code */
        unsigned        cd :1;          /*%< checking disabled by resolver */
        unsigned        ad :1;          /*%< authentic data from named */
        unsigned        unused :1;      /*%< unused bits (MBZ as of 4.9.3a3) */
        unsigned        ra :1;          /*%< recursion available */
    #endif
        /* remaining bytes */
        unsigned        qdcount :16;    /*%< number of question entries */
        unsigned        ancount :16;    /*%< number of answer entries */
        unsigned        nscount :16;    /*%< number of authority entries */
        unsigned        arcount :16;    /*%< number of resource entries */
    } HEADER;
    
    static int done = 0;
    
    /* Packets to modify.  0 for the odd packets and 1 for even packets.  */
    static const int mod_packet = 0;
    
    /* Set to true if the first request should result in an error, resulting in a
       search query.  */
    static bool first_error = true;
    
    static ssize_t (*real_recvfrom) (int sockfd, void *buf, size_t len, int flags,
    			  struct sockaddr *src_addr, socklen_t *addrlen);
    
    void
    __attribute__ ((constructor))
    init (void)
    {
      real_recvfrom = dlsym (RTLD_NEXT, "recvfrom");
    
      if (real_recvfrom == NULL)
        {
          printf ("Failed to get reference to recvfrom: %s\n", dlerror ());
          printf ("Cannot simulate test\n");
          abort ();
        }
    }
    
    /* Modify the second packet that we receive to set the header in a manner as to
       reproduce BZ #845218.  */
    static void
    mod_buf (HEADER *h, int port)
    {
      if (done % 2 == mod_packet || (first_error && done == 1))
        {
          printf ("(Modifying header)");
    
          if (first_error && done == 1)
    	h->rcode = 3;	/* NXDOMAIN == 3.  */
          else
    	h->rcode = 0;	/* NOERROR == 0.  */
          h->ancount = 0;
          h->aa = 0;
          h->ra = 0;
          h->arcount = 0;
        }
      done++;
    }
    
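    ssize_t
    recvfrom (int sockfd, void *buf, size_t len, int flags,
    	  struct sockaddr *src_addr, socklen_t *addrlen)
    {
      ssize_t ret = real_recvfrom (sockfd, buf, len, flags, src_addr, addrlen);
      int port = 0;
    
      /* Nothing to modify on error or if the packet cannot hold a DNS header.  */
      if (ret < (ssize_t) sizeof (HEADER))
        return ret;
    
      /* The caller may pass a NULL src_addr; only decode IPv4 senders.  */
      if (src_addr != NULL && src_addr->sa_family == AF_INET)
        {
          struct sockaddr_in *sin = (struct sockaddr_in *) src_addr;
          port = ntohs (sin->sin_port);	/* Network to host order.  */
          printf ("\n*** From %s:%d: ", inet_ntoa (sin->sin_addr), port);
        }
    
      mod_buf (buf, port);
    
      printf ("returned %zd\n", ret);
      return ret;
    }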

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog          |   11 +++++++++++
 NEWS               |   14 +++++++-------
 resolv/res_query.c |    7 +++++--
 resolv/res_send.c  |    2 +-
 4 files changed, 24 insertions(+), 10 deletions(-)