Created attachment 6493 [details]
Packet capture illustrating duplicate (redundant) DNS requests

I noticed spurious DNS requests on the network, and traced them down to getaddrinfo() calls (e.g. using wget).

This occurs on a Fedora 16 machine with the latest glibc, as well as on a CentOS machine with older versions.

To reproduce, perform the following in 2 consoles:

$ tcpdump -v -n -i eth0 udp port 53 -w getaddr.pcap
$ wget -4 http://www.gmail.com/

Result:

reading from file ./getaddr.pcap, link-type EN10MB (Ethernet)
17:59:38.263471 IP 192.168.1.9.54553 > 8.8.8.8.domain: 32628+ A? www.gmail.com. (31)
17:59:38.313021 IP 8.8.8.8.domain > 192.168.1.9.54553: 32628 4/0/0 CNAME mail.google.com., CNAME googlemail.l.google.com., A 74.125.226.22, A 74.125.226.21 (116)
17:59:38.313311 IP 192.168.1.9.57920 > 8.8.8.8.domain: 44145+ A? www.gmail.com. (31)
17:59:38.367738 IP 8.8.8.8.domain > 192.168.1.9.57920: 44145 4/0/0 CNAME mail.google.com., CNAME googlemail.l.google.com., A 74.125.226.22, A 74.125.226.21 (116)
17:59:38.476300 IP 192.168.1.9.50993 > 8.8.8.8.domain: 25319+ A? mail.google.com. (33)
17:59:38.561056 IP 8.8.8.8.domain > 192.168.1.9.50993: 25319 3/0/0 CNAME googlemail.l.google.com., A 74.125.226.21, A 74.125.226.22 (92)
17:59:38.561306 IP 192.168.1.9.54508 > 8.8.8.8.domain: 35020+ A? mail.google.com. (33)
17:59:38.610485 IP 8.8.8.8.domain > 192.168.1.9.54508: 35020 3/0/0 CNAME googlemail.l.google.com., A 74.125.226.21, A 74.125.226.22 (92)
17:59:38.716233 IP 192.168.1.9.59783 > 8.8.8.8.domain: 48556+ A? accounts.google.com. (37)
17:59:38.765100 IP 8.8.8.8.domain > 192.168.1.9.59783: 48556 2/0/0 CNAME accounts.l.google.com., A 209.85.225.84 (78)
17:59:38.765348 IP 192.168.1.9.41685 > 8.8.8.8.domain: 24558+ A? accounts.google.com. (37)
17:59:38.819577 IP 8.8.8.8.domain > 192.168.1.9.41685: 24558 2/0/0 CNAME accounts.l.google.com., A 209.85.225.84 (78)
Created attachment 6495 [details]
Test program that triggers the bug (when compiled in x86_64 mode)

The attached test program triggers the bug when compiled as a 64-bit executable.

Use: gcc -O -g dnstest.c

When compiled as 32-bit (-m32), it works as expected.

When setting hints.ai_flags = AI_CANONNAME; both an A and an AAAA query are sent, even though hints.ai_family = AF_INET;

It looks like a struct member alignment problem, causing hints to be misinterpreted.
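The attached dnstest.c is not reproduced in this report. As a rough stand-in, a minimal sketch of a lookup with the same hints might look like the following. The helper name resolve_ipv4 is hypothetical; only getaddrinfo, freeaddrinfo and inet_ntop are real library calls.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper, not the attached dnstest.c: resolve NAME with
   ai_family = AF_INET and ai_flags = AI_CANONNAME, and write the first
   IPv4 address in dotted-quad form to OUT.  Returns 0 on success or a
   nonzero EAI_* code.  */
int
resolve_ipv4 (const char *name, char *out, size_t outlen)
{
  struct addrinfo hints;
  struct addrinfo *res = NULL;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_INET;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_CANONNAME;

  int rc = getaddrinfo (name, NULL, &hints, &res);
  if (rc != 0)
    return rc;

  /* With AF_INET hints, every returned address is a sockaddr_in.  */
  const struct sockaddr_in *sin = (const struct sockaddr_in *) res->ai_addr;
  if (inet_ntop (AF_INET, &sin->sin_addr, out, (socklen_t) outlen) == NULL)
    rc = EAI_SYSTEM;

  freeaddrinfo (res);
  return rc;
}
```

Calling resolve_ipv4 with a host name while the tcpdump command from the description is running should show which queries leave the machine. A numeric string such as "127.0.0.1" is parsed locally and sends no query.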
Reproducible on HEAD. The size of the response is what seems to be causing the second query -- 6 or more seems to be the trigger. My initial guess was that the first lookup led to an ERANGE internally, resulting in the second query, but that should have happened for 32-bit as well. So this might well be something else. This has nothing to do with whether there is a CNAME in the response.
(In reply to comment #2)
> Reproducible on HEAD.
>
> The size of the response is what seems to be causing the second query -- 6 or
> more seems to be the trigger. My initial guess was that the first lookup led to
> an ERANGE internally, resulting in the second query, but that should have
> happened for 32-bit as well. So this might well be something else. This has
> nothing to do with whether there is a CNAME in the response.

You are correct that __gethostbyname2_r in sysdeps/posix/getaddrinfo.c returns ERANGE the first time when using a 512-byte buffer, in 64-bit mode. It is then retried with a 1024-byte buffer, and the call succeeds.

However, the same call in 32-bit mode also uses a 512-byte buffer, and does not return ERANGE.
In function getanswer_r (resolv/nss_dns/dns-host.c), the temporary buffer is used for the following struct:

  struct host_data
  {
    char *aliases[MAX_NR_ALIASES];
    unsigned char host_addr[16];	/* IPv4 or IPv6 */
    char *h_addr_ptrs[0];
  } *host_data;

sizeof (struct host_data) == 400 for x86_64, and 208 for x86 32-bit.

One could argue that the code works as designed, but I believe it is desirable for glibc to exhibit the same externally observable behaviour in both the x86 and x86_64 versions. 512 - 400 leaves little room for common DNS query responses like "www.google.com".

One could simply allocate a larger temporary buffer for x86_64 (e.g. 512 * sizeof(void*) / 4), but perhaps a better fix is to not use the temporary buffer for the host_data struct at all (and e.g. use alloca instead, or simply place it on the stack, although the code does explicitly align it in case the buffer is unaligned, which it isn't, by the way).
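The size arithmetic above can be checked with a small stand-alone sketch. It assumes MAX_NR_ALIASES is 48 (the value cited elsewhere in this report) and uses a hypothetical helper, space_after_host_data; it is a copy of the layout for illustration, not the glibc source.

```c
#include <stddef.h>

/* Assumption: 48 is the value of MAX_NR_ALIASES cited in this report
   for resolv/nss_dns/dns-host.c.  */
#define MAX_NR_ALIASES 48

/* Stand-alone copy of the quoted layout.  A C99 flexible array member
   stands in for the GNU zero-length array h_addr_ptrs[0]; on x86 and
   x86_64 this gives the same sizeof.  */
struct host_data
{
  char *aliases[MAX_NR_ALIASES];
  unsigned char host_addr[16];	/* IPv4 or IPv6 */
  char *h_addr_ptrs[];
};

/* Hypothetical helper: bytes left in a buffer of BUFLEN bytes once the
   fixed part of struct host_data has been carved out of it.  */
size_t
space_after_host_data (size_t buflen)
{
  size_t fixed = sizeof (struct host_data);
  return buflen > fixed ? buflen - fixed : 0;
}
```

With 8-byte pointers (x86_64), space_after_host_data (512) evaluates to 112; with 4-byte pointers (-m32) it evaluates to 304.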
Created attachment 6498 [details]
Proposed patch for /sysdeps/posix/getaddrinfo.c

The current code initially allocates a buffer of 512 bytes for the query response, and repeatedly doubles the buffer size until the call succeeds.

Out of these 512 bytes, sizeof (struct host_data) is used for alias pointers. The size of this struct is 208 bytes on x86 and 400 bytes on x86_64, creating a difference in network behaviour between these targets. Furthermore, 512 - 400 leaves only 112 bytes for the answer, which is too small for common DNS lookups such as "www.google.com".

The 512 value was probably based on the maximum response size limit imposed by DNS. This patch adds the size of struct host_data, such that 512 bytes remain for the DNS response on both the x86 and x86_64 platforms.

Note that there should be a cleaner way of linking the size of the struct to the code in resolv/nss_dns/dns-host.c; the constant of 48 for MAX_NR_ALIASES might be changed in the future.

Also, I noticed that there is no limit imposed on the size of the temp buffer. Not sure what the effect would be of receiving a UDP jumbogram over IPv6 in response to a DNS query?
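To make the effect of the doubling loop concrete, here is a simplified model of it. It is not the glibc code: lookups_until_fit is a hypothetical helper, and the amount of variable data the lookup needs (150 bytes in the note below) is an assumed figure chosen only because it falls between the 112 and 304 bytes available on x86_64 and x86.

```c
#include <stddef.h>

/* Simplified model of the retry loop: begin with a buffer of START
   bytes and double it until the fixed OVERHEAD plus NEEDED bytes of
   variable data fit.  Returns the number of lookup attempts, each of
   which corresponds to one __gethostbyname2_r call in the real code.
   Returns 0 for a zero-sized start buffer, which can never grow.  */
unsigned int
lookups_until_fit (size_t start, size_t overhead, size_t needed)
{
  if (start == 0)
    return 0;

  unsigned int lookups = 1;
  size_t buflen = start;

  while (buflen < overhead + needed)
    {
      buflen *= 2;	/* mirrors the 2 * tmpbuflen growth step */
      ++lookups;
    }
  return lookups;
}
```

Under these assumptions, lookups_until_fit (512, 400, 150) yields 2 (the x86_64 case), lookups_until_fit (512, 208, 150) yields 1 (the x86 case), and starting from 512 + 400 bytes, as the patch intends, yields 1 again.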
*** Bug 13904 has been marked as a duplicate of this bug. ***
Here's a review of the patch. Please post the patch with modifications to the libc-alpha mailing list for further review and approval, and use the following wiki link as a guideline for the submission:

http://sourceware.org/glibc/wiki/Contribution%20checklist

-	  size_t tmpbuflen = 512;
+	  // MAX_NR_ALIASES=48 in resolv/nss_dns/dns-host.c
+	  size_t host_data_size = 48 * sizeof(char*) + 16;

Put MAX_NR_ALIASES in one of the headers and use it here.

+	  size_t tmpbuflen = 512;
 	  assert (tmpbuf == NULL);
-	  tmpbuf = alloca_account (tmpbuflen, alloca_used);
+	  tmpbuf = alloca_account (tmpbuflen+host_data_size, alloca_used);

Adjust the tmpbuflen size instead of adding it separately here.

 	  int rc;
 	  struct hostent th;
 	  struct hostent *h;
@@ -579,19 +581,19 @@ gaih_inet (const char *name, const struct gaih_service *service,
 	  while (1)
 	    {
 	      rc = __gethostbyname2_r (name, AF_INET, &th, tmpbuf,
-				       tmpbuflen, &h, &herrno);
+				       tmpbuflen+host_data_size, &h, &herrno);

Likewise.

 	      if (rc != ERANGE || herrno != NETDB_INTERNAL)
 		break;
 	      if (!malloc_tmpbuf
-		  && __libc_use_alloca (alloca_used + 2 * tmpbuflen))
+		  && __libc_use_alloca (alloca_used + 2 * tmpbuflen + host_data_size))

Here too.

 		tmpbuf = extend_alloca_account (tmpbuf, tmpbuflen,
-						2 * tmpbuflen,
+						2 * tmpbuflen+host_data_size,

And here.

 						alloca_used);
 	      else
 		{
 		  char *newp = realloc (malloc_tmpbuf ? tmpbuf : NULL,
-					2 * tmpbuflen);
+					2 * tmpbuflen+host_data_size);

Here as well.

 		  if (newp == NULL)
 		    {
 		      result = -EAI_MEMORY;

> Also, I noticed that there is no limit imposed on the size of the temp buffer.
> Not sure what the effect would be of receiving a UDP jumbogram over IPv6 in
> response to a DNS query?

The current implementation is broken for this case: it will keep growing the buffer exponentially until the entire jumbogram fits, and hope that the underlying memory can support it. This needs to be fixed in the future.
Fixed in master: http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=7b6e99be77c24a79cb07416d81796b45176923c6
*** Bug 260998 has been marked as a duplicate of this bug. ***