The default behavior of getaddrinfo results in a way to hijack failed (NXDOMAIN) domain lookups.

The man page for resolv.conf(5) says:

    domain Local domain name.
        Most queries for names within this domain can use short names relative to the local domain. If no domain entry is present, the domain is determined from the local hostname returned by gethostname(2); the domain part is taken to be everything after the first '.'. Finally, if the hostname does not contain a domain part, the root domain is assumed.

Therein lies the problem. The default case is exploitable. If a server has a domain name "companyname.com", the domain part, "everything after the first '.'", is "com". So a failed lookup of "xyz.com" is retried as "xyz.com.com". The proprietors of "com.com" have chosen to exploit this by using a wildcard DNS A record for "*.com.com", and redirecting the traffic thus captured to (inevitably) an ad-heavy site. Visit "gnu.com.com", for example.

This problem is most visible when the hostname has two components, and the TLD is ".com". Most hosting services use long generated host names, such as "gator123.hostgator.com", and so their default base domain is "hostgator.com". This is less exploitable. There are "net.net" and "org.org" domains, but they are not currently capturing undefined subdomains. There may be other exploits in the country-code TLDs.

I suggest that the default behavior be changed. Consider defaulting "ndots" to 0, or at least don't use the default domain for searches unless it has more than a TLD.

First reported in December 2011 at http://serverfault.com/questions/341383/possible-nxdomain-hijacking by a user who was puzzled that his two seemingly identical test and production servers behaved differently. For me, it's caused a web crawler to misidentify nonexistent domains.
See more details, including a strace log, at http://centos.org/modules/newbb/viewtopic.php?topic_id=36693&forum=59
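To make the failure mode concrete, here is a small Python model of the rule quoted from resolv.conf(5). It is only a sketch for illustration, not glibc's code: the function names are made up, it assumes no "domain" or "search" entry is configured, and it assumes the documented default of ndots:1, where a name with at least that many dots is tried as-is before the implicit domain is appended.

```python
from typing import List


def implicit_domain(hostname: str) -> str:
    """Domain part per resolv.conf(5): everything after the first '.'.

    Returns '' (standing in for the root domain) when there is no dot.
    """
    _, sep, rest = hostname.partition(".")
    return rest if sep else ""


def is_bare_tld(domain: str) -> bool:
    """True when the domain is a single label such as 'com'."""
    return domain != "" and "." not in domain


def lookup_candidates(name: str, hostname: str, ndots: int = 1) -> List[str]:
    """Simplified model of the names that get queried, in order.

    Assumption: only the implicit domain derived from the hostname is
    in play (no 'domain' or 'search' line, no environment override).
    """
    domain = implicit_domain(hostname)
    searched = f"{name}.{domain}" if domain else None
    if name.count(".") >= ndots:
        order = [name, searched]   # as-is first, then with the domain appended
    else:
        order = [searched, name]   # domain appended first, then as-is
    return [candidate for candidate in order if candidate]


if __name__ == "__main__":
    for host in ("companyname.com", "gator123.hostgator.com"):
        dom = implicit_domain(host)
        print(host, "->", dom, "(bare TLD)" if is_bare_tld(dom) else "")
        print("  xyz.com is queried as:", lookup_candidates("xyz.com", host))
```

A check like is_bare_tld() is roughly what the suggestion above amounts to: skip the implicit search when the derived domain is nothing more than a TLD.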
SOME HISTORY

This problem was addressed back in 1993 in RFC 1535, "A Security Problem and Proposed Correction With Widely Deployed DNS Software". Back then, DNS resolution involved chopping off one element at a time of the local host's domain and appending that to the query until a match was found. This allowed attacks on domains with more than two elements. That's where the current restrictions come from.

RFC 1535 says "At a minimum, DNS resolvers must honor the BOUNDARY between local and public administration, by limiting any search lists to locally-administered portions of the Domain Name space." It then goes on: "DNS Name resolver software SHOULD NOT use implicit search lists in attempts to resolve partial names into absolute FQDNs other than the hosts's immediate parent domain. Resolvers which continue to use implicit search lists MUST limit their scope to locally administered sub-domains."

Those two statements are in conflict when a host has a name such as "example.com". The "immediate parent domain" is "com", but it is not a "locally administered subdomain". Since the second sentence is a MUST, while the first sentence is a SHOULD, the second sentence controls. Thus, to conform to RFC 1535, the search path should never default to a TLD. glibc is not in compliance.

That RFC was written in 1993, before domains were ever offered for sale. A world in which most domains were second level was not envisioned at the time. That's probably why the authors of the RFC didn't think of this.

WORKAROUNDS, FAILURE OF

This is a tough problem to work around without changing host names. Adding commands to /etc/resolv.conf tends to cause problems, because, in modern Linux systems, that file is usually generated by system administration software. Editing /etc/resolv.conf is not particularly helpful, anyway. Adding a blank "search" command does not delete the implicit search path.
(That behavior comes from lines 267-268 in res_init.c, which do nothing for a blank "search" line. The code in that area seems to be set up so that there is always an alternate search path of some kind, either from a SEARCH statement in resolv.conf, a DOMAIN statement, an environment variable, or the host name of the host.)

Nor does setting "ndots:0" have that effect. Setting "domain" in /etc/resolv.conf to a value with an invalid TLD might work, but could confuse other parts of the system. "no_tld_query" only applies to inputs with no dots (I think), so that doesn't help, either.

EFFECT

The effect of all this is that you can't trust "getaddrinfo" or "gethostbyname" to do an honest DNS lookup.
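For application code that cannot wait for a resolver change, one thing that may be worth trying is to pass an absolute name, that is, one with a trailing dot. The sketch below assumes the resolver treats such a name as fully qualified and does not append any search domain to it; that is my understanding of the usual stub resolver behavior, not something verified in this report. The helper names are hypothetical.

```python
import socket


def absolute(name):
    """Return name with exactly one trailing dot, marking it as fully qualified."""
    return name.rstrip(".") + "."


def resolve_without_search(name):
    """Return the sorted addresses for name, or [] if the lookup fails.

    Assumption: a trailing dot keeps the resolver from retrying the
    name with the implicit search domain appended. Note that any
    resolver failure is reported as [], not just NXDOMAIN.
    """
    try:
        infos = socket.getaddrinfo(absolute(name), None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    print(resolve_without_search("xyz.com"))
```

This only helps callers that control the name handed to getaddrinfo; it does nothing for programs that pass user input through unchanged.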
Are there any plans to fix this soon?
This bug was filed before the vast expansion of TLDs. There may be new exploits possible once there are hundreds of new TLDs. One implication of all the new TLDs is that single-word domains (especially corporate domains, like WALMART) may have to be resolved on a routine basis. This has been discussed in the browser community, but it has implications here, too. I'm not sure what to do here, but someone needs to be coming up with a standard solution to this.
John, one of the suggestions was to enhance glibc to have the option to avoid appending domains to failed lookups. I'd think a new resolv.conf option rather than using something like an empty "search" specification would be better. If anyone knows if Solaris or other systems have invented such a mechanism, now would be a good time to know so we can try to be configuration file compatible.
As a workaround you can try 'domain .' WorksForMe(tm); but the file does get overwritten by NetworkManager. getdomainname(2) still returns the actual domainname, in my simplistic tests.

I stumbled across this from a slightly different angle:
- the corporate network sets the hostname.
- connecting over a non-corporate network then keeps the machine hostname, but 'search ...' is blank.
- externally the corporate network then ends up giving wildcards when there's a failed lookup.

The end result is the same: typos result in answers rather than NXDOMAIN.

I'm for disabling implicit search lists if 'search ...' is blank. I'll add network-specific search lists as I connect. (I also want the corporate DNS to not do this -- and have queried that internally, but DNS lookups end me on sedoparking eventually)
I'm closing this in favor of bug 25163. The challenge here is that many Kubernetes deployments require the current flavor of search list processing. *** This bug has been marked as a duplicate of bug 25163 ***