Bug 15218 - getaddrinfo uses PTR records for canonname if address family specified
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: network
Version: unspecified
Importance: P2 normal
Target Milestone: 2.19
Assignee: Not yet assigned to anyone
Duplicates: 17215
Reported: 2013-03-01 06:19 UTC by Greg Hudson
Modified: 2014-08-04 08:12 UTC
fweimer: security-


Attachments
Test program demonstrating getaddrinfo issue (436 bytes, text/x-csrc)
2013-03-01 06:19 UTC, Greg Hudson
Candidate fix (213 bytes, patch)
2013-03-02 06:29 UTC, Greg Hudson
Candidate fix 2 (264 bytes, text/plain)
2013-03-02 06:50 UTC, Greg Hudson

Description Greg Hudson 2013-03-01 06:19:38 UTC
Created attachment 6909
Test program demonstrating getaddrinfo issue

With today's master, getaddrinfo with AI_CANONNAME yields the right ai_canonname (the result of CNAME resolution but not PTR lookup) if no other hint fields are given.  However, if hint.ai_family is set to AF_INET, it appears to do a PTR lookup.  The attached test program demonstrates the problem (compare the first and third output lines in particular):

$ ./a.out ptr-mismatch.kerberos.org
AI_CANONNAME alone: www.kerberos.org
AI_ADDRCONFIG also: www.kerberos.org
ai_family AF_INET : KERBEROS-ORG.MIT.EDU
ai_family AF_INET6: Name or service not known
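
For readers without access to the attachment, a reproduction along these lines can be sketched as follows. This is not the attached program: the helper names lookup_canonname and report are made up for illustration, and the output format only mimics the lines shown above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve `host` with AI_CANONNAME plus `extra_flags`,
   restricted to `family` (AF_UNSPEC means no restriction).  On success the
   canonical name is copied into `out` and 0 is returned.  On failure the
   getaddrinfo error code is returned and `out` holds the gai_strerror text. */
int lookup_canonname(const char *host, int family, int extra_flags,
                     char *out, size_t outlen)
{
    struct addrinfo hints, *res = NULL;
    int rc;

    assert(out != NULL && outlen > 0);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;
    hints.ai_flags = AI_CANONNAME | extra_flags;

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0) {
        snprintf(out, outlen, "%s", gai_strerror(rc));
        return rc;
    }
    /* The canonical name is reported on the first entry of the list. */
    snprintf(out, outlen, "%s",
             res->ai_canonname != NULL ? res->ai_canonname : "(null)");
    freeaddrinfo(res);
    return 0;
}

/* Hypothetical driver: one line per hint combination. */
void report(const char *host)
{
    char buf[256];

    lookup_canonname(host, AF_UNSPEC, 0, buf, sizeof buf);
    printf("AI_CANONNAME alone: %s\n", buf);
    lookup_canonname(host, AF_UNSPEC, AI_ADDRCONFIG, buf, sizeof buf);
    printf("AI_ADDRCONFIG also: %s\n", buf);
    lookup_canonname(host, AF_INET, 0, buf, sizeof buf);
    printf("ai_family AF_INET : %s\n", buf);
    lookup_canonname(host, AF_INET6, 0, buf, sizeof buf);
    printf("ai_family AF_INET6: %s\n", buf);
}
```

Calling report(argv[1]) from main with a host name whose CNAME target and PTR record disagree should show whether the AF_INET line differs from the first line on a given glibc build.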
Comment 1 Rich Felker 2013-03-01 19:55:54 UTC
To clarify what's wrong: it was a common historic misunderstanding that "canonical" name meant reverse DNS lookups. This can be a cause of bad lookup performance in applications that are using AI_CANONNAME correctly and not expecting it to perform PTR lookups. For a reference on why the PTR lookup is incorrect, see the following paragraphs in POSIX:

From DESCRIPTION of getaddrinfo:

"If the AI_CANONNAME flag is specified and the nodename argument is not null, the function shall attempt to determine the canonical name corresponding to nodename (for example, if nodename is an alias or shorthand notation for a complete name).

Note:
Since different implementations use different conceptual models, the terms ``canonical name'' and ``alias'' cannot be precisely defined for the general case. However, Domain Name System implementations are expected to interpret them as they are used in RFC 1034.
A numeric host address string is not a ``name'', and thus does not have a ``canonical name'' form; no address to host name translation is performed. See below for handling of the case where a canonical name cannot be obtained."

And from APPLICATION USAGE:

"The term ``canonical name'' is misleading; it is taken from the Domain Name System (RFC 2181). It should be noted that the canonical name is a result of alias processing, and not necessarily a unique attribute of a host, address, or set of addresses. See RFC 2181 for more discussion of this in the Domain Name System context."

Source: http://pubs.opengroup.org/onlinepubs/9699919799/functions/getaddrinfo.html
Comment 2 Greg Hudson 2013-03-02 06:29:42 UTC
Created attachment 6912
Candidate fix

I stepped through the code and found that:

* In the good case (hint.ai_family == 0), line 569 of gaih_inet does not trigger and we continue on to the loop at line 832, using gethostbyname4_r functions.  When the DNS function succeeds, we set canon from the result at line 892.  This value of canon is later used for ai_canonname.

* In the bad case (hint.ai_family == AF_INET), line 569 of gaih_inet triggers and we use __gethostbyname2_r for the lookup.  This branch of the code does not set canon, so later on at line 1119, canon is still NULL.  The conditional there kicks in and sets canon using __gethostbyaddr_r on the first address.

I think the code which uses __gethostbyname2_r ought to be able to set canon using th.h_name.  If I use the attached patch, my test program gives the correct answer with hint.ai_family == AF_INET.
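
The control flow described above can be modelled in a few lines. This is a toy model only, not glibc code: struct lookup_model and pick_canon are hypothetical names, and the real gaih_inet logic is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for what is known after the forward lookup. */
struct lookup_model {
    const char *h_name;    /* name after CNAME processing (th.h_name) */
    const char *ptr_name;  /* what a PTR lookup on the first address returns */
};

/* Toy model of how canon ends up being chosen, per the description above.
   family_given: nonzero when hint.ai_family is AF_INET or AF_INET6.
   with_fix:     nonzero to model the candidate fix. */
const char *pick_canon(const struct lookup_model *m, int family_given,
                       int with_fix)
{
    const char *canon = NULL;

    assert(m != NULL);
    if (!family_given) {
        /* gethostbyname4_r path: canon is set from the lookup result. */
        canon = m->h_name;
    } else if (with_fix) {
        /* Candidate fix: the gethostbyname2_r path also sets canon. */
        canon = m->h_name;
    }
    /* Fallback: canon is still NULL, so the first address is
       reverse-resolved and the PTR name is used instead. */
    if (canon == NULL)
        canon = m->ptr_name;
    return canon;
}
```

In this model the PTR name only leaks out when a family is given and the fix is absent, which matches the AF_INET line of the test output.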
Comment 3 Greg Hudson 2013-03-02 06:50:40 UTC
Created attachment 6913
Candidate fix 2

This updated patch is more consistent with how other branches of the function set canon.

I believe h->h_name should still be valid by the time canon is used at the end of the function, because it lives in tmpbuf just like it does in the gethostbyname4_r case.
Comment 4 Greg Hudson 2013-03-02 07:11:43 UTC
Another approach can be found at:

http://pkgs.fedoraproject.org/cgit/glibc.git/plain/glibc-fedora-gai-canonical.patch

which completely avoids the gethostbyname2_r path if AI_CANONNAME is requested, and also rips out the code to use gethostbyaddr_r for canonname.  Although that change is much more invasive than my candidate fix, it has received more testing.
Comment 5 Andreas Schwab 2013-10-17 14:36:39 UTC
Fixed by b957ced.
Comment 7 Andreas Schwab 2014-08-04 08:12:48 UTC
*** Bug 17215 has been marked as a duplicate of this bug. ***