Introduction

The purpose of this page is to coordinate an effort to decide on the correct behavior of getaddrinfo() in certain corner cases.

The behavior of getaddrinfo() is governed by POSIX but unfortunately the spec is not entirely clear.

For background see the discussion at bug #15726.

getaddrinfo()

#include <sys/socket.h>
#include <netdb.h>

int getaddrinfo(
    const char *restrict nodename,
    const char *restrict service,
    const struct addrinfo *restrict hints,
    struct addrinfo **restrict result
);

The nodename argument can be a hostname, a numeric address string (dotted-decimal for IPv4 or colon-separated hexadecimal for IPv6), or NULL, indicating the local machine. The service argument can be a service name, a port number in string form, or NULL, indicating that network-level addresses should be returned. At least one of nodename and service must be non-NULL.

If not NULL, hints causes the returned information to be filtered by family, flags, protocol and/or socket type.

In case of success the function returns a nonempty linked list of addrinfo structures pointed to by result.
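As an illustration of the call described above, here is a minimal sketch that resolves a name, walks the returned list and frees it. The function name print_addresses and the choice of hints (AF_UNSPEC, SOCK_STREAM) are assumptions made for the example, not something this page prescribes.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: resolve nodename/service and print every address.
 * Returns 0 on success or the EAI_* code reported by getaddrinfo(). */
int print_addresses(const char *nodename, const char *service)
{
    struct addrinfo hints;
    struct addrinfo *result, *ai;
    char host[128];
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* filter: stream sockets only */

    rc = getaddrinfo(nodename, service, &hints, &result);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return rc;
    }

    /* On success the list is non-empty; follow ai_next to the end. */
    for (ai = result; ai != NULL; ai = ai->ai_next) {
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, host, sizeof host,
                        NULL, 0, NI_NUMERICHOST) == 0)
            printf("family=%d address=%s\n", ai->ai_family, host);
    }

    freeaddrinfo(result);
    return 0;
}
```

A call such as print_addresses("127.0.0.1", "80") needs no name service at all, because both arguments are already numeric.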

Resolving

Most often getaddrinfo() is used to resolve a hostname to an address.

Qualified variants of the given hostname should also be sought.

All available address types (IPv4 and IPv6) should be returned.

The correct sequence of lookups is: in the first information source look up all the different hostname variants and for each variant look up the different address types. For example, if a "hosts" file is the first source then the resolver should first look for all hostname variants in the hosts file before trying DNS.

If a hostname is found in a source (a positive answer) then the resolver should look up all the address types that were requested and then not search any further.

Some sources might also contain the information that the hostname does not exist (negative answer). In case of a negative answer the resolver should also stop searching.

If there is neither a positive nor a negative answer then the resolver should continue searching until all sources have been searched.

If all sources have been exhausted and no positive answer was obtained then that should be considered a negative answer.
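The search order described in this section can be sketched as a pair of nested loops. Everything named here (enum answer, source_lookup_fn, search_sources and the two toy sources) is hypothetical and exists only for illustration. The sketch also assumes that a negative answer for one hostname variant does not stop the remaining variants from being tried in the same source; the text above does not settle that point.

```c
#include <stddef.h>
#include <string.h>

enum answer { ANSWER_NONE, ANSWER_POSITIVE, ANSWER_NEGATIVE };

/* Hypothetical source interface: the answer for one hostname variant. */
typedef enum answer (*source_lookup_fn)(const char *variant);

/* Walk the sources in order; inside each source try every variant. */
enum answer search_sources(const source_lookup_fn *sources, size_t nsources,
                           const char *const *variants, size_t nvariants)
{
    for (size_t s = 0; s < nsources; s++) {
        int saw_negative = 0;

        for (size_t v = 0; v < nvariants; v++) {
            enum answer a = sources[s](variants[v]);

            if (a == ANSWER_POSITIVE)
                /* A real resolver would now fetch every requested
                 * address type from this source, then stop. */
                return ANSWER_POSITIVE;
            if (a == ANSWER_NEGATIVE)
                saw_negative = 1;
        }
        if (saw_negative)
            return ANSWER_NEGATIVE;   /* do not consult later sources */
    }
    /* All sources exhausted without a positive answer. */
    return ANSWER_NEGATIVE;
}

/* Toy "hosts file": knows one name and has no opinion on others. */
enum answer hosts_source(const char *variant)
{
    return strcmp(variant, "printer.example.com") == 0
               ? ANSWER_POSITIVE : ANSWER_NONE;
}

/* Toy "DNS": knows one name and denies every other name. */
enum answer dns_source(const char *variant)
{
    return strcmp(variant, "www.example.com") == 0
               ? ANSWER_POSITIVE : ANSWER_NEGATIVE;
}
```

With the sources ordered hosts first and DNS second, "printer" is answered from the toy hosts source and the toy DNS source is never consulted.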

Outcomes

The following outcomes are possible.

The difference between a negative answer and an error is important because in the case of a negative answer there is no point in retrying, whereas in the case of an error there may be some point in retrying (later).

Return value

The spec allows the following return values (http://pubs.opengroup.org/onlinepubs/9699919799/functions/gai_strerror.html):

0
[EAI_AGAIN]
[EAI_BADFLAGS]
[EAI_FAIL]
[EAI_FAMILY]
[EAI_MEMORY]
[EAI_NONAME]
[EAI_OVERFLOW]
[EAI_SERVICE]
[EAI_SOCKTYPE]
[EAI_SYSTEM]

The getaddrinfo() spec explains them this way:

[EAI_AGAIN]
    The name could not be resolved at this time. Future attempts may succeed.
[EAI_BADFLAGS]
    The flags parameter had an invalid value.
[EAI_FAIL]
    A non-recoverable error occurred when attempting to resolve the name.
[EAI_FAMILY]
    The address family was not recognized.
[EAI_MEMORY]
    There was a memory allocation failure when trying to allocate storage for the return value.
[EAI_NONAME]
    The name does not resolve for the supplied parameters.

    Neither nodename nor servname were supplied. At least one of these shall be supplied.
[EAI_SERVICE]
    The service passed was not recognized for the specified socket type.
[EAI_SOCKTYPE]
    The intended socket type was not recognized.
[EAI_SYSTEM]
    A system error occurred; the error code can be found in errno. 

Glibc currently (March 2014) defines the following values (in glibc/sysdeps/posix/gai_strerror-strs.h) in addition to 0.

_S(EAI_ADDRFAMILY, N_("Address family for hostname not supported"))
_S(EAI_AGAIN, N_("Temporary failure in name resolution"))
_S(EAI_BADFLAGS, N_("Bad value for ai_flags"))
_S(EAI_FAIL, N_("Non-recoverable failure in name resolution"))
_S(EAI_FAMILY, N_("ai_family not supported"))
_S(EAI_MEMORY, N_("Memory allocation failure"))
_S(EAI_NODATA, N_("No address associated with hostname"))
_S(EAI_NONAME, N_("Name or service not known"))
_S(EAI_SERVICE, N_("Servname not supported for ai_socktype"))
_S(EAI_SOCKTYPE, N_("ai_socktype not supported"))
_S(EAI_SYSTEM, N_("System error"))
_S(EAI_INPROGRESS, N_("Processing request in progress"))
_S(EAI_CANCELED, N_("Request canceled"))
_S(EAI_NOTCANCELED, N_("Request not canceled"))
_S(EAI_ALLDONE, N_("All requests done"))
_S(EAI_INTR, N_("Interrupted by a signal"))
_S(EAI_IDN_ENCODE, N_("Parameter string not correctly encoded"))

Note the extra values EAI_ADDRFAMILY and EAI_NODATA.

It is clear, at least, that getaddrinfo() should:

It is reasonably clear that the following error codes should be returned under the following circumstances.

Some reasons why the system might return EAI_SYSTEM and set errno:

But...

From the perspective of the application that calls getaddrinfo() it perhaps doesn't matter that much since EAI_FAIL, EAI_NONAME and EAI_NODATA are all permanent failure codes and the causes are all permanent failures in the sense that there is no point in retrying later.

Currently (March 2014) Ubuntu and FreeBSD return EAI_NONAME in case of permanent failure.
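From the caller's side, the distinction between a permanent failure and a retryable error might be expressed as below. This is only a sketch: the names gai_outcome and classify_gai_result are hypothetical, and treating EAI_SYSTEM and EAI_MEMORY as retryable is an assumption of the example rather than something the spec states. EAI_NODATA is guarded because not every system defines it.

```c
#include <netdb.h>

enum gai_outcome {
    GAI_OUTCOME_SUCCESS,
    GAI_OUTCOME_PERMANENT,  /* no point in retrying */
    GAI_OUTCOME_RETRY       /* retrying later may help */
};

/* Hypothetical helper: map a getaddrinfo() return value to an outcome. */
enum gai_outcome classify_gai_result(int rc)
{
    switch (rc) {
    case 0:
        return GAI_OUTCOME_SUCCESS;
    case EAI_AGAIN:
        return GAI_OUTCOME_RETRY;
    case EAI_SYSTEM:   /* assumption: a real caller would inspect errno */
    case EAI_MEMORY:   /* assumption: memory pressure may be transient */
        return GAI_OUTCOME_RETRY;
#if defined(EAI_NODATA) && (EAI_NODATA != EAI_NONAME)
    case EAI_NODATA:
#endif
    case EAI_NONAME:
    case EAI_FAIL:
    default:
        /* Bad flags, unknown family and similar: retrying the
         * identical call will not change the result. */
        return GAI_OUTCOME_PERMANENT;
    }
}
```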

Files specifics

A hostname might be assigned multiple addresses in the hosts file so the whole file has to be checked.

If the hostname exists but has no address of the requested type, should the resolver return a positive answer with no addresses, or something else?

Both FQDNs and non-FQDNs can be looked up in this source.

DNS specifics

When looking things up in DNS, there may be more than one server that can be asked for the address of the hostname. If a server does not answer, or a communication error occurs, the resolver should move on to the next server. If no server answers, the resolver should retry, since queries go over UDP and a packet could have been lost. There should be a timeout after which the resolver gives up trying to look up the host. The resolver should probably also increase the interval between packets sent to the same server when there is no answer. Only after the timeout has been reached should the resolver abort with an error.
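The retry policy outlined above could look roughly like this. The transport is abstracted behind a hypothetical callback (query_fn) so that the loop can be shown without real sockets. Doubling the per-attempt timeout after each full pass over the servers is an assumption of the sketch, as is the fake_net stand-in used to exercise it.

```c
#include <stddef.h>

enum query_status { QUERY_ANSWERED, QUERY_NO_REPLY, QUERY_COMM_ERROR };

/* Hypothetical transport hook: send one query to server `idx` and
 * wait at most `timeout_ms` for a reply. */
typedef enum query_status (*query_fn)(size_t idx, int timeout_ms, void *ctx);

/* Returns the index of the server that answered, or -1 once the
 * overall time budget is used up. */
int query_with_retry(query_fn send_query, void *ctx, size_t nservers,
                     int first_timeout_ms, int total_budget_ms)
{
    int timeout_ms = first_timeout_ms;
    int spent_ms = 0;

    if (nservers == 0 || first_timeout_ms <= 0)
        return -1;

    while (spent_ms < total_budget_ms) {
        for (size_t i = 0; i < nservers && spent_ms < total_budget_ms; i++) {
            int wait_ms = timeout_ms;

            if (wait_ms > total_budget_ms - spent_ms)
                wait_ms = total_budget_ms - spent_ms;
            if (send_query(i, wait_ms, ctx) == QUERY_ANSWERED)
                return (int)i;
            /* No reply or communication error: charge the wait
             * (assumed fully used) and move to the next server. */
            spent_ms += wait_ms;
        }
        if (timeout_ms <= total_budget_ms / 2)
            timeout_ms *= 2;    /* back off before the next pass */
    }
    return -1;                  /* overall timeout: report an error */
}

/* Stand-in transport for experiments: one server starts answering
 * once a given number of attempts have been made in total. */
struct fake_net { size_t answering_server; int answer_on_attempt; int attempts; };

enum query_status fake_send(size_t idx, int timeout_ms, void *ctx)
{
    struct fake_net *net = ctx;

    (void)timeout_ms;
    net->attempts++;
    if (idx == net->answering_server && net->attempts >= net->answer_on_attempt)
        return QUERY_ANSWERED;
    return QUERY_NO_REPLY;
}
```

The sketch does not distinguish an invalid packet from a lost one, which is one of the open questions in this section.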

If an invalid packet is received, should the resolver retry until the timeout, or report an error?

If the DNSSEC verification fails for whatever reason, should this be treated as a fatal error?

mDNS specifics

mDNS can only do lookups in the .local domain, so the hostname should always be an FQDN.

mDNS queries should be retried until there is an answer, a fatal error occurs, or the timeout is reached (which is also an error). The interval between packets should probably increase.

Since mDNS is based on DNS, it can also produce the "no address" case, in which the name exists but has no address of the requested type.

There is a problem with the .local domain since both DNS and mDNS claim to be authoritative over it, and DNS will always return a negative answer for it if a root nameserver can be reached. Changing the order of the sources doesn't solve this. Therefore, if mDNS returns no answer for a host in the .local domain, it should prevent the next source from being tried, and DNS should come after mDNS.
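The rule proposed above reduces to a small predicate. The function names is_local_name and may_fall_through_after_mdns are hypothetical, and accepting a trailing root dot and ignoring case are assumptions of this sketch.

```c
#include <string.h>
#include <strings.h>

/* Returns 1 if host is "local" or ends in ".local" (an optional
 * trailing dot is ignored), 0 otherwise. */
int is_local_name(const char *host)
{
    static const char suffix[] = "local";
    size_t slen = sizeof suffix - 1;
    size_t len = strlen(host);

    if (len > 0 && host[len - 1] == '.')
        len--;                          /* ignore the root dot */
    if (len < slen)
        return 0;
    if (strncasecmp(host + len - slen, suffix, slen) != 0)
        return 0;
    /* The match must be a whole label, not the tail of a longer one. */
    return len == slen || host[len - slen - 1] == '.';
}

/* Returns 1 if the source after mDNS (for example DNS) may still be
 * consulted, 0 if the search has to stop at mDNS. */
int may_fall_through_after_mdns(const char *host, int mdns_found)
{
    if (mdns_found)
        return 0;                       /* positive answer: stop */
    return !is_local_name(host);        /* .local stays with mDNS */
}
```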

Configuration changes

There are various reasons why the configuration files can change while a process is running. getaddrinfo() should re-read those files if they have changed.
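One common way to notice such a change is to compare stat() results between calls. The sketch below assumes that approach; the struct file_stamp and the function config_changed are hypothetical and do not describe how glibc does or should implement this.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* What was seen the last time the file was examined. */
struct file_stamp {
    int    valid;
    time_t mtime;
    off_t  size;
    ino_t  inode;
};

/* Returns 1 if the file differs from the stamp (the stamp is then
 * updated), 0 if it looks unchanged, -1 if it cannot be examined. */
int config_changed(const char *path, struct file_stamp *stamp)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        stamp->valid = 0;       /* forces a re-read once it reappears */
        return -1;
    }
    if (stamp->valid &&
        stamp->mtime == st.st_mtime &&
        stamp->size == st.st_size &&
        stamp->inode == st.st_ino)
        return 0;

    stamp->valid = 1;
    stamp->mtime = st.st_mtime;
    stamp->size = st.st_size;
    stamp->inode = st.st_ino;
    return 1;
}
```

Size and inode are compared in addition to the modification time because st_mtime has one-second resolution and because editors often replace a file rather than rewrite it in place.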

Relevant Standards

RFC3493 "Basic Socket Interface Extensions for IPv6" http://tools.ietf.org/html/rfc3493.html (obsoletes RFC2553, which obsoleted RFC2133)

RFC6724 "Default Address Selection for Internet Protocol version 6 (IPv6)" https://tools.ietf.org/html/rfc6724 (obsoletes RFC3484)

POSIX Issue 7 "freeaddrinfo, getaddrinfo - get address information" http://pubs.opengroup.org/onlinepubs/9699919799/functions/freeaddrinfo.html

Fedora

Raw API designs

Resolver libraries

None: NameResolver (last edited 2014-06-19 03:02:39 by CarlosODonell)