Differences between revisions 23 and 24
Revision 23 as of 2013-10-20 14:45:26
Size: 8621
Editor: ThomasHood
Comment:
Revision 24 as of 2013-10-20 15:01:36
Size: 8770
Editor: ThomasHood
Comment:

Introduction

On this page we are trying to describe what the correct behavior should be for a resolver function, i.e., for a function translating a hostname to an address. What is the correct order of looking things up? Which status codes should be returned and when?

The functions implementing this in the GNU C library are getaddrinfo() and gethostbyname(). However, gethostbyname() is deprecated in favor of getaddrinfo(), so the latter is the focus of attention on this page.

The behavior of getaddrinfo() is governed by POSIX but unfortunately the relevant RFCs are not very clear. See the discussion at bug #15726.

Behavior

There are several sources of information that can be used to translate a hostname to an address. These include "hosts" files, NIS, DNS and mDNS. If the needed information is not found in one source then it can and should be sought in the other sources.

The resolver also needs to look up qualified variants of the given hostname.

It might also have to look up different address types: IPv4 versus IPv6.

In short, the resolver needs to look up multiple qualified hostnames in multiple sources, looking for different address types.

Results

The following final results are possible.

  • We got a positive answer that the hostname exists. The hostname could have zero, one or more addresses assigned to it.

  • We got a negative answer that the hostname does not exist.

  • We got no answer after searching all sources.

  • Some error occurred preventing us from completing the search. The error can be either local — reported perhaps by a local subsystem such as the memory allocator — or remote, reported by a remote service such as a nameserver.

Sequence of lookups

The correct sequence is: in the first information source, look up all the different hostnames, and for each hostname look up the different address types; then move on to the next source. This gives three nested loops: over sources, over names and over address types.

For example, if a "hosts" file is the first source then the resolver should first look for all hostnames in the hosts file before trying DNS.

If a hostname is found in a source (positive answer) then the resolver should look up all the address types that were requested and then not search any further.

Some sources might also contain the information that the hostname does not exist (negative answer). In that case the resolver should also stop searching.

If there is neither a positive nor a negative answer then the resolver should continue searching until all sources have been searched.
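
The lookup order described above can be sketched in C as follows. This is only an illustration under stated assumptions, not how glibc implements it: the enum answer type, the source_fn callback signature, resolve_sketch() and the two demo sources are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical outcome of asking one source about one name. */
enum answer { ANSWER_NONE, ANSWER_POSITIVE, ANSWER_NEGATIVE, ANSWER_ERROR };

/* Hypothetical source callback: asks one source about `name` for one
   address family and adds the number of addresses found to *naddrs. */
typedef enum answer (*source_fn)(const char *name, int family, int *naddrs);

/* Three nested loops: over sources, over names, over address types. */
enum answer resolve_sketch(const source_fn *sources, size_t nsources,
                           const char *const *names, size_t nnames,
                           const int *families, size_t nfamilies,
                           int *naddrs)
{
    assert(naddrs != NULL);
    *naddrs = 0;
    for (size_t s = 0; s < nsources; s++) {
        for (size_t n = 0; n < nnames; n++) {
            enum answer result = ANSWER_NONE;
            for (size_t f = 0; f < nfamilies; f++) {
                enum answer a = sources[s](names[n], families[f], naddrs);
                if (a == ANSWER_ERROR)
                    return ANSWER_ERROR;      /* search cannot be completed */
                if (a == ANSWER_NEGATIVE) {
                    result = ANSWER_NEGATIVE; /* name does not exist */
                    break;
                }
                if (a == ANSWER_POSITIVE)
                    result = ANSWER_POSITIVE; /* keep asking other families */
            }
            if (result != ANSWER_NONE)
                return result;                /* positive or negative: stop */
        }
    }
    return ANSWER_NONE;                       /* all sources searched */
}

/* Demo source standing in for a hosts file: knows "printer" (IPv4 only). */
enum answer demo_hosts(const char *name, int family, int *naddrs)
{
    if (strcmp(name, "printer") != 0)
        return ANSWER_NONE;
    if (family == AF_INET)
        (*naddrs)++;
    return ANSWER_POSITIVE;
}

/* Demo source standing in for DNS: denies the existence of every name. */
enum answer demo_dns(const char *name, int family, int *naddrs)
{
    (void)name; (void)family; (void)naddrs;
    return ANSWER_NEGATIVE;
}
```

With the sources ordered as { demo_hosts, demo_dns }, looking up "printer" stops at the first source with a positive answer and one address, while an unknown name falls through to the second source and ends with a negative answer.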

Return value

It is clear at least that the resolver should:

  • return 0 if a positive answer was received and at least one address was obtained;

  • return an error code if an error was encountered.

But

  • For the case where there was a positive answer with no addresses there used to be an EAI_NODATA return value specified, but this value is not mentioned in later versions of the standard. The sensible thing would then seem to be to return 0 and an empty list of addresses, but the standard stipulates that when 0 is returned there must be at least one address returned. Either EAI_FAIL or EAI_NONAME could be returned, but either one of these conflates the no-address case with other very different cases. So it seems justified to continue using a distinct EAI_NODATA. (There was an interesting discussion about this on the bind-users mailing list: https://lists.isc.org/pipermail/bind-users/2011-April/083701.html)

  • In case there was a negative answer it's unclear what should be returned. Some implementations return EAI_FAIL, others EAI_NONAME. But, again, either one of these conflates the negative-answer case with other very different cases. Of the two, EAI_NONAME is less bad.

The following error codes should be returned in the following circumstances.

  • There was no answer: EAI_AGAIN

  • There was neither a positive nor a negative acknowledgment of the existence of the service name: EAI_AGAIN
  • AI_NUMERICHOST was used and the hostname wasn't a valid numeric string representation of the address: EAI_NONAME
  • AI_NUMERICSERV was used and the service name wasn't a valid number in string form: EAI_NONAME
  • Both nodename and servname are NULL: EAI_NONAME
  • Unknown bits were set in ai_flags: EAI_BADFLAGS
  • Unknown or unsupported family was used: EAI_FAMILY
  • Unknown or unsupported socket type: EAI_SOCKTYPE
  • There was a failure to allocate memory: EAI_MEMORY
  • There was an answer that indicates that the service name doesn't exist for the given socket type: EAI_SERVICE
  • The source information doesn't make sense, for example a file failed to parse or DNS returned an invalid packet: EAI_FAIL
  • Some system error occurred and errno is set: EAI_SYSTEM.
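
From the caller's side, the codes listed above might be handled along the following lines. This is a minimal sketch: the enum gai_class grouping and the helper names classify_gai() and numeric_lookup() are assumptions made for this example, not part of any standard. Only getaddrinfo(), freeaddrinfo(), AI_NUMERICHOST and the EAI_* constants are real. EAI_NODATA is tested behind #ifdef because not every system header defines it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical caller-side grouping of getaddrinfo() return values. */
enum gai_class {
    GAI_OK,          /* at least one address was returned */
    GAI_TRY_AGAIN,   /* no answer: retrying later may help */
    GAI_NOT_FOUND,   /* no such name, or name without addresses */
    GAI_BAD_REQUEST, /* the arguments themselves were unacceptable */
    GAI_SEE_ERRNO,   /* system error: inspect errno */
    GAI_FAILED       /* EAI_FAIL, EAI_MEMORY or anything else */
};

enum gai_class classify_gai(int rc)
{
    if (rc == 0)
        return GAI_OK;
    if (rc == EAI_AGAIN)
        return GAI_TRY_AGAIN;
    if (rc == EAI_NONAME)
        return GAI_NOT_FOUND;
#ifdef EAI_NODATA
    if (rc == EAI_NODATA)
        return GAI_NOT_FOUND;
#endif
    if (rc == EAI_BADFLAGS || rc == EAI_FAMILY ||
        rc == EAI_SOCKTYPE || rc == EAI_SERVICE)
        return GAI_BAD_REQUEST;
    if (rc == EAI_SYSTEM)
        return GAI_SEE_ERRNO;
    return GAI_FAILED;
}

/* Resolve `host` with AI_NUMERICHOST, so no lookup source is consulted.
   Returns the raw getaddrinfo() result code. */
int numeric_lookup(const char *host)
{
    struct addrinfo hints, *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;

    int rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    return rc;
}
```

Calling numeric_lookup("127.0.0.1") should succeed, whereas a string that is not a numeric address exercises the AI_NUMERICHOST rule from the list and yields EAI_NONAME.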

Some reasons why the system might return EAI_SYSTEM and set errno:

  • We tried to open a file and got an error like EACCES, EMFILE, ... It should probably return EAI_SYSTEM in that case.
  • We tried to create a socket and got an error like EACCES, EINVAL, ... It should probably only return that error in case all the different sockets it tried to create failed.
  • We tried to do network communication (send, sendto, recv, recvfrom) and got an error. Such errors should probably all be treated as non-permanent and might then result in EAI_AGAIN.
  • EINTR should always be handled by getaddrinfo() itself. It should not return EAI_SYSTEM in this case.
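
Handling EINTR internally usually means restarting the interrupted call instead of reporting a failure. A minimal sketch of that idiom for read() follows; the helper name read_retry_eintr() is hypothetical, and a real resolver would apply the same pattern to its other blocking calls.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Like read(), but restarts the call when a signal interrupts it, so the
   caller never sees EINTR. Other errors are returned unchanged. */
ssize_t read_retry_eintr(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}
```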

Files specifics

The hosts file source can only result in a positive answer or no answer. Absence of a hostname from the hosts file should not be taken to imply that the hostname does not exist.

A hostname might be assigned multiple addresses in the hosts file so the whole file has to be checked. In case the hostname exists but doesn't have the address type that is asked for, the resolver should return the positive answer with no addresses.

Both FQDNs and non-FQDNs can be looked up in this source.
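
The hosts-file rules above can be sketched as follows. The example scans hosts-format text held in memory rather than an actual file, and hosts_lookup() is a hypothetical helper written for this page, not a glibc function. It assumes lines shorter than 256 bytes and ignores anything beyond that.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>
#include <sys/socket.h>

/* Scans the whole hosts-format text. Returns 1 (positive answer) if `name`
   appears on any line and 0 (no answer) otherwise; never a negative answer.
   *naddrs receives the number of matching entries of address family
   `family`, which may be 0 even when the name exists. */
int hosts_lookup(const char *text, const char *name, int family, int *naddrs)
{
    int found = 0;

    *naddrs = 0;
    while (*text != '\0') {
        char line[256];
        size_t len = strcspn(text, "\n");
        size_t n = len < sizeof line - 1 ? len : sizeof line - 1;

        memcpy(line, text, n);
        line[n] = '\0';
        text += len;
        if (*text == '\n')
            text++;

        line[strcspn(line, "#")] = '\0';     /* strip trailing comment */

        char *save = NULL;
        char *addr = strtok_r(line, " \t\r", &save);
        if (addr == NULL)
            continue;                        /* blank or comment-only line */

        for (char *tok = strtok_r(NULL, " \t\r", &save); tok != NULL;
             tok = strtok_r(NULL, " \t\r", &save)) {
            if (strcasecmp(tok, name) != 0)
                continue;
            found = 1;                       /* the name exists */
            unsigned char buf[sizeof(struct in6_addr)];
            if (inet_pton(family, addr, buf) == 1)
                (*naddrs)++;                 /* address of requested type */
            break;
        }
    }
    return found;
}
```

A name that is listed only with an IPv4 address gives a return value of 1 with *naddrs == 0 when AF_INET6 is requested, which is the "positive answer with no addresses" case.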

DNS specifics

When looking things up in DNS, there may be more than one server that can be asked for the address of the hostname. If a server does not answer, or a communication error occurs, the resolver should move on to the next server; this should not be treated as a negative answer. If no server answers, the resolver should retry, since queries go over UDP and a packet could have been lost. There should be a timeout after which the resolver gives up trying to look up the host, and it should probably also increase the interval between packets sent to the same server when there is no answer. Only after the timeout has been reached is the result "no answer".
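
One way to picture this retry policy is the sketch below. It is an assumption-laden illustration: the query_fn callback, query_with_backoff() and the doubling of the per-try timeout are choices made for this example (the text above only asks for an increasing interval and an eventual give-up point), and the demo callback merely simulates servers.

```c
#include <stddef.h>

/* Hypothetical result of one query to one server. */
enum reply { REPLY_NONE, REPLY_POSITIVE, REPLY_NEGATIVE };

/* Hypothetical callback: sends one query to server number `server` and
   waits up to `timeout_ms` for a reply. Lost packets and communication
   errors are both reported as REPLY_NONE. */
typedef enum reply (*query_fn)(size_t server, int timeout_ms);

/* Tries every server in turn; if none answers, doubles the per-try timeout
   and starts over, giving up once the timeout would exceed max_timeout_ms. */
enum reply query_with_backoff(query_fn query, size_t nservers,
                              int first_timeout_ms, int max_timeout_ms)
{
    for (int t = first_timeout_ms; t > 0 && t <= max_timeout_ms; ) {
        for (size_t s = 0; s < nservers; s++) {
            enum reply r = query(s, t);
            if (r != REPLY_NONE)
                return r;          /* positive or negative answer */
            /* no answer from this server: move on to the next one */
        }
        if (t > max_timeout_ms / 2)
            break;                 /* doubling would exceed the limit */
        t *= 2;
    }
    return REPLY_NONE;             /* only now is the result "no answer" */
}

/* Demo: server 0 never answers; server 1 answers once the caller is
   willing to wait at least 2000 ms. demo_calls counts queries sent. */
int demo_calls;

enum reply demo_query(size_t server, int timeout_ms)
{
    demo_calls++;
    return (server == 1 && timeout_ms >= 2000) ? REPLY_POSITIVE : REPLY_NONE;
}
```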

It's unclear what we should do in case of an invalid packet: same as no answer and retry until timeout?

If the DNSSEC verification fails for whatever reason, it should not be treated as no answer.

DNS can return both positive and negative answers.

DNS can signal the no-address case by returning a successful response (SUCCESS) with an empty answer. With DNSSEC, the resolver should retry until it has a signed reply saying this.

Both FQDNs and non-FQDNs can be looked up in this source, but most non-FQDNs will result in either no address or a negative answer.

If no nameserver is configured it should either default to localhost or to an empty list of servers. If an empty list of servers is used there is no answer.

mDNS specifics

mDNS can only do lookups in the .local domain, and so the hostname should always be an FQDN. Asking about any other domain or a non-FQDN should result in no answer.

mDNS queries should be retried until there is an answer or an error, or until a timeout is reached. The interval between packets should probably increase.

It's unclear what the best behavior is in case there was no reply, since the host might currently be down or otherwise unreachable. The resolver should probably return no answer.

Since this is based on DNS, it can also return the case of no address.

There is a problem with the .local domain, since both DNS and mDNS claim to be authoritative for it, and DNS will always return a negative answer for it if a root nameserver can be reached. Changing the order of the sources doesn't solve this. Therefore, when mDNS returns no answer for a host in the .local domain, it should prevent the next source from being tried, and DNS should come after mDNS.
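
The .local rule can be expressed as a small check on the hostname. The sketch below uses two hypothetical helpers, in_local_domain() and may_fall_through_after_mdns(); it assumes names are compared case-insensitively and that a single trailing root dot is allowed.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Returns 1 if `name` ends in ".local" (optionally followed by the root
   dot), 0 otherwise. A bare "local" without a host label does not count. */
int in_local_domain(const char *name)
{
    static const char suffix[] = ".local";
    const size_t slen = sizeof suffix - 1;
    size_t len = strlen(name);

    if (len > 0 && name[len - 1] == '.')
        len--;                               /* ignore trailing root dot */
    return len > slen &&
           strncasecmp(name + len - slen, suffix, slen) == 0;
}

/* After mDNS produced no answer: may the next source (such as DNS) still
   be tried? Not for .local names, per the rule described above. */
int may_fall_through_after_mdns(const char *name)
{
    return !in_local_domain(name);
}
```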

Configuration changes

There are various reasons why the configuration files can change while a process is running. getaddrinfo() should re-read those files if they have changed.
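
One possible way to detect such changes is to remember a few stat() fields of each configuration file and compare them before every lookup. The sketch below shows that approach; struct config_stamp and config_changed() are hypothetical names for this example, and the choice of fields (modification time, size, inode) is an assumption rather than a requirement. It relies on the st_mtim field from POSIX.1-2008.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* What was seen the last time the file was examined. Zero-initialize it
   before the first call. */
struct config_stamp {
    int valid;              /* 0 until the file has been seen once */
    struct timespec mtime;
    off_t size;
    ino_t ino;
};

/* Returns 1 if the file at `path` must be (re)read, 0 if it looks
   unchanged since the previous call with the same stamp. */
int config_changed(const char *path, struct config_stamp *stamp)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        /* Missing or unreadable: report a change only if it existed. */
        int was_valid = stamp->valid;
        stamp->valid = 0;
        return was_valid;
    }
    if (stamp->valid &&
        stamp->mtime.tv_sec == st.st_mtim.tv_sec &&
        stamp->mtime.tv_nsec == st.st_mtim.tv_nsec &&
        stamp->size == st.st_size &&
        stamp->ino == st.st_ino)
        return 0;

    stamp->valid = 1;
    stamp->mtime = st.st_mtim;
    stamp->size = st.st_size;
    stamp->ino = st.st_ino;
    return 1;
}
```

Comparing the inode as well as the timestamp catches the common case where an editor or a network-management tool replaces the file by renaming a new one into place.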

None: NameResolver (last edited 2014-03-28 20:12:37 by CarlosODonell)