Support for DNSSEC in the GNU C Library

Contents

1. Introduction
2. Configuration
3. Programming Interfaces (APIs)
3.1. Stub Resolver Enhancements
3.2. NSS Enhancements
4. Policy
5. Open Questions
6. References

1. Introduction

The solutions discussed on this page were implemented on 2019-10-19 in commit 446997ff1433d33452b81dfa9e626b8dccf101a4 and released as part of glibc 2.31 on 2020-02-01.

Domain Name System Security Extensions (DNSSEC), as described by RFC 4033, represents today's best practice for providing DNS data integrity.

As of today (2015-10-30), the stub resolver in glibc passes the DNSSEC AD-bit information to all callers of the stub resolver API (via RES_USE_EDNS0 and RES_USE_DNSSEC). The client can see whether the recursive resolver or an authoritative resolver has verified the cryptographic signatures (that is, whether it acted as a validating resolver). Unfortunately, the client cannot implicitly trust the AD-bit information in the glibc APIs, because there is no API for determining the trust chain behind it.
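
To make concrete what "looking at the AD-bit" means, here is a minimal sketch of how a caller might test that bit in a raw reply buffer such as the one filled in by res_query. The function name dns_reply_ad_bit and the macro names are hypothetical and chosen for illustration; only the bit position comes from the DNS wire format.

```c
#include <stddef.h>

/* Size of the fixed DNS message header in bytes (RFC 1035). */
#define DNS_HEADER_LEN 12

/* The AD (Authentic Data) flag lives in the fourth header byte:
   RA | Z | AD | CD | RCODE(4).  Hypothetical macro name. */
#define DNS_FLAG_AD 0x20

/* Return 1 if the AD bit is set in the reply, 0 if it is clear,
   and -1 if the buffer is too short to contain a DNS header.
   Hypothetical helper, not a glibc function. */
int dns_reply_ad_bit(const unsigned char *reply, size_t len)
{
    if (reply == NULL || len < DNS_HEADER_LEN)
        return -1;
    return (reply[3] & DNS_FLAG_AD) ? 1 : 0;
}
```

A result of 1 only says that whoever produced the reply set the bit. It says nothing about whether that resolver, or the path to it, deserves trust, which is exactly the gap described above.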

The AD-bit for DNSSEC is an on-the-wire DNS implementation detail that should not be exposed to applications via higher-level APIs. Applications look at the AD-bit because it is the simplest mechanism to use; the next simplest is to link with a DNS library and use that library's trust API to harden the application. One example would be to link with the getdns library.

It would be beneficial if glibc could support DNSSEC in the following areas:

Configuration.
Programming interfaces (APIs).
Policy.

2. Configuration

The very first problem when using DNSSEC is trusting the AD-bit and the meaning it has, namely that the recursive resolver you have configured has validated the result as "secure."

Several solutions have been proposed for this problem. The first is to create an /etc/trusted-resolv.conf that contains sources of trusted DNS information for which the AD-bit would be allowed to pass to clients, while entries in /etc/resolv.conf would have their AD-bit stripped. Such a solution is complicated for many reasons, including the addition of a new configuration file, the need to train system administrators to use it, and the need for applications and tools to process it. All of these add undue cost to adopting such a solution. Lastly, no proposal prevents badly behaved applications from also writing to /etc/trusted-resolv.conf and perpetuating the same problem that already exists with /etc/resolv.conf. The design shifts the problem to another file but doesn't solve it. In an attempt to clarify the meaning of /etc/resolv.conf, Carlos O'Donell submitted a patch, since committed, to the Linux man-pages project to make it clear that the community considers /etc/resolv.conf to be a trusted source of resolvers. Several senior community members, including Roland McGrath, Carlos O'Donell, and Rich Felker, have suggested that a new file is not the correct solution given the existing requirements.

The recommended course of action is twofold:

Keep /etc/resolv.conf the only source of trusted name servers, and have all downstream distributions follow solutions like "Fedora Change Request: Default Local DNS Resolver" or "Improvement to DNS resolving in Ubuntu". These solutions rephrase the question in terms of higher-level policy, trust, and network interfaces, which are all constructs that the low-level stub resolver does not need to know about.
Add an ad-flag option to be used with the options keyword in /etc/resolv.conf and implement a fail-safe mechanism that sets the AD-bit to zero if ad-flag is not set in options. This makes legacy installations fail-safe, and newer installations can set ad-flag after initializing their preferred trusted system configuration. The ad-flag option will map to a private RES_AD_FLAG option that is used only internally by the implementation; applications must not set it. However, applications may examine the options field, and if RES_AD_FLAG is set to 1, the application can infer that the runtime knows about the new feature and is in a secure mode where /etc/resolv.conf is fully trusted. Alternatively, the private _flags member of the resolver structure can be used with a new internal flag and a public macro that applications use to check for the trusted state. This has the benefit of hiding the internal details, but it makes one bit of the _flags member part of the public API (its meaning can never change). Note that a macro is used to support backporting to older distributions without impacting the exposed ABI/API, but we must ensure that _flags is set to zero (it should always be set to 0 by __res_vinit and others); this way an application can check for trust even if the glibc in use doesn't support the new options.
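
The fail-safe behavior in the second item can be sketched roughly as follows. This is an illustration under stated assumptions, not the glibc implementation: the helper names options_line_has_ad_flag and apply_ad_fail_safe are hypothetical, the ad-flag keyword is the proposed option name from the text above, and a real resolver would reuse its existing resolv.conf parser rather than tokenizing lines this way.

```c
#include <stddef.h>
#include <string.h>

#define DNS_HEADER_LEN 12   /* fixed DNS header size in bytes */
#define DNS_FLAG_AD 0x20    /* AD bit in the fourth header byte */

/* Return 1 if LINE is a resolv.conf "options" line that lists the
   proposed ad-flag keyword, and 0 otherwise.  Hypothetical helper. */
int options_line_has_ad_flag(const char *line)
{
    char copy[256];
    char *tok;

    if (line == NULL || strlen(line) >= sizeof copy)
        return 0;
    strcpy(copy, line);   /* strtok modifies its argument */

    tok = strtok(copy, " \t\n");
    if (tok == NULL || strcmp(tok, "options") != 0)
        return 0;
    while ((tok = strtok(NULL, " \t\n")) != NULL)
        if (strcmp(tok, "ad-flag") == 0)
            return 1;
    return 0;
}

/* Fail-safe: unless the administrator marked the configuration as
   trusted, clear the AD bit before the reply reaches the caller. */
void apply_ad_fail_safe(unsigned char *reply, size_t len, int trusted)
{
    if (reply == NULL || len < DNS_HEADER_LEN)
        return;
    if (!trusted)
        reply[3] &= (unsigned char)~DNS_FLAG_AD;
}
```

With this shape, a legacy /etc/resolv.conf that has no ad-flag entry yields trusted == 0, so every reply is delivered with the AD bit cleared. An installation whose options line includes ad-flag passes the bit through unchanged.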

Solutions that include adding a validating component to the glibc stub resolver are not being considered because of the maintenance cost and security implications. Adding significant crypto to the stub resolver would cause it to be available in all processes and increase the attack surface for all processes. In the local validating resolver case it is just one process, the resolver, that is compromised and not all of the system processes, some of which may not need DNS.

A proposal by Zack Weinberg explains how nscd could be used instead of a local validating resolver. The points expressed are compelling, but the most significant problem is that it couples caching with DNSSEC validation. Some sites may not need caching and may see this as problematic for their configuration. Consider a site that needs uncached NSS information: such a site would have to enable nscd and then use SELinux AVCs to disable access to the relevant database caches for clients (which results in a lookup attempt, a failure, and a fallback). So although it is possible to put the DNSSEC validation in nscd, doing so still doesn't solve the API and policy questions, and it complicates site configuration. One could make nscd DNSSEC-aware so that this information could be cached, but that is a distinct issue.

3. Programming Interfaces (APIs)

3.1. Stub Resolver Enhancements

Given the requirements for a system that fails safe, and making DNSSEC easy to use, there are two potential glibc stub resolver enhancements to be considered:

A set of new flags to increase the usability of the existing APIs with regard to DNSSEC, but the design must be considered thoroughly before inclusion. It would be better if an experimental library adopted the changes and tested them out before adoption in the glibc stub resolver. See DNSSEC support in stub-resolver for a discussion of the required API changes.

3.2. NSS Enhancements

NSS (Name Service Switch) provides a higher-level interface, and includes host name resolution support, which in turn can use DNS and libresolv.

getaddrinfo could be enhanced to return secure replies only, or provide trust flags for the data it returns. The latter is difficult because struct addrinfo has a fixed size and is part of the ABI. But glibc provides a specialized deallocation function, freeaddrinfo, so workarounds are possible.
It may make sense to provide higher-level APIs for key material access and expose them through NSS. For example, an application might want to obtain the expected X.509 certificate of a TLS server, without knowing if it is stored in DNS (via DANE) or in LDAP.

4. Policy

What kinds of policies are required to correctly implement and support DNSSEC?

Misconfiguration?
- It is well understood that misconfigurations of DNSSEC systems happen. How do site administrators handle scenarios where exceptions need to be made?
System defaults?
- How are defaults selected: to drop all responses that are not "secure", or to drop only those that are neither "secure" nor "indeterminate"?

These questions and more should be answered by a policy framework. Today that policy framework is probably going to be very simple.
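
As a rough illustration of how simple such a framework could be, the sketch below encodes the two candidate system defaults from the question above as a single decision function. The status names follow the four validation outcomes described in RFC 4033; the policy enum and the function policy_accepts are hypothetical and not part of any existing API.

```c
/* Validation outcomes as named in RFC 4033. */
enum dnssec_status {
    DNSSEC_SECURE,
    DNSSEC_INSECURE,
    DNSSEC_BOGUS,
    DNSSEC_INDETERMINATE
};

/* Hypothetical system-wide defaults. */
enum dnssec_policy {
    POLICY_REQUIRE_SECURE,       /* drop everything that is not secure */
    POLICY_ALLOW_INDETERMINATE   /* also let indeterminate answers through */
};

/* Return 1 if a response with the given status may be delivered to
   the application under the given policy, 0 if it must be dropped. */
int policy_accepts(enum dnssec_policy policy, enum dnssec_status status)
{
    switch (policy) {
    case POLICY_REQUIRE_SECURE:
        return status == DNSSEC_SECURE;
    case POLICY_ALLOW_INDETERMINATE:
        return status == DNSSEC_SECURE || status == DNSSEC_INDETERMINATE;
    }
    return 0;   /* unknown policy value: fail safe and drop */
}
```

Per-name exceptions for known misconfigurations would need a lookup in front of this function. That is where most of the real policy complexity would likely end up.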

5. Open Questions

Is it sufficient to make changes to libresolv, or is there an impact on NSS as well?
Does the search path (search directive in resolv.conf) need protection?
What happens to the AD bit in a reply if it was resolved with the help of the search path (i.e., the queried name does not match what the application specified)?
If a name resolution is the result of multiple queries and replies, what should be done if some of the replies carry the AD bit, and others do not?
Would splitting libresolv and nss_dns from glibc help? (Something similar has been done with the Sun RPC implementation, but it could not be completed because of ABI impact.)
Is access to the AD bit sufficient? What about zones preloaded into the recursive resolver, which cause the AA bit, but not the AD bit, to be set in responses? Would applications expect a synthesized AD bit on the response in this case?
What are the current IETF guidelines on application use of the AD bit?