This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
On glibc's resolver
- From: Dimitrios Apostolou <jimis at gmx dot net>
- To: libc-alpha at sourceware dot org
- Date: Wed, 26 Dec 2012 05:14:23 +0200 (EET)
- Subject: On glibc's resolver
Hello list,
I was trying to write a patch for glibc, so hopefully this is the
appropriate list; please let me know otherwise.
I have been tracing weird behaviour of my mail client (alpine) and ended
up in getaddrinfo() calls, which are handled by glibc's resolver. In
particular, when I connect my laptop to a different network and the
previous DNS server is unreachable, the resolver never refreshes its
cached configuration and all queries time out after several retries.
Apparently this is a known issue, and a web search reveals discussions
from as early as 2003. I'd appreciate your opinions. I was thinking of
writing a patch, but I can't figure out where it should go: alpine or
glibc, code or documentation! Here are the suggestions I gathered from a
web search:
1) Use a caching daemon (nscd maybe, though some argue that it does not
provide a solution), which should be restarted/reloaded when changing
networks.
2) Call res_init() if getaddrinfo() fails.
3) Patch glibc to stat() /etc/resolv.conf, checking for changes. Debian
and Ubuntu are patched this way.
4) Use a custom DNS library, glibc is unsuitable for this purpose.
Here is my take. About nscd, I'm having the problem on a major distro
(Fedora) so I can only guess there are good reasons for not using it by
default.
On (2), res_init() is a non-standard BSD function, and its man page
doesn't mention such a purpose. In fact, I can't be sure that it's safe to
call it multiple times, and I see no guarantee that it will re-initialise
the resolver more than once. If it's the proposed way, shouldn't it be
mentioned in both res_init()'s and getaddrinfo()'s man pages, or otherwise
shouldn't there be a big warning that resolv.conf is never reparsed?
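
For reference, option (2) could look roughly like the sketch below on the
application side. The helper name lookup_with_reinit() is hypothetical and
not part of glibc or alpine. Retrying only on EAI_AGAIN, and only once, is
an assumption of this sketch. Whether res_init() is safe to call repeatedly
is exactly the open question above, so treat this as an illustration rather
than a recommended fix.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Hypothetical helper: resolve host/service and, if the lookup fails with
   a temporary error, ask the resolver to re-read its configuration via
   res_init() and try exactly one more time.
   Returns the getaddrinfo() result code (0 on success). */
int lookup_with_reinit(const char *host, const char *service,
                       struct addrinfo **result)
{
    struct addrinfo hints = {0};
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* e.g. an IMAP/SMTP connection */

    int rc = getaddrinfo(host, service, &hints, result);
    if (rc == EAI_AGAIN) {
        /* Assumption: a timeout against a stale name server shows up as
           EAI_AGAIN. Re-initialise the resolver state and retry once. */
        if (res_init() == 0)
            rc = getaddrinfo(host, service, &hints, result);
    }
    return rc;
}
```

On success the caller owns *result and releases it with freeaddrinfo().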
On (3), I don't have a Debian system to check it, but the overhead of
stat'ing on every request is probably unacceptable. I was thinking of
writing a patch that would stat() and reparse only after a single request
times out, so that the following retries (unless RES_DFLRETRY is reached)
will automatically connect to the new servers. Would that be acceptable?
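
The check itself is small. Below is a minimal sketch of the "stat() only
after a timeout" idea, written as application-level code and not as the
actual glibc patch. The names file_changed() and
reload_resolver_if_changed() are hypothetical, and comparing st_mtim is an
assumption about what "changed" should mean.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Hypothetical helper: compare the file's modification time with the one
   remembered in *last.
   Returns 1 if it differs (and updates *last), 0 if it is unchanged,
   -1 if the file cannot be stat()ed (*last is left untouched). */
int file_changed(const char *path, struct timespec *last)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (st.st_mtim.tv_sec == last->tv_sec &&
        st.st_mtim.tv_nsec == last->tv_nsec)
        return 0;
    *last = st.st_mtim;
    return 1;
}

/* Hypothetical helper, meant to be called after one request has timed
   out: re-initialise the resolver only if /etc/resolv.conf has changed.
   Returns 1 if reloaded, 0 if nothing changed, -1 on error. */
int reload_resolver_if_changed(struct timespec *last)
{
    int changed = file_changed("/etc/resolv.conf", last);

    if (changed == 1)
        return res_init() == 0 ? 1 : -1;
    return changed;
}
```

Initialising the remembered timestamp with tv_nsec = -1 works as a "never
seen" marker, since stat() never reports a negative nanosecond field. The
cost is one stat() per timeout instead of one per request.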
Finally, using a custom library sounded logical until I started reading
glibc's resolver. Really, with such size and complexity, and even an
asynchronous interface provided, shouldn't we also provide the simplest
facilities?
And a related question: is there a way to set up resolver behaviour
(timeout, retries) for a process programmatically, instead of changing the
system-wide resolv.conf?
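
To make the question concrete, here is a sketch of the kind of thing I
have in mind. The retrans and retry fields of the _res state from
<resolv.h> do exist, but whether getaddrinfo() honours values set this way
is an assumption I have not verified, and set_resolver_limits() is a
hypothetical name. resolv.conf(5) also documents a RES_OPTIONS environment
variable (e.g. "timeout:2 attempts:1"), which is per-process but not
really programmatic.

```c
#define _DEFAULT_SOURCE
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Hypothetical helper: initialise the resolver state for the calling
   thread, then override the per-query timeout (in seconds) and the
   number of attempts.
   Returns 0 on success, -1 if res_init() fails. */
int set_resolver_limits(int timeout_seconds, int attempts)
{
    if (res_init() != 0)
        return -1;
    _res.retrans = timeout_seconds;  /* retransmission interval */
    _res.retry = attempts;           /* number of tries per query */
    return 0;
}
```

Note that _res is per-thread state in glibc, so a setting made this way in
one thread would presumably not carry over to lookups made in another.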
Thank you in advance,
Dimitris