[Bug network/24047] libresolv should use IP_RECVERR to avoid long timeouts

sourceware at isomer dot meta.net.nz sourceware-bugzilla@sourceware.org
Tue Jan 8 09:10:00 GMT 2019


https://sourceware.org/bugzilla/show_bug.cgi?id=24047

--- Comment #5 from Perry Lorier <sourceware at isomer dot meta.net.nz> ---
I've spent today continuing the investigation: spinning up different kernel
versions on many different VMs, finding bare-metal machines to run various
experiments on, and so on.  I think I've finally figured out what is
going on.

With a connect()'d socket, by default you will receive only *non-transient*
errors (in the sense of RFC1122, section 3.2.2.1).  See
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv4/icmp.c#n121

and RFC1122, section 3.2.2.1:
            A Destination Unreachable message that is received with code
            0 (Net), 1 (Host), or 5 (Bad Source Route) may result from a
            routing transient and MUST therefore be interpreted as only
            a hint, not proof, that the specified destination is
            unreachable [IP:11].  For example, it MUST NOT be used as
            proof of a dead gateway (see Section 3.3.1).

So without IP_RECVERR, the kernel will mask so-called "transient" errors (such
as destination host/net unreachable) and deliberately *not* report them.

With IP_RECVERR, the kernel dutifully passes them along, and expects user space
to understand how to handle them appropriately.

Port unreachable, admin prohibited, etc., are considered "non-transient" and are
reported back to connected UDP sockets even without IP_RECVERR.

I believe that DNS needs to be treated as a "realtime"-like protocol, as it
may be blocking some realtime behaviour (such as a human surfing a webpage).
As such, the resolver needs to understand that a server is (potentially
temporarily) unreachable, and be able to try an alternate server that,
hopefully, is available.

Thus, I believe libresolv should still use IP_RECVERR, so that it receives all
ICMP errors and can use them to decide whether to retransmit or move on to the
next nameserver.
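
As a rough illustration of that decision, the sketch below maps an errno
taken from the error queue to "keep waiting" or "try the next nameserver".
The names ns_action, action_for_errno() and next_nameserver() are
hypothetical, and the policy itself is only an assumption about what a
resolver might do, not what libresolv implements.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical policy outcome for a stub resolver. */
enum ns_action { NS_KEEP_WAITING, NS_TRY_NEXT };

/* Map an errno reported for the socket to an action.  The set of
   errnos treated as "give up on this server" is an assumption. */
enum ns_action action_for_errno(int err)
{
    switch (err) {
    case ENETUNREACH:   /* ICMP net unreachable ("transient" hint) */
    case EHOSTUNREACH:  /* ICMP host unreachable ("transient" hint) */
    case ECONNREFUSED:  /* ICMP port unreachable: nothing listening */
        return NS_TRY_NEXT;
    default:
        return NS_KEEP_WAITING;
    }
}

/* Return the index of the nameserver to query next, wrapping around
   the configured list.  With an empty list the result is always 0. */
size_t next_nameserver(size_t current, size_t count, int err)
{
    if (count == 0)
        return 0;
    if (action_for_errno(err) == NS_TRY_NEXT)
        return (current + 1) % count;
    return current;
}
```

Because the unreachable errors are only hints, this sketch moves on to the
next server rather than marking the current one dead, so it can be retried
on a later pass.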

(Also, if anyone's interested, feel free to ask me about why this behaviour is
all due to the historic way ICMP Redirects work...  But I feel that's well
outside the scope of this bug)


