I have a (hopefully not so crazy) idea about getaddrinfo. Since the purpose of getaddrinfo is to look up endpoints from symbolic descriptions (host and service names), isn't it true that getaddrinfo not only could, but even *should* pay attention to DNS SRV records? For example, if I call getaddrinfo("dolda2000.com", "http", ...), getaddrinfo could check the _http._tcp.dolda2000.com (and ..._udp...) name for SRV records and, if they exist, return addrinfo items for the nodes that the SRV records point to. In my mind, it would increase flexibility a great deal. If this suggestion is welcome, I'd be happy to write code for it. Are there any obvious problems that I've missed in my enthusiasm? If it would make getaddrinfo too nonstandard, surely it could be turned on explicitly with a flag?
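For illustration, the SRV owner name described above can be built from the two getaddrinfo arguments plus the protocol. This is only a sketch: the helper name srv_owner_name and its return convention are assumptions made for this example and are not part of glibc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): writes the SRV owner name
   "_<service>._<proto>.<domain>" into buf.
   Returns 0 on success, -1 on an empty argument or if buf is too small. */
int srv_owner_name(char *buf, size_t buflen, const char *service,
                   const char *proto, const char *domain)
{
    assert(buf != NULL && buflen > 0);
    if (service == NULL || proto == NULL || domain == NULL)
        return -1;
    if (service[0] == '\0' || proto[0] == '\0' || domain[0] == '\0')
        return -1;

    int n = snprintf(buf, buflen, "_%s._%s.%s", service, proto, domain);
    if (n < 0 || (size_t)n >= buflen)
        return -1;              /* the name did not fit and was truncated */
    return 0;
}
```

Called with ("http", "tcp", "dolda2000.com"), the helper produces the name _http._tcp.dolda2000.com from the example above.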
getaddrinfo is documented in RFC 2553, and your suggestion seems to break the spec. We therefore will not implement it.
I just read RFC 2553, and I don't really see how it would break it. The only thing I can see is that a symbolic service like one resolved through an SRV name isn't exactly a "node" name. However, if that is the problem, can this behavior not simply be explicitly turned on using a flag in the hints structure, such as AI_SYMBOLICNODE or similar? That way, it wouldn't break anything.
If you want this functionality, write a new NSS module. It should have a function like: int getportbyhost (const char *host, const char *service). The function should do the obvious lookup. Then getaddrinfo could, if /etc/nsswitch.conf contains an appropriate entry, determine the port number using the module. Somebody needs to write the patch.
Actually, it won't work. RFC 2782 specifies that all SRV records have a host name attached. It's not just information about the port. Given that, what would getaddrinfo ("host1.domain", "someserv", ....) mean if the SRV record for _someserv._tcp.domain has the host name "host2.domain" associated? It makes no sense. There can only be a function which queries the SRV records based on the service name alone. Trying to embed this into getaddrinfo is no good.
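To make the point about the attached host name concrete: per RFC 2782, the data of an SRV record consists of priority, weight, port and a target host name, in that order. The following is a minimal sketch of decoding those fields from raw record data. The struct and function names are invented for this example, and it assumes the target name is stored uncompressed (a real resolver would decode the name with its own routines).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative container for the four SRV fields from RFC 2782. */
struct srv_rdata {
    uint16_t priority;
    uint16_t weight;
    uint16_t port;
    char target[256];   /* dotted host name, e.g. "host2.domain" */
};

/* Hypothetical parser: decodes uncompressed SRV RDATA.
   Returns 0 on success, -1 on malformed or oversized input. */
int parse_srv_rdata(const unsigned char *rdata, size_t len,
                    struct srv_rdata *out)
{
    assert(rdata != NULL && out != NULL);
    if (len < 7)        /* three 16-bit integers plus at least the root label */
        return -1;

    out->priority = (uint16_t)((rdata[0] << 8) | rdata[1]);
    out->weight   = (uint16_t)((rdata[2] << 8) | rdata[3]);
    out->port     = (uint16_t)((rdata[4] << 8) | rdata[5]);

    size_t pos = 6, used = 0;
    out->target[0] = '\0';
    for (;;) {
        if (pos >= len)
            return -1;              /* ran out of data before the root label */
        size_t label = rdata[pos++];
        if (label == 0)
            break;                  /* root label ends the name */
        if (label > 63 || pos + label > len)
            return -1;              /* also rejects compression pointers */
        if (used + label + 2 > sizeof out->target)
            return -1;              /* no room for '.', the label and NUL */
        if (used > 0)
            out->target[used++] = '.';
        memcpy(out->target + used, rdata + pos, label);
        used += label;
        pos += label;
        out->target[used] = '\0';
    }
    return 0;
}
```

The decoded record always yields both a port and a target name, which is why a port-only lookup function cannot represent it.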
If _someserv._tcp.domain has the host name "host2.domain" associated, then the proper response would be for getaddrinfo() to do a name-to-IP lookup on "host2.domain". Yes, this is more work than usually happens for a getaddrinfo() call, but I think this would be the proper way to handle SRV record support. In sysdeps/posix/getaddrinfo.c, right after the section of code at line 2131, "if (service && service[0])", would be a good place to add the SRV record lookup. If we know the service and we know the protocol, then we should first query for SRV records; if none are returned, continue as the code currently works, and if SRV records are returned, handle them. Thoughts?
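The control flow proposed here (try SRV first, otherwise carry on as today) could look roughly like the sketch below. Everything in it is hypothetical: choose_targets, srv_lookup_fn and struct srv_target are names invented for the example, fake_srv_lookup is a hard-coded stand-in for a real DNS query, and port 8080 is made-up data. It is not the glibc code path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct srv_target {
    char host[256];
    unsigned short port;    /* 0 means "resolve the service name as usual" */
};

/* Hypothetical callback: fills out[] and returns the number of SRV
   targets found for _<service>._<proto>.<domain>, or 0 if there are none. */
typedef size_t (*srv_lookup_fn)(const char *service, const char *proto,
                                const char *domain,
                                struct srv_target *out, size_t max);

/* Stand-in for a DNS query, hard-coded with the example from the thread. */
size_t fake_srv_lookup(const char *service, const char *proto,
                       const char *domain, struct srv_target *out, size_t max)
{
    if (max == 0 || strcmp(service, "someserv") != 0 ||
        strcmp(proto, "tcp") != 0 || strcmp(domain, "domain") != 0)
        return 0;
    snprintf(out[0].host, sizeof out[0].host, "%s", "host2.domain");
    out[0].port = 8080;     /* made-up port, for illustration only */
    return 1;
}

/* Decide which hosts to resolve: SRV targets if any exist, else the node. */
size_t choose_targets(const char *node, const char *service, const char *proto,
                      srv_lookup_fn srv_lookup,
                      struct srv_target *out, size_t max)
{
    assert(node != NULL && out != NULL && max > 0);
    if (service != NULL && service[0] != '\0' &&
        proto != NULL && srv_lookup != NULL) {
        size_t n = srv_lookup(service, proto, node, out, max);
        if (n > 0)
            return n;       /* SRV records exist: resolve these hosts */
    }
    /* No SRV records: fall back to the node name itself. */
    snprintf(out[0].host, sizeof out[0].host, "%s", node);
    out[0].port = 0;
    return 1;
}
```

With this stand-in, a lookup for ("domain", "someserv") is redirected to host2.domain, while ("host1.domain", "someserv") finds no SRV records and falls through to the ordinary host lookup.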
More specifically: applications that use getaddrinfo() pass in a host name, (sometimes) a service name, a struct addrinfo *hints, and the response pointer struct addrinfo **res. Usually, after checking for an error, they pass the returned addresses directly to connect(). The current advantage is that the application does not have to deal with IPv4 or IPv6 differences (unless it wants to restrict itself to only one of the two). In the file sysdeps/posix/getaddrinfo.c, a few lines into the function gaih_inet(), right after the protocol and socket type are checked, I propose adding a check like "if(service != NULL && (req->ai_flags & GAI_SRV_ENABLE)) { /* handle SRV lookups */}". This way, if the SRV lookup does return a list of addresses, getaddrinfo() will return the addresses from the SRV records, in the order they should be used; if the SRV lookup does not return any useful records, getaddrinfo() will fall back to the standard lookups below. Thoughts?
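As a sketch of what "in the order they should be used" could mean: RFC 2782 says lower priority values are tried first, and records with equal priority are chosen by weighted random selection. The example below sorts by priority and then, as a deliberate simplification, by descending weight instead of doing the random selection. The struct and function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative SRV entry; field names are chosen for this example. */
struct srv_entry {
    unsigned short priority;
    unsigned short weight;
    unsigned short port;
    const char *target;
};

/* Lower priority first; within one priority, higher weight first.
   RFC 2782 specifies a weighted random choice within a priority level;
   the deterministic weight ordering here is a simplification. */
static int srv_cmp(const void *a, const void *b)
{
    const struct srv_entry *x = a;
    const struct srv_entry *y = b;

    if (x->priority != y->priority)
        return x->priority < y->priority ? -1 : 1;
    if (x->weight != y->weight)
        return x->weight > y->weight ? -1 : 1;
    return 0;
}

/* Sorts n entries in place into the order in which they would be tried. */
void srv_sort(struct srv_entry *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n > 1)
        qsort(v, n, sizeof *v, srv_cmp);
}
```

A getaddrinfo() with SRV support would then resolve each target in this order and append the resulting addresses to the returned list.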
(In reply to comment #4)
> Actually, it won't work. RFC 2782 specifies that all SRV records have a host
> name attached. It's not just information about the port. Given that, what would
>
> getaddrinfo ("host1.domain", "someserv", ....)
>
> mean if the SRV record for
>
> _someserv._tcp.domain

I'm quite sure that when the SRV spec says "domain", it is referring to the full domain name, not the domain in the sense of domainname(1). I.e., you would search for _someserv._tcp.host1.domain instead of _someserv._tcp.domain. Am I misreading the spec here?

> has the host name "host2.domain" associated? It makes no sense. There can only
> be a function which queries the SRV records based on the service name alone.
> Trying to embed this into getaddrinfo is no good.

If the word "domain" in the SRV spec is interpreted properly, this objection makes no sense. Sure, it is likely enough that getaddrinfo("domain", "someserv", ...) will not tell you to go ahead and connect directly to "domain". But getaddrinfo("host1.domain", "someserv", ...) would likely not hit any SRV records at all and would fall back to the traditional DNS lookups.

The main objection to this change would be that programs would break if getaddrinfo(node, serv, ...) suddenly tried to find the appropriate host for accessing serv at node. In reality, few domains set SRV records for services where there is no program support. So most programs which would be affected by this change would behave no differently if getaddrinfo() started actually looking up services instead of just hosts.

It would be really nice to get SRV support in applications with no added complexity. Maybe the interface provided by ruli http://nongnu.org/ruli/tutorial/getaddrinfo.html is a way to get these advantages without departing too far from getaddrinfo()...
I must agree with binki.

> I'm quite sure that when the SRV spec says "domain", it is referring to the
> full domain. Not the domain in the sense of domainname(1). I.e., you would
> search for _someserv._tcp.host1.domain instead of _someserv._tcp.domain. Am I
> misreading the spec here?

Exactly. You would, for example, ask for a client XMPP connection; the service is 'xmpp-server' and the socktype is SOCK_STREAM (TCP). You are going to connect as user@example.net, therefore "example.net" is the domain. The getaddrinfo() call would roughly look like this:

    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_SRVLOOKUP;
    code = getaddrinfo("example.net", "xmpp-server", &hints, &result);

This would translate to a DNS SRV query for:

    _xmpp-server._tcp.example.net

And in the absence of such a record, it would fall back to A/AAAA records:

    example.net

The way SRV records work is very similar to the way MX records work.

> If the word "domain" in the SRV spec is interpreted properly, this objection
> makes no sense. Sure, it is likely enough that getaddrinfo("domain",
> "someserv", ...) will not tell you to go ahead and connect directly to
> "domain". But getaddrinfo("host1.domain", "someserv", ...) would likely not hit
> any SRV records at all and fall back to the traditional DNS lookups.
>
> The main objection to this change would be that programs would suddenly break
> if getaddrinfo(node, serv, ...) would suddenly tried to find the appropriate
> host for accessing serv at node. In reality, few domains set SRV records for
> services where there is no program support. So, most programs which would be
> affected by this change would behave no differently if getaddrinfo() started
> actually looking up services instead of just hosts.

That's correct. But still it's probably better to make it optional.

> It would be really nice to get SRV support in applications with no added
> complexity. Maybe the interface provided by ruli
> http://nongnu.org/ruli/tutorial/getaddrinfo.html is a way to get these
> advantages without departing too far from getaddrinfo()...

I've seen this one also.
I hope you won't find this offensive, but I still feel it's better to reopen this old bug report than to open a new one, and that this is fair even for something from 2005. Cheers, Pavel
I think it would be helpful to begin this discussion with the people responsible for the RFC that originally defined getaddrinfo and/or with the Austin Group (POSIX) from the perspective that it's an important deficiency glibc potentially wants to address, with an interest in building a consensus on how it should work that could subsequently be adopted in the standards.
The current NSS interface does not support combined host/service lookups. NSS service modules only see the host name. This would have to be fixed first, probably by returning a struct addrinfo list directly from the service module, which requires significant interface changes.
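To show the shape such an interface change might take, here is a rough sketch of a lookup hook that sees both the host and the service and hands back a struct addrinfo list directly. The combined_lookup_fn type and stub_combined_lookup are hypothetical and are not an existing NSS interface. The stub answers only one hard-coded pair, using a documentation address (192.0.2.10) and port 5269 as placeholder data, and it relies on the standard getaddrinfo() with numeric flags so that no network access is needed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical hook type: a service module that receives host AND service
   and returns a ready-made addrinfo list (freed with freeaddrinfo()). */
typedef int (*combined_lookup_fn)(const char *host, const char *service,
                                  const struct addrinfo *hints,
                                  struct addrinfo **res);

/* Stub module: knows exactly one host/service pair, with placeholder data.
   Returns 0 on success or an EAI_* error code, like getaddrinfo(). */
int stub_combined_lookup(const char *host, const char *service,
                         const struct addrinfo *hints, struct addrinfo **res)
{
    assert(host != NULL && service != NULL && res != NULL);
    if (strcmp(host, "example.net") != 0 ||
        strcmp(service, "xmpp-server") != 0)
        return EAI_NONAME;

    struct addrinfo h;
    if (hints != NULL)
        h = *hints;
    else
        memset(&h, 0, sizeof h);
    /* Numeric flags: build the list without any DNS or services lookup. */
    h.ai_flags |= AI_NUMERICHOST | AI_NUMERICSERV;
    return getaddrinfo("192.0.2.10", "5269", &h, res);
}
```

A real module would perform the SRV query and the follow-up address lookups where the stub uses fixed values; the point of the sketch is only that the port and the addresses come back together in one list.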