We call getaddrinfo() inside our Xorg module. Unfortunately, Xorg uses setitimer() to generate recurring SIGALRM notifications at 20ms intervals, which can cause various libc calls to hang. This is a typical stack trace:

#0 0xffffe402 in __kernel_vsyscall ()
#1 0x00960e6d in poll () from /lib/tls/i686/nosegneg/libc.so.6
#2 0x0098e431 in clntudp_call () from /lib/tls/i686/nosegneg/libc.so.6
#3 0x00b49f84 in do_ypcall () from /lib/libnsl.so.1
#4 0x00b4a6c0 in yp_match () from /lib/libnsl.so.1
#5 0xf77aa351 in internal_gethostbyname2_r () from /lib/libnss_nis.so.2
#6 0x009809bb in gethostbyname2_r@@GLIBC_2.1.2 () from /lib/tls/i686/nosegneg/libc.so.6
#7 0x0094f4ea in gaih_inet () from /lib/tls/i686/nosegneg/libc.so.6
#8 0x00952c2d in getaddrinfo () from /lib/tls/i686/nosegneg/libc.so.6

The issue is that clntudp_call() retries poll() after every EINTR but does not adjust the timeout (see sunrpc/clnt_udp.c:L403), so poll() is called repeatedly with the same full timeout. We want clntudp_call() to return within utimeout seconds, but when the setitimer() interval is shorter than that timeout, every poll() is interrupted before it can expire and clntudp_call() loops forever.

The fix is to recompute the timeout passed to poll() each time we loop. (This is the normal way to handle EINTR with timeouts.)
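
For illustration, here is a minimal sketch of the retry pattern described above. It is not the glibc code and not the attached patch. The names poll_with_deadline, now_ms, on_alarm and demo_wait_ms are hypothetical, and the sketch assumes a non-negative millisecond timeout and a POSIX system with CLOCK_MONOTONIC. A deadline is computed once, and after each EINTR only the time still remaining is passed to poll().

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Current time in milliseconds on the monotonic clock. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical helper, not a glibc function: like poll(), but after an
   EINTR it retries with only the time left until the original deadline. */
static int poll_with_deadline(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    assert(timeout_ms >= 0);    /* assumption: no infinite (-1) timeout */
    long long deadline = now_ms() + timeout_ms;
    int remaining = timeout_ms;

    for (;;) {
        int n = poll(fds, nfds, remaining);
        if (n >= 0 || errno != EINTR)
            return n;           /* ready, timed out, or a real error */

        long long left = deadline - now_ms();
        if (left <= 0)
            return 0;           /* deadline passed while interrupted */
        remaining = (int)left;  /* shrink the timeout for the retry */
    }
}

static void on_alarm(int sig) { (void)sig; }

/* Demo: arm a recurring 20ms SIGALRM, as Xorg does, then wait on a pipe
   that never becomes readable. Returns elapsed ms, or -1 on failure. */
static long long demo_wait_ms(int timeout_ms)
{
    int pipefd[2];
    if (pipe(pipefd) != 0)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    struct itimerval tick = { {0, 20000}, {0, 20000} };  /* interval, value */
    struct itimerval off = { {0, 0}, {0, 0} };
    setitimer(ITIMER_REAL, &tick, NULL);

    struct pollfd pfd = { .fd = pipefd[0], .events = POLLIN };
    long long start = now_ms();
    int n = poll_with_deadline(&pfd, 1, timeout_ms);
    long long elapsed = now_ms() - start;

    setitimer(ITIMER_REAL, &off, NULL);                  /* disarm timer */
    close(pipefd[0]);
    close(pipefd[1]);
    return n == 0 ? elapsed : -1;
}
```

With the 20ms timer armed, demo_wait_ms(200) should return after roughly 200ms. A loop that passed the original timeout to poll() on every retry would never return, because each poll() would be interrupted long before its 200ms expired.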
Created attachment 7871
patch to fix the problem

The problem is not limited to NIS or UDP RPC; there are various parts of glibc that retry polls without recomputing timeouts. I'm testing this patch to fix them all.