Bug 14889

Summary: CVE-2011-4609 svc_run() produces high cpu usage when accept() fails with EMFILE
Product: glibc Reporter: law
Component: network Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 2.17   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description law 2012-11-28 21:09:05 UTC
If a process that calls svc_run() stays at its open-file limit for an extended
period, accept() in rendezvous_request()/svcudp_recv() fails with EMFILE.
svc_run() then spins through poll(), accept() and its 'for' loops, which
consumes a lot of CPU.

Steps to reproduce:
1. start portmap

# ulimit -n 1024
# service portmap restart

2. open as many connections to portmap as it has file descriptors left (its
'ulimit -n' minus the number of files it already has open):

# PORTMAP_ULIMIT_N=1024
# OPENED_FDS=$(find /proc/`pidof portmap`/fd -type l | wc -l)
# ulimit -n $(($PORTMAP_ULIMIT_N * 8))
# c=$(($PORTMAP_ULIMIT_N-$OPENED_FDS))
# while [ $c -gt 0 ]; do (nc -d localhost 111 &); ((c--)); done

3. check that CPU usage is still low:

# top -b -n 1 -p `pidof portmap`
top - 11:41:03 up  3:47,  3 users,  load average: 0.29, 0.34, 0.34
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s): 21.0%us, 11.5%sy,  0.6%ni, 61.3%id,  4.9%wa,  0.5%hi,  0.2%si,  0.0%st
Mem:   1001616k total,   981804k used,    19812k free,    27520k buffers
Swap:  1020116k total,       76k used,  1020040k free,   577096k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
12691 rpc       15   0  9896 2444  484 S  0.0  0.2   0:00.51 portmap            

#

4. run rpcinfo -p or another 'nc -d localhost 111 &' and check the portmap CPU
usage again:

# nc -d localhost 111 &

# top -b -n 1 -p `pidof portmap`
top - 11:44:51 up  3:51,  3 users,  load average: 0.96, 0.61, 0.44
Tasks:   1 total,   1 running,   0 sleeping,   0 stopped,   0 zombie
Cpu(s): 20.7%us, 12.5%sy,  0.6%ni, 60.7%id,  4.9%wa,  0.5%hi,  0.2%si,  0.0%st
Mem:   1001616k total,   982068k used,    19548k free,    27608k buffers
Swap:  1020116k total,       76k used,  1020040k free,   577096k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
12691 rpc       25   0  9896 2444  484 R 99.8  0.2   2:47.83 portmap


The problem is that accept() fails with EMFILE before the code that dequeues
the pending connection from the listening socket runs. The connection
therefore stays queued, the socket stays readable, and every poll() call in
svc_run() returns immediately. Each pass through the svc_run() loop copies the
whole array of active sockets into the array of sockets watched by poll(), and
a second loop then walks all watched sockets again looking for events. Because
accept() returns EMFILE and no error-handling code runs afterwards, this cycle
repeats many times in a short interval, which causes the high CPU usage.
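
The behaviour described above can be reproduced outside portmap. The following
is only a minimal sketch, assuming Linux and a writable /tmp; it is not glibc
code. The helper name count_emfile_wakeups and the socket path are hypothetical.
It queues one connection on a Unix listening socket, lowers the soft
RLIMIT_NOFILE so that accept() must fail with EMFILE, and counts how many
consecutive poll()+accept() cycles report the same pending connection.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical demo helper (not glibc code): counts how many of `rounds`
   consecutive poll()+accept() cycles end in EMFILE while one connection
   is pending.  Returns -1 if the setup fails. */
int count_emfile_wakeups(int rounds)
{
    struct sockaddr_un addr;
    struct rlimit saved, tight;
    int listener, client, hits = 0;

    assert(rounds >= 0);

    /* Assumed scratch path for the listening socket. */
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof addr.sun_path,
             "/tmp/emfile_demo_%ld.sock", (long)getpid());
    unlink(addr.sun_path);

    listener = socket(AF_UNIX, SOCK_STREAM, 0);
    client = socket(AF_UNIX, SOCK_STREAM, 0);
    if (listener < 0 || client < 0
        || bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0
        || listen(listener, 8) < 0
        || connect(client, (struct sockaddr *)&addr, sizeof addr) < 0
        || getrlimit(RLIMIT_NOFILE, &saved) < 0) {
        hits = -1;
        goto out;
    }

    /* With a soft limit of 1 only fd 0 could be handed out, and fd 0 is
       already in use, so accept() cannot allocate a descriptor. */
    tight = saved;
    tight.rlim_cur = 1;
    if (setrlimit(RLIMIT_NOFILE, &tight) < 0) {
        hits = -1;
        goto out;
    }

    for (int i = 0; i < rounds; i++) {
        struct pollfd pfd = { .fd = listener, .events = POLLIN };
        int fd;

        /* The listener stays readable because nothing was dequeued. */
        if (poll(&pfd, 1, 1000) != 1 || !(pfd.revents & POLLIN))
            break;
        fd = accept(listener, NULL, NULL);
        if (fd >= 0) {          /* unexpected success: stop counting */
            close(fd);
            break;
        }
        if (errno != EMFILE)
            break;
        hits++;
    }
    setrlimit(RLIMIT_NOFILE, &saved);   /* restore the original soft limit */
out:
    if (listener >= 0)
        close(listener);
    if (client >= 0)
        close(client);
    unlink(addr.sun_path);
    return hits;
}
```

Every round is expected to hit EMFILE without the connection ever leaving the
queue, which is the condition that makes an unthrottled svc_run() loop spin.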


The problem can be solved by sleeping for a very short time whenever accept()
returns EMFILE. The attached temporary patch does that and brings CPU usage
down to a very low level. The patch fixes the problem for TCP, UDP and Unix
socket connections.

Comment 1 law 2012-11-28 21:17:12 UTC
Fixed via:
commit 14bc93a967e62abf8cf2704725b6f76619399f83
Author: Jeff Law <law@redhat.com>
Date:   Wed Nov 28 14:12:28 2012 -0700

            [BZ #14889]
            * sunrpc/rpc/svc.h (__svc_accept_failed): New prototype.
            * sunrpc/svc.c: Include time.h.
            (__svc_accept_failed): New function.
            * sunrpc/svc_tcp.c (rendezvous_request): If the accept fails for
            any reason other than EINTR, call __svc_accept_failed.
            * sunrpc/svc_udp.c (svcudp_recv): Similarly.
            * sunrpc/svc_unix.c (rendezvous_request): Similarly.
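
The approach in the changelog above (retry on EINTR, otherwise call a failure
hook that throttles the loop) can be sketched roughly as follows. This is an
illustration only, not the committed code: the names backoff_if_emfile and
accept_with_backoff are hypothetical, and the 50 ms delay is an assumed value.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical helper: pause briefly (assumed 50 ms) when `err` is EMFILE
   so the caller's poll() loop cannot spin.  Returns 1 if it slept, else 0. */
int backoff_if_emfile(int err)
{
    if (err == EMFILE) {
        struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };
        nanosleep(&delay, NULL);
        return 1;
    }
    return 0;
}

/* Hypothetical wrapper: retry accept() on EINTR; on any other failure run
   the backoff hook and report the error to the caller with errno intact. */
int accept_with_backoff(int listener)
{
    int fd;

    assert(listener >= 0);
    do {
        fd = accept(listener, NULL, NULL);
    } while (fd < 0 && errno == EINTR);

    if (fd < 0) {
        int saved = errno;
        backoff_if_emfile(saved);
        errno = saved;          /* nanosleep() may have changed errno */
    }
    return fd;
}
```

Sleeping only on EMFILE keeps other error paths unchanged, while a process
stuck at its descriptor limit wakes up a bounded number of times per second
instead of continuously.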