This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug nscd/24949] New: High CPU load by nscd service
- From: "oleg_shishlyannikov at epam dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Thu, 29 Aug 2019 16:07:02 +0000
- Subject: [Bug nscd/24949] New: High CPU load by nscd service
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=24949
Bug ID: 24949
Summary: High CPU load by nscd service
Product: glibc
Version: 2.17
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: nscd
Assignee: unassigned at sourceware dot org
Reporter: oleg_shishlyannikov at epam dot com
CC: drepper.fsp at gmail dot com
Target Milestone: ---
Created attachment 11967
--> https://sourceware.org/bugzilla/attachment.cgi?id=11967&action=edit
Proposed fix
Hi,
Below are some technical details and a proposed patch that fixes an nscd bug
recently observed by one of our users:
Environment: nscd on CentOS and RHEL 7, plus SELinux enabled
(glibc-2.17-260.el7 and later).
This combination very quickly starts causing high CPU load (100% and more).
Analysis shows the following problem in nscd:
The global array of structures called dbs ends up holding a linked list with
a cycle. Each entry has a "traced_files *" field (a singly linked list), and
the while loop in the "prune_cache" function that walks this list never
reaches its end, so it eats CPU time (up to 100% or more).
What is the root cause?
File nss/nsswitch.c has functions which are responsible for loading shared
objects and calling their init function.
If the init function is invoked two or more times for one shared object,
we get a cyclic list of traced files.
The "dlopen" function returns the same handle when a shared object is loaded
a second time (libnss_files.so in our case).
But nsswitch invokes the init function after each load of the shared object.
So the pointers to the "traced_file" structures defined in
"nss/nss_files/files-init.c" are queued on "traced_files" twice as well.
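To illustrate the mechanism, here is a minimal sketch of how queuing the same
statically allocated record twice can close a singly linked list into a cycle.
It assumes that registration simply pushes the record onto the head of the
list; the names traced_node, push_front and has_cycle are hypothetical and are
not taken from the glibc sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a traced-file record; not the glibc type. */
struct traced_node {
    const char *fname;
    struct traced_node *next;
};

/* Push a node onto the head of the list without any duplicate check
   (assumed behaviour, for illustration only). */
static void push_front(struct traced_node **head, struct traced_node *node)
{
    assert(head != NULL && node != NULL);
    node->next = *head;
    *head = node;
}

/* Floyd's tortoise-and-hare check: true if the list never reaches NULL. */
static bool has_cycle(const struct traced_node *head)
{
    const struct traced_node *slow = head;
    const struct traced_node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return true;
    }
    return false;
}
```

If records A and B are pushed once each, the list is B -> A -> NULL. Pushing A
a second time sets A->next to B while B->next still points to A, so a loop
that walks the list until it sees NULL never finishes.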
Attached is nscd.gdb.log which shows the issue.
# service nscd restart && nice gdb -p $(pidof nscd) -x nscd.gdb && echo "Done";
File nscd.gdb:
set pagination off
set logging file nscd.gdb.log
set logging on
handle SIGPIPE nostop noprint pass
set follow-fork-mode child
set detach-on-fork off
watch 'nss_files/files-init.c'::resolv_traced_file->file->next
watch 'nss_files/files-init.c'::hst_traced_file->file->next
print dbs
print 'nss_files/files-init.c'::resolv_traced_file->file
print 'nss_files/files-init.c'::hst_traced_file->file
continue
print dbs
print 'nss_files/files-init.c'::resolv_traced_file->file
print 'nss_files/files-init.c'::hst_traced_file->file
info threads
list
backtrace
quit
To avoid this situation, I propose adding a check for duplicates to the
"register_traced_file" function. The patch is attached.
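For readers without the attachment, the idea of a duplicate check can be
sketched as follows. This is not the attached patch, only an illustration
under the assumption that records are queued at the head of a singly linked
list; traced_node, push_front_unique and list_length are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a traced-file record; not the glibc type. */
struct traced_node {
    const char *fname;
    struct traced_node *next;
};

/* Queue a node only if it is not already on the list.
   Returns true if the node was queued, false if it was a duplicate. */
static bool push_front_unique(struct traced_node **head,
                              struct traced_node *node)
{
    assert(head != NULL && node != NULL);

    /* Walk the (still acyclic) list and compare by pointer identity. */
    for (const struct traced_node *p = *head; p != NULL; p = p->next)
        if (p == node)
            return false;

    node->next = *head;
    *head = node;
    return true;
}

/* Count the nodes; terminates only because the list has no cycle. */
static size_t list_length(const struct traced_node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        ++n;
    return n;
}
```

With this check, a second init call for the same shared object leaves the
list unchanged. The extra walk is linear in the list length, which should be
cheap as long as only a handful of files are traced per database.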