This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


[Bug nscd/24949] New: High CPU load by nscd service

From: "oleg_shishlyannikov at epam dot com" <sourceware-bugzilla at sourceware dot org>
To: glibc-bugs at sourceware dot org
Date: Thu, 29 Aug 2019 16:07:02 +0000
Subject: [Bug nscd/24949] New: High CPU load by nscd service
Auto-submitted: auto-generated

https://sourceware.org/bugzilla/show_bug.cgi?id=24949

            Bug ID: 24949
           Summary: High CPU load by nscd service
           Product: glibc
           Version: 2.17
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: nscd
          Assignee: unassigned at sourceware dot org
          Reporter: oleg_shishlyannikov at epam dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Created attachment 11967
  --> https://sourceware.org/bugzilla/attachment.cgi?id=11967&action=edit
Proposed fix

Hi, 

Below are some technical details and a proposed patch that fixes an nscd bug
recently observed by one of our users:

Environment: nscd on CentOS and RHEL 7 (glibc-2.17-260.el7 and later) with
SELinux enabled.
With this combination, nscd very quickly starts causing high CPU load (100%
and more).

Analysis shows the following problem in nscd:

  A linked list in the nscd global structure array "dbs" ends up containing a
cycle.
  The affected field is "traced_files" (a singly linked list). The while loop
in the "prune_cache" function walks this list, and because of the cycle the
walk keeps going around, eating CPU time (up to 100% or more).

What is the root cause?

  The file nss/nsswitch.c contains the functions responsible for loading
shared objects and calling their init function.
  If the init function is invoked two or more times for the same shared
object, the list of traced files ends up with a cycle.
  The "dlopen" function returns the same handle when the shared object is
loaded a second time (libnss_files.so in our case).
  However, nsswitch invokes the init function after every load of the shared
object. As a result, the pointers to the "traced_file" structures defined in
"nss/nss_files/files-init.c" are queued onto "traced_files" twice.

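  The effect of queuing the same structure twice can be reproduced outside
nscd with a small model. The sketch below is not glibc code: "struct
traced_node", "push_front", "has_cycle" and "double_init_creates_cycle" are
hypothetical names, the file names are placeholders, and it assumes that
entries are queued by unconditional insertion at the head of the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a traced-file entry; only the link matters.  */
struct traced_node
{
  const char *fname;
  struct traced_node *next;
};

/* Push NODE at the head of *HEAD without checking for duplicates
   (assumed queuing behaviour, for illustration only).  */
void
push_front (struct traced_node **head, struct traced_node *node)
{
  assert (head != NULL && node != NULL);
  node->next = *head;
  *head = node;
}

/* Floyd's tortoise-and-hare check: true if the list contains a cycle.  */
bool
has_cycle (const struct traced_node *head)
{
  const struct traced_node *slow = head;
  const struct traced_node *fast = head;

  while (fast != NULL && fast->next != NULL)
    {
      slow = slow->next;
      fast = fast->next->next;
      if (slow == fast)
        return true;
    }
  return false;
}

/* Simulate an init function that runs twice for one shared object.
   Returns true if the list is fine after the first run and cyclic
   after the second.  */
bool
double_init_creates_cycle (void)
{
  struct traced_node resolv = { "resolv (placeholder)", NULL };
  struct traced_node hosts = { "hosts (placeholder)", NULL };
  struct traced_node *head = NULL;

  /* First init: hosts -> resolv -> NULL.  */
  push_front (&head, &resolv);
  push_front (&head, &hosts);
  if (has_cycle (head))
    return false;

  /* Second init re-queues resolv: resolv -> hosts -> resolv -> ...  */
  push_front (&head, &resolv);
  return has_cycle (head);
}
```

  Any loop that walks such a list until it reaches NULL never finishes, which
matches the CPU usage pattern described above.
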
  Attached is nscd.gdb.log, which shows the issue. It was produced with:

  # service nscd restart && nice gdb -p $(pidof nscd) -x nscd.gdb && echo "Done";

  File nscd.gdb:

  set pagination off
  set logging file nscd.gdb.log
  set logging on
  handle SIGPIPE nostop noprint pass
  set follow-fork-mode child
  set detach-on-fork off
  watch 'nss_files/files-init.c'::resolv_traced_file->file->next
  watch 'nss_files/files-init.c'::hst_traced_file->file->next
  print dbs
  print 'nss_files/files-init.c'::resolv_traced_file->file
  print 'nss_files/files-init.c'::hst_traced_file->file
  continue
  print dbs
  print 'nss_files/files-init.c'::resolv_traced_file->file
  print 'nss_files/files-init.c'::hst_traced_file->file
  info threads
  list
  backtrace
  quit

To avoid this situation, I propose adding a check for duplicates to the
"register_traced_file" function. A patch is attached.

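For readers without access to the attachment, the idea of a duplicate check
can be sketched as follows. This is an illustration only and not the attached
patch: "struct traced_node", "is_registered", "register_once" and
"list_length" are hypothetical names, and the real "register_traced_file"
does more than link the entry into the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a traced-file entry.  */
struct traced_node
{
  const char *fname;
  struct traced_node *next;
};

/* Return true if NODE is already linked into the list starting at HEAD.
   Entries are compared by address, since the same static structure is
   what gets registered again.  */
bool
is_registered (const struct traced_node *head, const struct traced_node *node)
{
  for (const struct traced_node *p = head; p != NULL; p = p->next)
    if (p == node)
      return true;
  return false;
}

/* Queue NODE at the head of *HEAD unless it is already present.
   Returns true if the node was inserted, false if it was a duplicate.  */
bool
register_once (struct traced_node **head, struct traced_node *node)
{
  assert (head != NULL && node != NULL);
  if (is_registered (*head, node))
    return false;
  node->next = *head;
  *head = node;
  return true;
}

/* Count the entries; terminates only if the list has no cycle.  */
size_t
list_length (const struct traced_node *head)
{
  size_t n = 0;
  for (; head != NULL; head = head->next)
    ++n;
  return n;
}
```

With a check of this kind, a second run of the init function leaves the list
unchanged, so it stays NULL-terminated. The lookup is linear in the list
length, which should be acceptable for the handful of files being traced.
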

