Bug 17764 - Excessive seeking when reading debug data pegs the CPU @100%
Summary: Excessive seeking when reading debug data pegs the CPU @100%
Status: RESOLVED DUPLICATE of bug 17677
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb (show other bugs)
Version: 7.8
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-27 15:45 UTC by Sam Varshavchik
Modified: 2014-12-30 16:27 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Profiled gdb run (201.17 KB, text/plain)
2014-12-27 18:27 UTC, Sam Varshavchik
Details
Profiled gdb run after applying 6a8939044781191d685e0aaeb78fe34a97ee3d55 (201.08 KB, text/plain)
2014-12-28 00:08 UTC, Sam Varshavchik
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sam Varshavchik 2014-12-27 15:45:55 UTC
GNU gdb (GDB) Fedora 7.8.1-30.fc21 spins at 100% cpu for over a minute, trying to swallow a shared library:

Attaching to program: /proc/7919/exe, process 7919
Reading symbols from /home/mrsam/src/libcxxw-work/.libs/libcxxw.so.0...

[ ... about a minute later ... ] done.
Loaded symbols for /home/mrsam/src/libcxxw-work/.libs/libcxxw.so.0

Eventually gdb finishes loading, and everything's fine past that point. Looking at the strace, for every 4096 bytes worth of, what's apparently the debug data, gdb then proceed and seek to the same file position, exactly 256 times before it reads the next 4096 bytes.

Seems like there 'ought to be a way to make this go faster.

It's a fairly big library, so it's understandable that it will take some time to swallow this:

[mrsam@octopus libcxxw-work]$ size .libs/libcxxw.so.0.0.0
   text	   data	    bss	    dec	    hex	filename
14018479	 411936	   3608	14434023	 dc3ee7	.libs/libcxxw.so.0.0.0
[mrsam@octopus libcxxw-work]$ ls -l .libs/libcxxw.so.0.0.0
-rwxr-xr-x 1 mrsam mrsam 60242156 Dec 27 10:27 .libs/libcxxw.so.0.0.0

But when gdb tries to read this, it's burning up the CPU, without much cause, like this:

[256 seeks to the same file position total ]

lseek(8, 8781824, SEEK_SET)             = 8781824
lseek(8, 8781824, SEEK_SET)             = 8781824
lseek(8, 8781824, SEEK_SET)             = 8781824
read(8, "\377%B\350 some boring debug data"..., 4096) = 4096
lseek(8, 8785920, SEEK_SET)             = 8785920
lseek(8, 8785920, SEEK_SET)             = 8785920
lseek(8, 8785920, SEEK_SET)             = 8785920

[ ... 256 more seeks to another file position, total ... ]

lseek(8, 8785920, SEEK_SET)             = 8785920
lseek(8, 8785920, SEEK_SET)             = 8785920
read(8, "\377%B\340u more boring debug data"..., 4096) = 4096
lseek(8, 8790016, SEEK_SET)             = 8790016
lseek(8, 8790016, SEEK_SET)             = 8790016
lseek(8, 8790016, SEEK_SET)             = 8790016

[ another 256 seeks, and like that, for another minute or so... ]
Comment 1 Jan Kratochvil 2014-12-27 15:52:56 UTC
Try to run /usr/bin/gdb-add-index on your build before benchmarking anything.  Performance issues in files without .gdb_index are in fact unsupported/unsupportable.  How to make it automatic is another question, gold can do that on each link but Fedora is still using GNU ld.

If it is slow even with .gdb_index (from gdb-add-index) then I need rather perf or gprof output.  Those lseek()s are mostly harmless, that is a bug in glibc that each fseek() needlessly propagates as lseek().
Comment 2 dje 2014-12-27 16:21:44 UTC
Yeah.  All those seeks are annoying, but are they really the main cause of the slow down?

Also, I wouldn't say the non-.gdb_index case is not supportable.
It *is* going to be slower, but if there are fixable perf issues we may fix them
(emphasis on "may").

If the cpu is pegged at 100% and in this case debug info reading is cpu bound and not disk bound, how big is the shared library?
If it is relatively big, .gdb_index is definitely the way to go if you want a speedup.
Comment 3 Sam Varshavchik 2014-12-27 17:08:59 UTC
The .so's total size is 60 megabytes. readelf shows all the debug data starting at about the 14 megabyte mark, so, say around 46 megs of debug data, give or take a meg.

I monitored the virtual memory size of the gdb process itself, while it was spinning. gdb was not allocating an appreciable amount of memory.

gdb-add-index grew the .so to 70 megs in size, but made no visible difference -- gdb still pegged the CPU for over a minute at startup.

One thing I've noticed is that gdb seems to be rereading the same debug data over and over again.

If I grep just the seek offsets, the remove the consecutive dupes only, then sort the deduped list again, I still find that the gdb process often re-seeks back to the same offset, something like this:

grep lseek | awk ' { print $2 } ' | uniq | sort | uniq -c

Partial results from this:

    926 8769536,
    926 8773632,
    926 8777728,
    925 8781824,
    888 8785920,
    632 8790016,
    376 8794112,
    120 8798208,

So, it seems like after reading from seek offset 8769536, for example, the gdb process then seeked somewhere else, but at some point went back and reread it again. Almost a thousand times. This is with the gdb-add-index data in the .so; but, again, that took about the same amount of time as without it.
Comment 4 dje 2014-12-27 17:43:10 UTC
Huh.
46 megs of debug info is not that much (my monster benchmark has ~1.5G of debug info, with a 350M .gdb_index section, and "gdb foo" takes 13 seconds to get to the command prompt, 12 of which is spent reading the ELF symbol table, and 10 of that is spent demangling ELF symbols - which is to say ~1 second is spent on debug info).
Without .gdb_index it takes about 55 seconds (this is from memory so I may be off by a little, but not by much - I look at this issue a lot).

Plus gdb doesn't really "read" debug info in the .gdb_index case.
It's basically just an mmap() plus a bit of time spent processing the address table.
So it's interesting that .gdb_index doesn't make a difference.

It would be interesting to see gprof data for gdb.

How easy would it to recompile gdb with '-O2 -pg',
write a small script to exercise the bug,
run gdb --batch -x the-script,
and then send us the output of "gprof gdb gmon.out"?
To recompile gdb, one could just cd into the gdb directory, rm *.o, and do "make CFLAGS='-O2 -pg'".

Another idea, if it's taking a minute (so you've got time to do this), is to to attach to the running gdb from a parent gdb, then do repeated "bt, continue, ctrl-c", say every 5 seconds.  That's a quick hack way of collecting some data.
Comment 5 Sam Varshavchik 2014-12-27 18:27:44 UTC
Created attachment 8031 [details]
Profiled gdb run

Without having looked at gdb's internals, just based on the symbols-versus-call count I see in this profiling output, C++ demangling appears to be the problem. I'm going to guess that the demangler is hammering on the symbol table.
Comment 6 Jan Kratochvil 2014-12-27 18:44:03 UTC
From the Comment 5 it seems to me a backport of PR binutils/17677 should fix it.
Could you test that fix or should I prepare a new Fedora GDB build with it?
It would be good to backport it anyway.
Comment 7 Jan Kratochvil 2014-12-27 18:45:08 UTC
Testing current upstream trunk may be easier:
  https://www.gnu.org/software/gdb/current/
(although it may be slower due to some other pending patch)
Comment 8 Sam Varshavchik 2014-12-28 00:08:37 UTC
Created attachment 8032 [details]
Profiled gdb run after applying 6a8939044781191d685e0aaeb78fe34a97ee3d55

That's the bug. After applying 6a8939044781191d685e0aaeb78fe34a97ee3d55, which applies cleanly to the rpm, gdb racks up only 6 seconds of 100% CPU at startup, on the same indexed .so, versus more than a minute.
Comment 9 Jan Kratochvil 2014-12-28 14:40:29 UTC
Thanks, new Fedora build is at:
  https://admin.fedoraproject.org/updates/gdb-7.8.1-36.fc21

*** This bug has been marked as a duplicate of bug 17677 ***
Comment 10 H.J. Lu 2014-12-29 15:53:26 UTC
(In reply to Sam Varshavchik from comment #8)
> Created attachment 8032 [details]
> Profiled gdb run after applying 6a8939044781191d685e0aaeb78fe34a97ee3d55
> 
> That's the bug. After applying 6a8939044781191d685e0aaeb78fe34a97ee3d55,
> which applies cleanly to the rpm, gdb racks up only 6 seconds of 100% CPU at
> startup, on the same indexed .so, versus more than a minute.

Can you do me a favor to compare gdb startup time on the same binary
against gdb 7.7 in Fedora 20?  Thanks.
Comment 11 Sam Varshavchik 2014-12-29 23:43:52 UTC
I no longer have F20 machines, unfortunately.
Comment 12 H.J. Lu 2014-12-29 23:53:05 UTC
(In reply to Sam Varshavchik from comment #11)
> I no longer have F20 machines, unfortunately.

Can you try:

# cd /tmp
# wget "https://dl.fedoraproject.org/pub/fedora/linux/updates/20/x86_64/gdb-7.7.1-21.fc20.x86_64.rpm"
# rpm2cpio gdb-7.7.1-21.fc20.x86_64.rpm | cpio -ivdm ./usr/bin/gdb

and run ./usr/bin/gdb?
Comment 13 Sam Varshavchik 2014-12-30 01:23:05 UTC
Looks ok.. Takes about three seconds to process the same shared library.
Comment 14 H.J. Lu 2014-12-30 16:27:46 UTC
(In reply to Sam Varshavchik from comment #13)
> Looks ok.. Takes about three seconds to process the same shared library.

Thanks.