This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
Doing some more careful performance analysis with libunwind, I'm finding that the dl_iterate_phdr() call needed to verify that the phdr-list didn't change is rather expensive. Specifically, the time needed to initialize an "unwind cursor" (via unw_init_local()) is as follows:

    without dl_iterate_phdr() callback (unsafe):  140 ns
    with dl_iterate_phdr(), without -lpthread:    200 ns
    with dl_iterate_phdr(), with -lpthread:       300 ns

This is rather more than I expected, and a slow-down of more than a factor of 2 for multi-threaded apps is a bit more than I'm willing to bear, since it could really affect the usability of libunwind for things such as allocation tracking or stack-trace-based sampling.

The profile for the "without -lpthread" case looks like this:

    % time   self  cumul  calls  self/call  tot/call  name
     60.44  13.05  13.05   101M       129n      204n  _ULia64_init_local
     17.19   3.71  16.76  99.6M      37.3n     66.7n  dl_iterate_phdr
      4.65   1.00  17.76   101M      9.97n     9.97n  rtld_lock_default_lock_recursive
      4.64   1.00  18.77  99.0M      10.1n     10.1n  rtld_lock_default_unlock_recursive

The profile for the "with -lpthread" case looks like this (this was measured on a different machine, so the total time of 223 ns is not comparable to the 300 ns mentioned above; the relative times are fine, though):

    % time   self  cumul  calls  self/call  tot/call  name
     47.93  11.25  11.25  99.6M       113n      223n  _ULia64_init_local
     18.35   4.31  15.56   100M      43.0n     43.0n  pthread_mutex_lock
     11.81   2.77  18.33   100M      27.7n     27.7n  __pthread_mutex_unlock_usercnt
     11.65   2.73  21.06   100M      27.3n      103n  __GI___dl_iterate_phdr

For brevity, I didn't include the call-graphs, but they are pretty simple: all calls to dl_iterate_phdr() are indirectly due to the cache-validation done by _ULia64_init_local(), and almost all lock-related calls are due to dl_iterate_phdr().

I suppose I could add a libunwind-hack to disable cache-validation, but that seems like a step backward since it would make caching unsafe again.
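The per-call overhead described above can be approximated in isolation with a small micro-benchmark. The following is only a sketch under stated assumptions, not the code used for the measurements in this message: the names stop_at_first() and avg_ns_per_call() are hypothetical, the callback does none of libunwind's actual validation work, and the timing uses clock_gettime(), so the numbers will differ from the profiles shown here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a cache-validation callback: it does no
   real checking, only counts how often it is invoked, and returns
   nonzero so that dl_iterate_phdr() stops after the first object. */
static int stop_at_first(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)info;
    (void)size;
    *(unsigned long *)data += 1;
    return 1;   /* a nonzero return value ends the iteration */
}

/* Average wall-clock cost, in nanoseconds, of one dl_iterate_phdr()
   call that stops at the first object. This mostly captures the
   locking and call overhead, not the walk over all loaded objects. */
static double avg_ns_per_call(unsigned long iterations)
{
    struct timespec start, end;
    unsigned long hits = 0;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (unsigned long i = 0; i < iterations; i++)
        dl_iterate_phdr(stop_at_first, &hits);
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* The main program is always reported, so each call hits once. */
    assert(hits == iterations);

    double ns = (double)(end.tv_sec - start.tv_sec) * 1e9
              + (double)(end.tv_nsec - start.tv_nsec);
    return ns / (double)iterations;
}
```

Calling avg_ns_per_call() with a large iteration count from a small test program, built once with -lpthread and once without, would be one way to compare the two locking paths. On recent glibc releases, where libpthread has been merged into libc, the two builds may no longer differ.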
In case it matters, the first profile was obtained with libc v2.3.2 and the second profile was obtained with the CVS libc (as of a few days ago).

Can this be improved?

--david