This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Possible War Story for Wiki
- From: William Cohen <wcohen at redhat dot com>
- To: SystemTAP <systemtap at sources dot redhat dot com>
- Date: Tue, 30 Jan 2007 12:56:41 -0500
- Subject: Possible War Story for Wiki
I am curious why Firefox takes so long to start up. The disc drive
seems to be very active based on the sound of the drive and the drive
activity light. There are a huge number of files being opened, some of
them multiple times. Why does firefox some of the jar files 20 and 40 times?
There is a small example SystemTap script, iotime.stp, that
accumulates the time spent in the reads and writes systemcalls and
prints out the file name and the accumlated time each time the file is
closed. I ran firefox on an relatively idle ThinkPad T41 laptop running fc6
with:
stap iotime.stp -c "firefox" >& firefox20070129.log
Once firefox was up I immediately selected the exit menu item to
shutdown firefox. The SystemTap script exits when firefox exits. An
example entry in the log file look like the following:
739409 3767 (firefox-bin) access /usr/lib/libpango-1.0.so.0 read: 512 write: 0
739419 3767 (firefox-bin) iotime /usr/lib/libpango-1.0.so.0 time: 12
The first item on the line is a timestamp in microseconds, the second
is the pid, and the third is the executable. The forth item is the
information type: either "access" or "iotime". The first line is
"access" indicates the number of bytes read or written when the file
is closed. The "iotime" line is the accumumated wall-clock time in
microseconds spent in read and write system calls. It is pretty easy
to use awk and other tools to extract interesting features from the
log. There are a lot of files (484) opened and closed by firefox-bin,
the main program for firefox:
$ grep iotime firefox20070129.log |grep "(firefox-bin)" | grep iotime|wc
484 3388 43308
There are a lot of different files (371) being touched:
$ grep iotime firefox20070129.log |grep "(firefox-bin)"| awk '{print
$5}'|sort|uniq -c > files20070129.txt
$ wc files20070129.txt
371 742 19874 files20070129.txt
A break down of the types of files being opened:
Shared libraries ".so" 139
Manifest files ".manifest" 44
Jar files ".jar" 8
Xml files ".xml" 2
User files 24
One of the things that I didn't understand is why firefox opens some
of the jar files so many time:
$ sort -nr files20070129.txt |more
40 /usr/lib/firefox-1.5.0.9/chrome/classic.jar
21 /usr/lib/firefox-1.5.0.9/chrome/toolkit.jar
19 /usr/lib/firefox-1.5.0.9/chrome/en-US.jar
12 /home/wcohen/.mozilla/firefox/2feaxu66.default/extensions.ini
5 /usr/lib/firefox-1.5.0.9/extensions/{fda00e13-8c62-4f63-9d19-d168115b11c
a}/chrome/beagle.jar
5 /usr/lib/firefox-1.5.0.9/chrome/browser.jar
3 /usr/lib/firefox-1.5.0.9/chrome/icons/default/default.xpm
...
Sorting based on the accumulated wall-clock times shows that there are
many files where 10's of milliseconds are spent in the systemcall:
$ grep iotime firefox20070129.log |grep "(firefox-bin)" | grep iotime |sort
-nrk 7|more
10849905 3767 (firefox-bin) iotime /home/wcohen/.mozilla/firefox/2feaxu66.defaul
t/XUL.mfasl time: 117808
5057162 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/components/accessibil
ity.xpt time: 64857
4627460 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/chrome/classic.jar ti
me: 39811
5105438 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/chrome/classic.jar ti
me: 36892
5984490 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/chrome/en-US.jar time
: 26664
10840677 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/chrome/classic.jar t
ime: 24797
4416512 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/components/nsDefaultC
LH.js time: 23835
10840556 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/chrome/en-US.jar tim
e: 23089
5697462 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/chrome/browser.jar ti
me: 23066
5578128 3767 (firefox-bin) iotime /usr/lib/libnssckbi.so time: 22914
10840640 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/chrome/browser.jar t
ime: 22358
10677359 3767 (firefox-bin) iotime /usr/lib/firefox-1.5.0.9/components/appstartu
p.xpt time: 21492
/*
* Copyright (C) 2006 Daniel Berrange, Red Hat Inc.
* Copyright (C) 2007 Will Cohen, Red Hat, Inc.
*
* This copyrighted material is made available to anyone wishing to use,
* modify, copy, or redistribute it subject to the terms and conditions
* of the GNU General Public License v.2.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software Foundation,
* Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*
* Print out the amount of time spent in the read and write systemcall
* when a process closes each file is closed. Note that the systemtap
* script needs to be running before the open operations occur for
* the script to record data.
*
* This script could be used to to find out which files are slow to load
* on a machine. e.g.
*
* stap iotime.stp -c 'firefox'
*
* Output format is:
* timestamp pid (executabable) info_type path ...
*
* 200283135 2573 (cupsd) access /etc/printcap read: 0 write: 7063
* 200283143 2573 (cupsd) iotime /etc/printcap time: 69
*
*/
global start
global entry_io
global fd_io
global time_io
function timestamp:long() {
return gettimeofday_us() - start
}
function proc:string() {
return sprintf("%d (%s)", pid(), execname())
}
probe begin {
start = gettimeofday_us()
}
global filenames
global filehandles
global fileread
global filewrite
probe syscall.open {
filenames[pid()] = user_string($filename)
}
probe syscall.open.return {
if ($return != -1) {
filehandles[pid(), $return] = filenames[pid()]
fileread[pid(), $return] = 0
filewrite[pid(), $return] = 0
} else {
printf("%d %s access %s fail\n", timestamp(), proc(), filenames[pid()])
}
delete filenames[pid()]
}
probe syscall.read {
if ($count > 0) {
fileread[pid(), $fd] += $count
}
t = gettimeofday_us(); p = pid()
entry_io[p] = t
fd_io[p] = $fd
}
probe syscall.read.return {
t = gettimeofday_us(); p = pid()
fd = fd_io[p]
time_io[p,fd] <<< t - entry_io[p]
}
probe syscall.write {
if ($count > 0) {
filewrite[pid(), $fd] += $count
}
t = gettimeofday_us(); p = pid()
entry_io[p] = t
fd_io[p] = $fd
}
probe syscall.write.return {
t = gettimeofday_us(); p = pid()
fd = fd_io[p]
time_io[p,fd] <<< t - entry_io[p]
}
probe syscall.close {
if (filehandles[pid(), $fd] != "") {
printf("%d %s access %s read: %d write: %d\n", timestamp(), proc(),
filehandles[pid(), $fd], fileread[pid(), $fd], filewrite[pid(), $fd])
if (@count(time_io[pid(), $fd]))
printf("%d %s iotime %s time: %d\n", timestamp(), proc(),
filehandles[pid(), $fd], @sum(time_io[pid(), $fd]))
}
delete fileread[pid(), $fd]
delete filewrite[pid(), $fd]
delete filehandles[pid(), $fd]
delete fd_io[pid()]
delete entry_io[pid()]
delete time_io[pid(),$fd]
}