Monitoring the Linux Boot Process

Problem

Would like to run a SystemTap probe to see what the machine is doing when it boots up, similar to Bootchart, but with more flexibility.

Techniques to create a SystemTap probe that runs at boot time were developed for the One Laptop Per Child project, http://people.redhat.com/berrange/systemtap/bootprobe/. However, going through the steps by hand can be tedious. The boottap script used in this wiki entry was created to automate the process of converting a SystemTap script into a kernel module and making a grub entry that can be selected when booting the machine. The boottap script also has an option to remove the instrumentation.
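As a hedged aside, the module-building step that boottap automates corresponds to running stap in build-only mode, along the lines of the command below ("bootprobe" is a placeholder module name, not something this page specifies); the resulting bootprobe.ko is what a generated initscript can then load early in boot with staprun.

stap -p4 -m bootprobe iotime.stp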

A ThinkPad T41 (with a 1.6GHz Pentium M processor) running FC6 and the 2.6.18-1.2869.fc6 kernel was set up for some experiments examining the bootup process. A simple script, iotime.stp, was run to record the amount of wall-clock time spent in read and write system calls. The script also recorded the amount of data that a process read from or wrote to each file. The script was installed using the boottap script:

./boottap -c /boottap15 -e 'sleep 100' -s /home/wcohen/iotime.stp

The "-c" creates the /boottap15 directory with the needed initscript and kernel module. The "-e" is an optional exit condition that shutsdown the instrumentation when the function returns. The "-s" is followed by the SystemTap script. The machine was rebooted and the grub entry for the boottap probe was selected. Once the script exits there is data in /boottap15/data/initprocs.log. This can be copied and examined for further analysis.

http://www.bootchart.org/

Scripts

boottap iotime.stp

Output

455 358 (sh) access /etc/ld.so.cache read: 0 write: 0
529 358 (sh) access /lib/libtermcap.so.2 read: 512 write: 0
533 358 (sh) iotime /lib/libtermcap.so.2 time: 5
586 358 (sh) access /lib/libdl.so.2 read: 512 write: 0
589 358 (sh) iotime /lib/libdl.so.2 time: 4
638 358 (sh) access /lib/libc.so.6 read: 512 write: 0
642 358 (sh) iotime /lib/libc.so.6 time: 3
...

Lessons

One of the things that became clear when looking through the output was that modprobe does a linear search through the 250KB modules.dep file each time it installs a module. The list is unordered, and some searches can require scanning most of the file.

A very simple script, shown below, was written to prepend the dependency entries for the modules currently loaded (the ones listed by lsmod) to the beginning of the file, so that common lookups succeed early in the scan.

#!/bin/sh
#
# A simple little script to prepend the dependency entries for the
# currently loaded modules to the beginning of modules.dep. This is a
# simple attempt to speed up modprobe on bootup. The reordered list
# is written to standard output.
#

MODULES_DEP=/lib/modules/`uname -r`/modules.dep

# Find the list of modules currently loaded. lsmod reports module
# names with underscores, but the file names in modules.dep may use
# hyphens instead; translating "_" to "." makes grep treat each one
# as a single-character wildcard that matches either.
used_modules=`/sbin/lsmod | awk '{print $1}' | tr "_" "."`

# Emit the dependency lines for the loaded modules first...
for m in $used_modules; do
        grep "$m\.ko:" $MODULES_DEP
done

# ...followed by the original list.
cat $MODULES_DEP

Installing the revised modules.dep file (the script writes the reordered list to stdout, so it was redirected to a file and copied over the original) reduced the amount of /lib/modules/2.6.18-1.2869.fc6/modules.dep data read (and processed) by modprobe from 28.1MB to 20MB, as computed with a one-line script similar to the one below, which sums the "read:" byte counts (field 7) on the access lines. The wall-clock time spent in read system calls for modules.dep dropped from 8.5 seconds to 6.9 seconds.

grep modules.dep normal-iotime20070126.log |grep access |awk 'BEGIN { rsum = 0 } { rsum += $7 } END { print rsum }'

The savings from this optimization were modest, only a couple of seconds out of nearly a minute of boot time. However, it demonstrates how the Linux boot process can be analyzed, and other SystemTap scripts could be used to examine other problems, e.g. which executables page fault.
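As a sketch of that last suggestion (this was not part of the experiment above, and the probe point and output format are assumptions), a few lines of SystemTap could count page faults per executable during boot:

#!/usr/bin/stap
# Sketch only: count page faults per executable name.
global faults

probe vm.pagefault { faults[execname()]++ }

# On shutdown, print the twenty executables with the most faults.
probe end {
  foreach (name in faults- limit 20)
    printf("%8d %s\n", faults[name], name)
}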

