This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Fwd: Re: Accuracy of disk statistics IO counter]


Here's a "success story" with SystemTap: a developer here was having trouble getting iostat results to match the internal tools he was working on. With the help of a _very_ simple SystemTap script, he was able to determine the nature of the problem and get going very quickly...

Alan
PS. Here's the script I started him with...

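# Histogram of write request sizes (in KB) sent to one SCSI device.
global rqs
global lun, id, channel, host_no

# Select the target device; edit these values to match your disk.
probe begin
{
       host_no = 0
       channel = 0
       id = 7
       lun = 12
}

probe module("*scsi_mod*").function("scsi_dispatch_cmd")
{
       # Count only writes: sc_data_direction 1 is DMA_TO_DEVICE.
       if (1       != $cmd->sc_data_direction) next
       # Skip commands addressed to any other device.
       if (lun     != $cmd->device->lun) next
       if (id      != $cmd->device->id) next
       if (channel != $cmd->device->channel) next
       if (host_no != $cmd->device->host->host_no) next

       # Bucket by request size in whole KB.
       rqs[$cmd->request_bufflen / 1024]++
}

# At shutdown, print "size_in_KB count" pairs in ascending size order.
probe end
{
       foreach (rec+ in rqs)
             printf("%d %d\n", rec, rqs[rec])
}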
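
The script prints one "size_in_KB count" pair per line when it exits. As a rough sketch of how that output could be totalled for comparison with iostat, the Python below sums the request counts and the approximate KB transferred. The function name summarize and the sample numbers are hypothetical and not from the original thread, and because the script truncates each size to whole KB, the KB total is only a lower bound.

```python
def summarize(output: str) -> tuple[int, int]:
    """Return (total_requests, approx_total_kb) from 'size_kb count' lines."""
    total_requests = 0
    approx_total_kb = 0
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unrelated lines
        size_kb, count = int(fields[0]), int(fields[1])
        total_requests += count
        approx_total_kb += size_kb * count
    return total_requests, approx_total_kb


if __name__ == "__main__":
    # Made-up sample in the script's output format, not from a real run.
    sample = "4 12\n128 300\n160 6500\n"
    print(summarize(sample))
```

The total request count is the figure to compare against the write count that iostat reports for the same device over the same interval.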
--- Begin Message ---
Specifically, I wrote a 1GB file with a blocksize of 1MB, which would result in 1000 writes at the application level. What I believe then happens is that each write turns into 8 128KB requests to the driver, which should result in 8000 actual writes. Toss in metadata operations and who knows what else and the actual number should be a little higher. What I've seen after repeating the tests a number of times on 2.6.16-27 are numbers ranging from 6800 to 7000 writes, which feels like a big enough difference to at least point out.



Install http://brick.kernel.dk/snaps/blktrace-git-20060723022503.tar.gz
and blktrace your disk for the duration of the test and compare the io
numbers. Requires 2.6.17 or later, though.


I had problems getting blktrace going and posted a note to linux-btrace@vger.kernel.org at Alan Brunelle's suggestion, and he also said he'd take a closer look at it himself. On the other hand, he pointed me to 'stap' and I was able to use it to get the details I was looking for - SystemTap really rocks for this type of analysis! As it turns out, my assumption about driver blocksize was wrong (sorry about that). The size of requests from the driver to the disk is 160KB, not 128KB, so the I/O count was smaller than I had anticipated.
-mark




--- End Message ---
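
The arithmetic in the message above can be checked with a short sketch. It assumes that "1GB" means 1 GiB, that every request is full-sized, and that the reported request size is 160 KB; the function name expected_requests is hypothetical. Under those assumptions the file needs 8192 requests at 128 KB but only 6554 at 160 KB, which sits much closer to the observed 6800 to 7000 writes once metadata operations are added.

```python
KIB = 1024


def expected_requests(file_bytes: int, request_bytes: int) -> int:
    """Number of requests needed to move file_bytes, rounding up."""
    return -(-file_bytes // request_bytes)  # ceiling division on integers


if __name__ == "__main__":
    file_bytes = 1024 * 1024 * KIB  # assumption: "1GB" means 1 GiB
    for size_kib in (128, 160):
        print(size_kib, expected_requests(file_bytes, size_kib * KIB))
```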
