Accuracy of disk statistics I/O counter

Problem

(Transcribed from this mailing list message.)

The problem was a mismatch between the expected and the actual I/O counts and sizes produced by a microbenchmark load.

Scripts

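global rqs
global lun, id, channel, host_no

# Select the SCSI device to trace by its host:channel:id:lun address.
probe begin
{
        host_no = 0
        channel = 0
        id = 7
        lun = 12
}

probe module("*scsi_mod*").function("scsi_dispatch_cmd")
{
        # Only count writes: 1 is DMA_TO_DEVICE in enum dma_data_direction.
        if (1       != $cmd->sc_data_direction) next
        # Skip commands addressed to any other device.
        if (lun     != $cmd->device->lun) next
        if (id      != $cmd->device->id) next
        if (channel != $cmd->device->channel) next
        if (host_no != $cmd->device->host->host_no) next

        # Histogram of request sizes in KiB (integer division rounds down).
        rqs[$cmd->request_bufflen / 1024]++
}

# Print "size_in_KiB count" pairs, sorted by ascending size.
probe end
{
        foreach (rec+ in rqs)
              printf("%d %d\n", rec, rqs[rec])
        exit()
}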
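
The script above prints one "size count" line per distinct request size. To compare that histogram with the totals a benchmark expects, the rows can be summed. The following is only a minimal sketch in Python: the function name summarize_histogram and the sample rows are invented for illustration and are not part of the original message. Because the script divides request_bufflen by 1024 with integer division, the KiB total is a lower bound.

```python
def summarize_histogram(text):
    """Total a 'size_kib count' histogram as printed by the end probe.

    Hypothetical helper, not part of the original script.
    """
    total_requests = 0
    total_kib = 0
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not two non-negative integers.
        if len(fields) != 2 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        size_kib, count = int(fields[0]), int(fields[1])
        total_requests += count
        total_kib += size_kib * count
    return total_requests, total_kib


# Invented sample rows, for illustration only (not measured data).
sample = "4 10\n64 3\n128 1\n"
requests, kib = summarize_histogram(sample)
print(f"{requests} requests, at least {kib} KiB")
```

In practice the text would come from the saved output of the stap run rather than from a literal string.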

Lessons

The standard statistics provided by the kernel do not always correspond to the actual low-level activity. "SystemTAP really rocks for this type of analysis!"
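
To put a kernel-side number next to a trace like this one, the write counters can be read from the block layer statistics. The sketch below assumes that the "standard statistics" in question are those exposed in /proc/diskstats, which the original message does not confirm. The function name write_counters, the device name sdb, and the sample line are all invented for illustration. It relies on the documented layout of that file: major, minor, device name, then the counters, with writes completed as the fifth counter and sectors written as the seventh, where a sector is always 512 bytes.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def write_counters(diskstats_text, device):
    """Return (writes_completed, kib_written) for one device, or None.

    Hypothetical helper; assumes the whole-disk /proc/diskstats layout.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        # major, minor, name, then at least 11 counters for a whole disk
        if len(fields) < 14 or fields[2] != device:
            continue
        writes_completed = int(fields[7])
        sectors_written = int(fields[9])
        return writes_completed, sectors_written * SECTOR_BYTES // 1024
    return None


# Invented sample line, for illustration only (not measured data).
sample = "   8      16 sdb 120 5 2048 300 14 2 720 90 0 350 390\n"
print(write_counters(sample, "sdb"))
```

On a live system the text would be read from /proc/diskstats before and after the benchmark, and the difference between the two readings compared with what the probe counted at the SCSI layer.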



None: WSDiskIOAccuracy (last edited 2008-01-10 19:47:26 by localhost)