iostat for SCSI devices

Problem

Linux doesn't provide I/O statistics for tape devices, so the iostat command has nothing to print for them. Rather than beg your vendor to patch your kernel, let systemtap do some digging.

Scripts

iostat-scsi.stp iostat-scsi-rhel4.stp

global devices, reads, writes

/* data collection: SCSI disk; record the size and direction of each request */
probe module("sd_mod").function("sd_init_command") {
  device=kernel_string($SCpnt->request->rq_disk->disk_name)
  sector_size=$SCpnt->device->sector_size
  nr_sectors=$SCpnt->request->nr_sectors
  devices[device] = 1
  /* bit 0 of the request flags is set for writes; the field is cmd_flags on some kernels */
  if ($SCpnt->request->flags & 1)
    writes[device] <<< nr_sectors * sector_size
  else
    reads[device] <<< nr_sectors * sector_size
}
/* data collection: SCSI tape */
probe module("st").function("st_do_scsi") {
  device=kernel_string($STp->disk->disk_name)
  devices[device] = 1
  if ($direction)
    writes[device] <<< $bytes
  else
    reads[device] <<< $bytes
}


/* reporting: $1 is the first command-line argument, the number of seconds between reports */
global blksize=512
global hdrcount
probe timer.s($1) {
  if ((hdrcount++ % 10) == 0)
    printf("%9s %9s %9s %9s %9s %9s\n",
           "Device:","tps","blk_read/s","blk_wrtn/s","blk_read","blk_wrtn")

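  foreach (dev in devices) {
    rdcount=@count(reads[dev])
    wrcount=@count(writes[dev])
    /* rates are scaled by 100 so that two decimal places survive integer division */
    tps=(rdcount+wrcount)*100/$1
    /* byte totals become blksize-byte blocks; a direction with no samples reports 0 */
    rdblkcount=rdcount ? @sum(reads[dev])/blksize : 0
    wrblkcount=wrcount ? @sum(writes[dev])/blksize : 0
    rdblkrate=rdblkcount*100/$1
    wrblkrate=wrblkcount*100/$1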
    printf("%9s %6d.%02d %6d.%02d %6d.%02d %9d %9d\n",
        dev, tps/100,tps%100,
        rdblkrate/100,rdblkrate%100,
        wrblkrate/100,wrblkrate%100,
        rdblkcount, wrblkcount)
  }
  printf ("\n")
  delete devices
  delete reads
  delete writes
}
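
The arithmetic in the reporting probe can be checked outside the kernel. The following is a minimal sketch in Python (not SystemTap), assuming non-negative byte counts; the function name `summarize`, the dictionary keys, and the sample workload are hypothetical and only mirror the variables in the script above.

```python
BLKSIZE = 512  # same block size the script uses for its blk_* columns


def summarize(read_sizes, write_sizes, interval_s):
    """Reduce per-request byte counts to the integers the report prints.

    read_sizes and write_sizes stand in for the reads[dev] and writes[dev]
    aggregates; interval_s stands in for $1. Rates are scaled by 100.
    """
    ops = len(read_sizes) + len(write_sizes)
    rd_blocks = sum(read_sizes) // BLKSIZE
    wr_blocks = sum(write_sizes) // BLKSIZE
    return {
        "tps_x100": ops * 100 // interval_s,
        "rd_rate_x100": rd_blocks * 100 // interval_s,
        "wr_rate_x100": wr_blocks * 100 // interval_s,
        "rd_blocks": rd_blocks,
        "wr_blocks": wr_blocks,
    }


# Hypothetical interval: 20 writes of 64 KiB each and no reads, over 10 seconds.
row = summarize([], [65536] * 20, 10)
print(row)
```

For this made-up workload the 20 writes total 2560 blocks, so the scaled values are 200 for tps (2.00 per second) and 25600 for the write rate (256.00 blocks per second).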

Output

# stap iostat-scsi.stp 10 # seconds between reports
  Device:       tps blk_read/s blk_wrtn/s  blk_read  blk_wrtn
      st1      0.00      0.00      0.00         0         0
      sda      0.00      0.00      0.00         0         0
      st2     35.00      0.00 143360.00         0    143360

      st1      0.00      0.00      0.00         0         0
      sda      0.00      0.00      0.00         0         0
      st2     23.00      0.00  94208.00         0     94208

      st1      0.00      0.00      0.00         0         0
      sda      3.00    224.00      0.00       224         0
      st2     27.00      0.00 110592.00         0    110592

      st1      0.00      0.00      0.00         0         0
      sda      0.00      0.00      0.00         0         0
      st2     37.00      0.00 151552.00         0    151552

      st1      0.00      0.00      0.00         0         0
      sda      0.00      0.00      0.00         0         0
      st2     47.00      0.00 192512.00         0    192512

      st1      0.00      0.00      0.00         0         0
      sda      0.00      0.00      0.00         0         0
      st2     47.00      0.00 192512.00         0    192512

      st1      0.00      0.00      0.00         0         0
      sda      0.00      0.00      0.00         0         0
      st2     46.00      0.00 188416.00         0    188416

Lessons

The hard part is identifying the probe points and the values to extract at each one. This should be a one-time effort, though, once that part of the code is put into the SCSI tapset.

Formatting the results is a bit clumsy because of the scaled-decimal representation: each rate is multiplied by 100, then printed as value/100 and value%100. The printf code should probably support this better, with a built-in that performs the scaling/modulo calculations internally.
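
To illustrate what such a built-in would have to do, here is a small Python sketch (not SystemTap) of the scaling and modulo steps. The helper names `scaled_rate` and `format_scaled` are hypothetical, and the sketch assumes non-negative values, for which Python's `//` and `%` match the integer division and remainder used in the script.

```python
def scaled_rate(count, interval_s):
    """Return count per second multiplied by 100, using integer math only."""
    return count * 100 // interval_s


def format_scaled(value_x100, width=6):
    """Render a value scaled by 100 as fixed-point text with two decimals.

    Equivalent to the script's "%6d.%02d" applied to value/100 and value%100.
    """
    return "%*d.%02d" % (width, value_x100 // 100, value_x100 % 100)


# 35 operations in a 10-second interval is 3.50 per second.
print(format_scaled(scaled_rate(35, 10)))
```

Keeping the scaling in one helper and the formatting in another is the same split the script makes by hand on every output column.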

Note how the data gathering blocks are separate from each other and from the formatting code. One can add similar monitoring probes for other device drivers without changing the rest of the code. Systemtap scripts may refer to modules that are not actually loaded; those probes are silently skipped. This means that a single big master script could be constructed for a slew of devices, and it would work on all machines of a given generation. (This even includes reusability of the compiled .ko form of the probe!)

A second script, iostat-scsi-rhel4.stp, is also provided above; it works under RHEL 4's older version of Systemtap. That version takes its interval parameter in milliseconds rather than seconds.



None: WSiostatSCSI (last edited 2008-01-10 19:47:35 by localhost)