This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: oprofile's mechanism to get file path information
- From: Mike Mason <mmlnx at us dot ibm dot com>
- To: William Cohen <wcohen at redhat dot com>
- Cc: SystemTAP <systemtap at sources dot redhat dot com>
- Date: Sun, 05 Aug 2007 10:12:03 -0700
- Subject: Re: oprofile's mechanism to get file path information
- References: <46B22C2A.1070002@redhat.com>
Some elements of this might be useful for our purposes, but I can think of a few drawbacks:
- Delaying retrieval of the pathname means we can't filter on the pathname in a probe.
- This approach retrieves the pathname from user space via a system call. We don't have that option in the current design of SystemTap.
- This approach calls d_path() in the lookup_dcookie() system call, but from the context of the task doing the lookup, not from the context of the task that was running when the data was gathered. The task's root directory determines the pathname and, at least theoretically, the two tasks can have different roots, which makes the pathnames different. I don't know how often root directories differ in real life, but it's possible.
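To make the last point concrete, here is a minimal sketch in Python (not kernel code) of how one file gets different pathnames depending on the root directory of the task that resolves it. The helper path_seen_from() and the example paths are made up for illustration; the real d_path() walks dentries and mounts and handles cases this ignores.

```python
import posixpath

def path_seen_from(root, absolute_path):
    """Return absolute_path as seen by a task whose root directory is root.

    Hypothetical helper: it only models the root-relative part of what
    d_path() does. Both arguments are given relative to the real root.
    """
    root = posixpath.normpath(root)
    absolute_path = posixpath.normpath(absolute_path)
    if root == "/":
        return absolute_path
    if absolute_path == root:
        return "/"
    if absolute_path.startswith(root + "/"):
        return absolute_path[len(root):]
    # The file lies outside this task's root, so it has no name there.
    return None

real = "/srv/chroot/usr/lib/libfoo.so"      # made-up example path
print(path_seen_from("/", real))            # task doing the lookup
print(path_seen_from("/srv/chroot", real))  # chrooted task that was sampled
```

The task doing the lookup in the real root sees the full path, while the chrooted task that triggered the sample would have seen /usr/lib/libfoo.so.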
Still, this is an interesting idea and one we should explore for this and other purposes.
Thanks,
Mike
William Cohen wrote:
There was some discussion at today's SystemTap meeting on getting the
file path information for the VFS tapset. The problem is that the
functions for getting the path require some locks, and in general we
want to avoid taking locks when in a probe. It was mentioned that
OProfile has a mechanism to get file path information (dcookies).
OProfile's mechanism may not be entirely appropriate, but it has some
similar issues.
OProfile has an interrupt mechanism that does the actual sampling. On
x86_64 and i386 machines this is done as a non-maskable interrupt (NMI).
As a result what can be done in the interrupt context is very limited.
The interrupt mechanism just records the context that the interrupt
occurred in, the linear address of the program counter, and the
performance counter that caused the sample. OProfile records this
information in per processor circular queues. This is done to eliminate
the need for any locks. The linear address is of limited use because
it is very ephemeral: different programs may map the same shared
library to different locations. OProfile converts the linear
address into a file and an offset into the file. This conversion happens
when the data from the per processor buffers is collected into a
system-wide buffer.
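As a rough illustration of the flow described above, here is a sketch in Python. It only models the data flow (per-processor rings filled at sample time, address-to-file conversion at sync time), not OProfile's actual data structures or the memory-ordering details a real lock-free ring needs. All names (CpuBuffer, Mapping, sync, and so on) and the example addresses are invented for the sketch.

```python
from bisect import bisect_right
from collections import namedtuple

Sample = namedtuple("Sample", "pid address counter")
Mapping = namedtuple("Mapping", "start end file_offset path")

class CpuBuffer:
    """Fixed-size ring with one producer (the interrupt handler) and one
    consumer (the sync step), so neither side has to take a lock."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0   # advanced only by the producer
        self.tail = 0   # advanced only by the consumer
        self.lost = 0

    def record(self, sample):
        nxt = (self.head + 1) % len(self.slots)
        if nxt == self.tail:
            self.lost += 1      # ring is full: drop the sample, never block
            return False
        self.slots[self.head] = sample
        self.head = nxt
        return True

    def drain(self):
        while self.tail != self.head:
            yield self.slots[self.tail]
            self.tail = (self.tail + 1) % len(self.slots)

def to_file_offset(mappings, address):
    """mappings must be sorted by start. Returns (path, offset) or None."""
    i = bisect_right([m.start for m in mappings], address) - 1
    if i >= 0 and address < mappings[i].end:
        m = mappings[i]
        return m.path, address - m.start + m.file_offset
    return None

def sync(cpu_buffers, mappings_by_pid):
    """Collect the per-CPU samples and convert addresses at this point,
    outside the sampling path."""
    out = []
    for buf in cpu_buffers:
        for s in buf.drain():
            loc = to_file_offset(mappings_by_pid.get(s.pid, []), s.address)
            out.append((loc, s.counter))
    return out

# Two processes map the same library at different linear addresses.
maps = {
    1: [Mapping(0x7f0000000000, 0x7f0000100000, 0, "/lib/libc.so.6")],
    2: [Mapping(0x7f5000000000, 0x7f5000100000, 0, "/lib/libc.so.6")],
}
cpus = [CpuBuffer(8), CpuBuffer(8)]
cpus[0].record(Sample(1, 0x7f0000001234, 0))
cpus[1].record(Sample(2, 0x7f5000001234, 0))
print(sync(cpus, maps))
```

The two samples have different linear addresses, but after the sync step both resolve to the same file and the same offset (0x1234) within it.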
Having arbitrary length strings in the buffer sent into user space is
awkward. OProfile uses the dcookie mechanism to use fixed size integer
numbers for the file path. The daemon in user space can make a system
call to convert the number back into a string. This makes the data
format much more compact. Large strings don't need to be passed around; the
user-space daemon only needs to do the dcookie lookup if it hasn't seen
the dcookie value before. The user-space code in
oprofile/daemon/opd_cookie.c does this lookup. The kernel side of the
code is in sys_lookup_dcookie, in linux/fs/dcookies.c. There is some
code in linux/drivers/oprofile/buffer_sync.c that converts the
linear address into a filename and offset.
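The look-up-once behaviour of the daemon can be sketched as follows. This is a Python illustration of the idea, not the code in opd_cookie.c: the CookieCache class, the kernel_table dictionary standing in for the lookup_dcookie() system call, and the cookie values are all made up.

```python
class CookieCache:
    """Maps fixed-size cookies to path strings, resolving each cookie once."""

    def __init__(self, lookup):
        self.lookup = lookup    # stand-in for the lookup_dcookie() system call
        self.paths = {}
        self.lookups = 0        # number of times the slow lookup was needed

    def path_for(self, cookie):
        if cookie not in self.paths:
            self.lookups += 1
            self.paths[cookie] = self.lookup(cookie)
        return self.paths[cookie]

# Hypothetical kernel-side table; real cookies are opaque integers.
kernel_table = {0xC0FFEE01: "/lib/libc.so.6", 0xC0FFEE02: "/usr/bin/make"}
cache = CookieCache(kernel_table.__getitem__)

# Samples in the buffer carry only (cookie, offset), never a string.
samples = [(0xC0FFEE01, 0x1234), (0xC0FFEE02, 0x40), (0xC0FFEE01, 0x5678)]
resolved = [(cache.path_for(cookie), offset) for cookie, offset in samples]
print(resolved)
print("lookups:", cache.lookups)
```

Three samples are resolved with only two lookups, because the first cookie is seen twice. The saving grows with the number of samples per file.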
There is a dcookie_mutex for the dcookie stuff, so there is still some
locking. However, for OProfile this locking happens when the buffers
are being read out (less time critical) or when user space is trying
to get a string name rather than when the sample is actually being
collected.
Maybe some of the approach used in dcookies would be useful for VFS path
names.
-Will