This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
Linux AMI 2011.09.1.x86_64-ebs issue: kernel aki-825ea7eb build-id munging
- From: "Frank Ch. Eigler" <fche at redhat dot com>
- To: gafton at amazon dot com
- Cc: systemtap at sourceware dot org
- Date: Thu, 27 Oct 2011 16:55:57 -0400
Hi again, Christian -
I'm back for some more EC2 systemtap testing. You kindly fixed one packaging
problem with kernel-debuginfo back in August, if you recall. It turns out we
have a new problem, this one related to build-ids. systemtap uses ELF
build-id notes to verify that the running kernel matches the one whose
ELF/DWARF files it is reading. On the current default AMI kernel
(2.6.35.14-95.38.amzn1.x86_64), the two build-ids do not match.
One can see this by hex-dumping /sys/kernel/notes on a running instance,
and contrasting it with
% readelf -x .notes /usr/lib/debug/lib/modules/`uname -r`/vmlinux
from the corresponding debuginfo. The last bunch of bytes, which hold the
build-id, are supposed to be identical in both dumps.
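To make the comparison above less eyeball-driven, here is a minimal Python sketch that pulls the GNU build-id out of a raw notes blob. It assumes the standard ELF note layout (three 32-bit words for name size, descriptor size, and type, followed by the name and descriptor, each padded to 4 bytes) and little-endian byte order, as on x86_64. The function name `find_build_id` is made up for illustration; it is not part of systemtap.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for GNU build-id notes


def _align4(n):
    """Round n up to the next multiple of 4 (note fields are 4-byte padded)."""
    return (n + 3) & ~3


def find_build_id(notes, byteorder="<"):
    """Return the build-id bytes from a raw ELF notes blob, or None.

    `notes` is the raw content of a notes area, e.g. the bytes read from
    /sys/kernel/notes. Little-endian is assumed by default.
    """
    offset = 0
    while offset + 12 <= len(notes):
        namesz, descsz, ntype = struct.unpack_from(byteorder + "III", notes, offset)
        offset += 12
        name = notes[offset:offset + namesz].rstrip(b"\0")
        offset += _align4(namesz)
        desc = notes[offset:offset + descsz]
        offset += _align4(descsz)
        if name == b"GNU" and ntype == NT_GNU_BUILD_ID:
            return desc
    return None
```

On a running instance one could call `find_build_id(open("/sys/kernel/notes", "rb").read())` and compare the result with the same function applied to the raw bytes of the `.notes` section of the debuginfo vmlinux, however those are extracted.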
The build-id is getting corrupted at some point during the packaging process.
This precludes systemtap operation:
% sudo stap -e 'probe kernel.function("sys_open"){}' -tv
Pass 1: parsed user script and 76 library script(s) using 96240virt/21920res/2788shr kb, in 130usr/20sys/151real ms.
Pass 2: analyzed script: 1 probe(s), 0 function(s), 0 embed(s), 0 global(s) using 196460virt/86980res/51868shr kb, in 270usr/140sys/414real ms.
Pass 3: translated to C into "/tmp/stapVRU8p1/stap_5789e459df56e64ee93d1b2d5fe74936_758.c" using 196460virt/87868res/52756shr kb, in 280usr/10sys/294real ms.
Pass 4: compiled C into "stap_5789e459df56e64ee93d1b2d5fe74936_758.ko" in 4380usr/1590sys/6490real ms.
Pass 5: starting run.
ERROR: Build-id mismatch: "kernel" vs. "vmlinux" byte 0 (0x7c vs 0x01) address 0xffffffff813218f4 rc 0
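The error above reports the first differing byte of the two build-ids (byte 0, 0x7c vs 0x01). As a rough illustration of that kind of check, and not systemtap's actual implementation, a byte-by-byte comparison might look like the following; `first_mismatch` is a hypothetical helper name.

```python
def first_mismatch(a, b):
    """Return (index, byte_a, byte_b) for the first differing byte, or None.

    If one input is a strict prefix of the other, the index of the first
    missing byte is returned with None for both byte values.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i, x, y
    if len(a) != len(b):
        return min(len(a), len(b)), None, None
    return None
```

A result with index 0, as in the log above, means the two ids already disagree at their very first byte.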
I seem to recall a kernel makefile (or perhaps elfutils) bug that
produced this kind of mismatch before. IIRC, it occurred during the
vmlinux debuginfo stripping stage. Unfortunately, I can't find a link
to the fix for that bug.
cc:'ing our team to see if someone's memories can be jogged.
- FChE