9937 – verify user-space build-ids

Bug 9937 - verify user-space build-ids

Summary: verify user-space build-ids

Status:	RESOLVED FIXED

Alias:	None

Product:	systemtap
Classification:	Unclassified
Component:	runtime (show other bugs)
Version:	unspecified

Importance:	P2 normal
Target Milestone:	---
Assignee:	Stan Cox

URL:
Keywords:

Depends on:
Blocks:	11179
	Show dependency tree / graph

Reported:	2009-03-10 12:23 UTC by Frank Ch. Eigler
Modified:	2010-11-08 15:10 UTC (History)
CC List:	2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:

Attachments
Draft code for checking build-id (1.49 KB, patch) 2009-03-16 06:32 UTC, Wenji Huang	Details \| Diff
Updated patch (1.42 KB, patch) 2009-03-27 03:33 UTC, Wenji Huang	Details \| Diff
Updated patch -v3 (1.39 KB, patch) 2009-04-13 07:52 UTC, Wenji Huang	Details \| Diff
9937 patch (2.87 KB, patch) 2010-10-20 18:13 UTC, Stan Cox	Details \| Diff
user mode build-id patch (3.33 KB, patch) 2010-10-27 16:10 UTC, Stan Cox	Details \| Diff
Show Obsolete (2) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Frank Ch. Eigler 2009-03-10 12:23:30 UTC

Upon the appropriate callbacks from task_finder, we should verify that
user-space _stp_modules have a matching build-id.  At this point, we have the
info available but only verify it for kernel/modules.

Comment 1 Wenji Huang 2009-03-13 02:28:31 UTC

One problem is to find the matching "module" in _stp_modules array, available
info is name, build-ids, etc. But what we can have in callback from task_finder
is task_struct, stap_task_finder_target. One possible option is to compare
tsk->comm and module->name. But seems not exactly correct.

Comment 2 Mark Wielaard 2009-03-13 12:13:43 UTC

On the pr6866 branch I have some changes to task_finder.c that matches and
attaches a module to a vma entry. That is probably the place to hook in the
build-id check. I'll port it to the trunk master branch asap.

Comment 3 Mark Wielaard 2009-03-15 19:03:23 UTC

(In reply to comment #2)
> On the pr6866 branch I have some changes to task_finder.c that matches and
> attaches a module to a vma entry. That is probably the place to hook in the
> build-id check. I'll port it to the trunk master branch asap.

I did this:

commit bb64f40b58a64a9ae065dba5058463ac604c3896
Author: Mark Wielaard <mjw@redhat.com>
Date:   Sun Mar 15 15:29:01 2009 +0100

    Move vma module tracking from pr6866 branch to master.
    
    * tapsets.cxx (utrace_derived_probe_group::emit_module_decls):
      Always emit vm callback probe for __stp_tf_vm_cb.
    * runtime/task_finder.c (__stp_tf_vm_cb): Always expose, move _stp_dbug
      statements under ifdef DEBUG_TASK_FINDER_VMA. Find  and record
      corresponding module when vm_path not NULL.
    * runtime/task_finder_vma.c (struct __stp_tf_vma_entry): Add _stp_module.
      (stap_add_vma_map_info): Add _stp_module argument and assign.
      (__stp_tf_get_vma_entry_addr): New static function to get
      the __stp_tf_vma_entry given an address.

This only hooks for utrace, need to merge this method with the uprobes based
task_finder callbacks. Plus we need to make sure to always canonicalize the
module paths.

I haven't actually looked yet into how to fetch the build-id itself from the vma
and match it to what we stored in the module.

Comment 4 Wenji Huang 2009-03-16 06:32:13 UTC

Created attachment 3826 [details]
Draft code for checking build-id

This is the draft code for validating build-id. There are several iterms needed
to 
be confirmed.
1. Where the checking should happen,  especially after vma module tracking code
is visible by Mark.  Currently,  I put the check in stap_uprobe_process_found
or stap_uprobe_vmchange_found
2. Searching the found task in _stp_modules through tgt->pathname/vm_path.  Is
it enough?
3. Once build-id validation failed,  just return error? Any more cleanup need
to be done?
4. More consideration?

Comment 5 Mark Wielaard 2009-03-16 21:45:56 UTC

(In reply to comment #4)
> 1. Where the checking should happen,  especially after vma module tracking code
> is visible by Mark.  Currently,  I put the check in stap_uprobe_process_found
> or stap_uprobe_vmchange_found

Yes, this is where I would also hook in for tracking the vma->module map. I
don't think you need to hook stap_uprobe_process_found explicitly for an EXEC
finder.callback since the finder.vm_callback will give you also matches for the
EXEC itself. But maybe I am missing a timing issue.

> 2. Searching the found task in _stp_modules through tgt->pathname/vm_path.  Is
> it enough?

I believe this is enough, the vm_path is the complete canonical path. We do need
to make sure to register the module names also with their full canonical names.

> 3. Once build-id validation failed,  just return error? Any more cleanup need
> to be done?

I would only produce a warning, we are also just continuing when we cannot
verify the build-id. Or, if we want to make that also an error then we probably
have to do what stap_uprobe_change() at the end when things fail, set the state
to error and _stp_exit().

> 4. More consideration?

I am not sure about limiting the build-id comparison to MAXSTRINGLEN. If we are
afraid of people creating scripts against modules with absurdly big build-ids
then we should block those in the translator imho.

Comment 6 Mark Wielaard 2009-03-17 13:22:30 UTC

(In reply to comment #5)
> > 2. Searching the found task in _stp_modules through tgt->pathname/vm_path.  Is
> > it enough?
> 
> I believe this is enough, the vm_path is the complete canonical path. We do need
> to make sure to register the module names also with their full canonical names.

I made sure they are through the following commit. The path is now used in the
taskfinder vma tracker to compare against the vm_path. Maybe name could be
completely replaced by path, but for the kernel and modules name is kind of
special (we do check for the "kernel" name in a couple of places). So for now it
is just an extra field for use at runtime.

commit 30cb532a560ed152b86506b80490e99195970271
Author: Mark Wielaard <mjw@redhat.com>
Date:   Tue Mar 17 13:50:33 2009 +0100

    Get the canonical path of the main file for comparison at runtime.
    
    When given directly by the user through -d or in case of the kernel
    name and path might differ. path should be used for matching.
    
    * runtime/sym.h (_stp_module): Add path field.
    * runtime/task_finder.c (__stp_tf_vm_cb): Use module path to compare vm_path
    * translate.cxx (dump_unwindsyms): Output canonical path.

Comment 7 Wenji Huang 2009-03-27 03:33:53 UTC

Created attachment 3850 [details]
Updated patch

Changed to the old one.
1. Once build-id check finished, early return
2. Trigger warning if build-id doesn't match and make finder.callback return.
3. build-id checking limited to map_p in vma.callback
4. Use module path to compare vm_path. (thanks to Mark Wielaard)

Did some tests on FC10. Looks fine.

Need to be improve:
1. Still need to hook stap_uprobe_process_found explicitly for an EXEC
finder.callback since vma.callback == NULL.
2. The annoying problem is that the validation procedure will be repeated 
many times. That depends on callbacks. 

Any idea?

Comment 8 Frank Ch. Eigler 2009-03-27 11:03:32 UTC

(In reply to comment #7)
> 1. Still need to hook stap_uprobe_process_found explicitly for an EXEC
> finder.callback since vma.callback == NULL.

Probably a task-finder change is needed.

> 2. The annoying problem is that the validation procedure will be repeated 
> many times. That depends on callbacks. 

Can you explain the nature of the repetition?

Comment 9 Wenji Huang 2009-03-27 13:01:42 UTC

(In reply to comment #8)
> (In reply to comment #7)
> > 1. Still need to hook stap_uprobe_process_found explicitly for an EXEC
> > finder.callback since vma.callback == NULL.
> 
> Probably a task-finder change is needed.
> 
> > 2. The annoying problem is that the validation procedure will be repeated 
> > many times. That depends on callbacks. 
> 
> Can you explain the nature of the repetition?
> 
Sorry, maybe I havn't expressed it clearly. I mean that current code can't 
ensure checking build-id only once. Once the callback is invoked, checking
procedure may be called too, especially process("foo").function("*") or vma
splitted.

Comment 10 David Smith 2009-03-27 14:07:05 UTC

(In reply to comment #7)
> Need to be improve:
> 1. Still need to hook stap_uprobe_process_found explicitly for an EXEC
> finder.callback since vma.callback == NULL.

Can you explain this one a bit further?  Isn't stap_uprobe_process_found()
already getting called on exec events?

Comment 11 Wenji Huang 2009-03-28 01:50:53 UTC

(In reply to comment #10)
> (In reply to comment #7)
> > Need to be improve:
> > 1. Still need to hook stap_uprobe_process_found explicitly for an EXEC
> > finder.callback since vma.callback == NULL.
> 
> Can you explain this one a bit further?  Isn't stap_uprobe_process_found()
> already getting called on exec events?
stap_uprobe_process_found is OK. Please see comment #5.

Comment 12 Wenji Huang 2009-04-13 07:52:32 UTC

Created attachment 3882 [details]
Updated patch -v3

 Guess the code now can be matured since related part PR9940 is resolved. 
Also, the flag
note_sect in stp_modules will ensure checking build-id only once.

Comment 13 Mark Wielaard 2010-10-05 17:45:22 UTC

The code has been slightly reworked. The code shouldn't have to add its own hooks, but can be called from _stp_vma_mmap_cb (runtime/vma.c).

Currently this code only checks the file name paths match:

    if (strcmp(path, _stp_modules[i]->path) == 0)

That check can be augmented, or replaced by the build-id check.

Comment 14 Mark Wielaard 2010-10-05 17:55:00 UTC

Also note that there is a simple build-id check just for the vdso hooked from _stp_vma_exec_cb in _stp_vma_match_vdso (runtime/vma.c). You might be able to reuse some of this with the vdso_addr replaced with the addr given in the _stp_vma_mmap_cb hook.

Comment 15 Frank Ch. Eigler 2010-10-05 17:59:02 UTC

Note that the translator-side build-id address calculation logic
may need to switch to emitting proper relocation bases (a la
the _stp_relocate_address section-name argument) for the
ET_DYN / ET_EXEC user-space cases, instead of storing a
single little offset.

Comment 16 Stan Cox 2010-10-20 18:13:59 UTC

Created attachment 5074 [details]
9937 patch

Comment 17 Stan Cox 2010-10-20 18:15:27 UTC

(In reply to comment #16)

Add build id support for user modules.  The build id is an offset which is
adjusted relative to dwarf_module_base for ET_EXEC or the load address from the
task manager for ET_DYN.

	* sym.c (_stp_build_id_check): New.  Moved from... (plus addition
	of user_module parameter)
	(_stp_build_id_check): ...here.
	(_stp_userprog_check): New.  Build id check for user modules.
	* sym.h (dwarf_module_base): New.
	* uprobes-common.c (stap_uprobe_change_plus): Call _stp_userprog_check.
	* translate.cxx (dump_unwindsyms): Don't stop if it isn't a kernel
	module.
	Initialize module->dwarf_module_base.  Output module->build_id_offset
	for .dynamic case.

Comment 18 Stan Cox 2010-10-21 20:16:57 UTC

(uprobes*prelink is the only regression with the above patch)

Comment 19 Frank Ch. Eigler 2010-10-23 15:29:39 UTC

Looks like a good start.
A few problems:

The probe_kernel_* alternative is based on an autoconf macro
to determine the presence of the probe_kernel_* functions,
*not* whether we're trying to check a kernel- or user-space
build-id string.

I don't understand why the dwfl_module_getdwarf / dwarf_module_base
stuff is at all relevant here.  The build-id is not debuginfo
based.  It's as if you're adding then subtracting the same dwbias
value, to no effect at all.

Instead, the code should reuse/extend the explicit
dwfl_module_relocation_info* calculations as used for
kernel modules up near line 4895, which I see it's trying
to do (though the comments need to be expanded to explain).
It needs to respect/store the secname instead of just bypassing
the .note.gnu.build-id assertion (which is correct only for
kernel modules).

Comment 20 Stan Cox 2010-10-27 16:07:52 UTC

> I don't understand why the dwfl_module_getdwarf / dwarf_module_base
> stuff is at all relevant here. 

Yea the dwfl_module_getdwarf bit was unnecessary.  For ET_DYN the load address from task manager is used.  An equivalent is needed for ET_EXEC; "base" will do that so it is now saved.

Comment 21 Stan Cox 2010-10-27 16:10:28 UTC

Created attachment 5091 [details]
user mode build-id patch

Comment 22 Stan Cox 2010-11-08 15:10:24 UTC

commit: 155bb07a5bfdd

Add build id support for user modules.