This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: revamp sdt.h


Thanks a lot Roland,

I have modified your sdt.h slightly and used it for the pthread
probes. Is there a special branch of systemtap or utrace that I need
to use in order to test/benchmark the overhead of having the
new sdt probes in libpthread?

Rayson



On Mon, Aug 2, 2010 at 11:06 PM, Roland McGrath <roland@redhat.com> wrote:
> I have spoken before about some of the shortcomings of the .probes
> section format the sdt.h macros generate.  (I'm not really sure how
> much I've written that in any postings here and how much it may have
> been only in verbal grumblings in some unrecorded voice meetings.)
> With Rayson's recent work, we've also noted the need to have sdt.h
> macros that can work with hand-written assembly code.
>
> So here is a first discussion draft of an entirely revamped set of
> sdt.h macros and binary format they generate.  There is no conceptual
> change here at all, it is just a new encoding of exactly the same
> information as today's v2 sdt.h probes.  The only change to the
> translator is the new binary format decoder.  Actually, that's not
> true, but the changes are small and I'll explain them all in a moment.
>
> Two files are attached at the end, which you should read or skim along
> with my explanation here.  First is the core macro nest of the new
> sdt.h, with some example macro uses.  The other file is a small
> standalone C program based on libelf (elfutils >= 0.130) that decodes
> the new binary format and prints out the probes.  That can serve as
> the model for the new translator code.
>
> The essential macros in this first draft are actually pretty complete
> and usable (I didn't include all the sugar, just enough for examples).
> The one thing they are not is friendly to -pedantic (unless used with
> -std=c99).  They use variadic macros heavily and I'm not sure I could
> have hashed this out without them and not become homicidal.  But
> chances are we can rejigger the macro nest without them later if we
> have to, with only slight dangers to life and limb of bystanders.
> Anyway, not an issue for a discussion draft.
>
> This version addresses the following issues, which the existing macros
> and format handle poorly.  Some are purely about the macros and some
> are about the format itself.
>
> * can be used in assembly (.S) source files
> * can be used inside inline asm statements in C source
>
> Both of these matter for the places probes should go in libpthread functions.
>
> * no data relocs
>
> The old formats are non-starters for libc/libpthread, where the number
> of dynamic relocs of any kind is very carefully tuned to keep the
> startup cost on every program in the system as small as possible.
>
> * minimal memory footprint of any kind
>
> The cost is exactly one byte of rodata (rounded to alignment, so one
> word at least) total in the final file, plus just the size of the nop
> instruction itself times the number of probes.
>
> These last two are achieved by putting the data into a non-allocated
> ELF note.  This has some nice properties we get for free:
> * no runtime cost, it's all fixed at link time and never in memory
> * preserved in both stripped files and .debug files
> It also has one new wrinkle we didn't have before (which is the flip
> side of not having any dynamic relocs), which is that prelink won't
> adjust its contents for address offsets.
>
> One drawback of using a naked ELF note for every probe is that there
> is a proportionally large per-probe overhead for the note headers.
> But that just means something like another 20 bytes on top of the 16
> or 24 you might have had for each probe, before counting the name
> strings.  We could make a much more compact note format if we wanted
> to rely on a link-time step.  But the absolute numbers involved in the
> size of the notes are still pretty small (I think it's smaller per
> probe than v2 .probes is, and it's just ELF file size instead of being
> runtime memory footprint).  IMHO there is quite a lot to be said for
> '#include <sys/sdt.h>' (and maybe later -lsdt, but now not even that)
> being the sum total of extra fiddling to an existing build setup
> needed to add static probes.
>
> Ok, now it's time to look at new-sdt.h, attached below.  You can just
> look at /* Example uses */ and below for the moment.  That file can be
> compiled either as C or as assembly to show those examples in a binary.
> (The assembly is for x86-64, though you can trivially change the operand
> expressions to something that will work on another machine if you want
> to see an example there.)
>
> The scenario below is for building a DSO.  You could just as well drop
> the -fPIC and -shared flags and create an executable instead.  I'm only
> showing the one example because both are really just the same, and the
> DSO case lets me illustrate the prelink issue.
>
>         $ gcc -c -o s.o -xc new-sdt.h -O2 -fPIC
>         $ gcc -c -o s2.o -Dfrob=diddle -Dmain=dummy -xc new-sdt.h -O2 -fPIC
>         $ gcc -c -o s3.o -x assembler-with-cpp new-sdt.h -O2 -fPIC
>         $ gcc -shared -o s.so s.o s2.o s3.o
>
> Ok.  So now we compiled two objects from C sources and one from assembly
> sources, and linked those together into a DSO (or executable).  The
> different objects use overlapping sets of provider and probe names,
> i.e. some probes have instances in two of the objects.
>
> Now let's build the little decoder program:
>
>         $ gcc -std=gnu99 -g sdt-extractor.c -o sdt-extractor -lelf
>
> And now we can run it:
>
>         $ ./sdt-extractor s.so
>         0x5a0   libfoo.noargs              :
>         0x5a1   libfoo.frob                -4@%edi 4@(%rsi)
>         0x5a2   libfoo.diddle              8@%rsi -4@%edi
>         0x5a3   libfoo.asm_noargs
>         0x5a4   libfoo.asmfrob             %edi %rax (%rsi)
>         0x5af   libfoo.asmfrobarg          4@(%rsi,%rdi,4) 8@%rax 4@$2
>         0x5d0   libfoo.noargs              :
>         0x5d1   libfoo.diddle              -4@%edi 4@(%rsi)
>         0x5d2   libfoo.diddle              8@%rsi -4@%edi
>         0x5d3   libfoo.asm_noargs
>         0x5d4   libfoo.asmfrob             %edi %rax (%rsi)
>         0x5df   libfoo.asmfrobarg          4@(%rsi,%rdi,4) 8@%rax 4@$2
>         0x5e4   libfoo.noargs              :
>         0x5e5   libfoo.frob                %rax, -20(%rbp)
>         0x5e6   libfoo.diddle              (%rdi), %rax
>
> As you can see, we have a probe address, a provider name, a probe name,
> and an argument format string.  Don't worry yet about the argument
> details.  I'll get to that after covering the prelink issue.
>
> So, all this probe information is stored in the .note.stapsdt section,
> which is not allocated data, has no relocs, and does not get touched by
> prelink.  So the probe addresses stored in there at link time stay as
> they started.  But, prelink might adjust the actual text addresses:
>
>         $ prelink -r 0x1000000 s.so
>         $ ./sdt-extractor s.so
>         0x10005a0       libfoo.noargs              :
>         0x10005a1       libfoo.frob                -4@%edi 4@(%rsi)
>         0x10005a2       libfoo.diddle              8@%rsi -4@%edi
>         0x10005a3       libfoo.asm_noargs
>         0x10005a4       libfoo.asmfrob             %edi %rax (%rsi)
>         0x10005af       libfoo.asmfrobarg          4@(%rsi,%rdi,4) 8@%rax 4@$2
>         0x10005d0       libfoo.noargs              :
>         0x10005d1       libfoo.diddle              -4@%edi 4@(%rsi)
>         0x10005d2       libfoo.diddle              8@%rsi -4@%edi
>         0x10005d3       libfoo.asm_noargs
>         0x10005d4       libfoo.asmfrob             %edi %rax (%rsi)
>         0x10005df       libfoo.asmfrobarg          4@(%rsi,%rdi,4) 8@%rax 4@$2
>         0x10005e4       libfoo.noargs              :
>         0x10005e5       libfoo.frob                %rax, -20(%rbp)
>         0x10005e6       libfoo.diddle              (%rdi), %rax
>         $
>
> As you can see, everything is still correct: the probe addresses got the
> prelink offset applied, and nothing else changed.  So how does this work?
>
> It uses the .stapsdt.base section.  This is a special section we add to
> the text.  All the .ifndef and comdat magic in the macro for this is
> just there so that we only ever have one of these sections in a final
> link and it's only ever one byte long.  Really it could be 0 bytes long,
> but the linker swallows the section if we make it empty, so we pad it
> with a byte (and alignment padding will usually mean that it consumes at
> least one word in the binary's text segment).  Nothing about this
> section itself matters, we just use it as a marker to detect prelink
> address adjustments.
>
> Each probe note records the link-time address of the .stapsdt.base
> section alongside the probe PC address.  The decoder compares the base
> address stored in the note with the .stapsdt.base section's sh_addr.
> Initially these are the same, but the section header will be adjusted by
> prelink. ?So the decoder applies the difference to the probe PC address
> to get the correct prelinked PC address.
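
The adjustment described above comes down to one line of arithmetic. The following is only a minimal sketch of it: the function name adjust_probe_pc is made up for illustration, and the addresses are assumed to be already decoded into 64-bit host integers.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the attached decoder.
   note_pc:   probe PC address stored in the note at link time
   note_base: link-time address of .stapsdt.base stored in the note
   shdr_base: current sh_addr of the .stapsdt.base section header  */
uint64_t
adjust_probe_pc (uint64_t note_pc, uint64_t note_base, uint64_t shdr_base)
{
  /* Unsigned arithmetic wraps, so a downward shift works as well.  */
  return note_pc + (shdr_base - note_base);
}
```

With a hypothetical link-time base of 0x5e8 that prelink moved to 0x10005e8, adjust_probe_pc (0x5a0, 0x5e8, 0x10005e8) gives 0x10005a0, the same shift seen in the two listings. When the file has not been prelinked the two bases are equal and the PC is returned unchanged.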
>
> I've put this magic into the macro and note format unconditionally, but
> none of that is necessary for executables.  We could make it conditional
> on #ifdef __PIC__.  But the cost (a word per note, plus the 1-byte
> section of runtime rodata) seems small enough that it's nicer not to
> bother with two variants of the format.
>
> A library or application built using a custom linker script could
> possibly remove, rename, or hide the .stapsdt.base section.  But that is
> a rare thing to do (and even with some custom linker scripts, it may
> well come through fine).  We do rely on the decoder in the translator
> being able to find that section by name, but that is certainly no more
> than the old .probes schemes relied on.
>
>
> Now, some notes about the note format.
>
> Note that the name of the notes section is not normative, and in a final
> executable/DSO you might actually be looking at intermixed notes of
> other kinds (follow the sdt-extractor.c example to consider all
> appropriate sections and check all notes in them via gelf_getnote).
>
> The ELF note format is variable-sized and includes a "vendor string" and
> a type code.  Both the header and the "payload" after that are aligned
> to 4 bytes within the section.
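
To make the per-note overhead concrete, here is a rough sketch of how many bytes one note record occupies. It assumes the standard ELF note layout of three 32-bit header words (n_namesz, n_descsz, n_type) followed by the name and the descriptor, each padded to 4 bytes. The function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Round N up to the next multiple of 4.  */
size_t
note_align4 (size_t n)
{
  return (n + 3) & ~(size_t) 3;
}

/* Bytes occupied by one note: the three-word header, then the padded
   vendor name, then the padded descriptor.  NAMESZ counts the name's
   terminating NUL, as n_namesz does.  */
size_t
note_record_size (size_t namesz, size_t descsz)
{
  return 3 * sizeof (uint32_t) + note_align4 (namesz) + note_align4 (descsz);
}
```

For the name "stapsdt" (8 bytes with its NUL) the fixed part is note_record_size (8, 0) = 20 bytes, which is the "another 20 bytes" per probe mentioned earlier.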
>
> We're using the string "stapsdt" and that gives us complete control of
> the meaning of that (32-bit) type code (GElf_Nhdr.n_type).  So if we
> want to have different flavors of probes, or different encoding formats,
> now or in the future, we can encode all such selections in that type
> code.  For this discussion draft, I'm using just one flavor (intended
> for uprobes probes, i.e. a nop) and n_type=3 (for "sdt v3").
>
> After the note header, the n_descsz bytes are:
>
>         probe PC address (4 or 8 bytes)
>         link-time sh_addr of .stapsdt.base section (4 or 8 bytes)
>         provider name (null-terminated string)
>         probe name (null-terminated string)
>         argument format (null-terminated string)
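
The layout above can be walked with plain pointer arithmetic. What follows is a simplified sketch, not the attached decoder: it assumes a 64-bit ELF file whose byte order matches the host (sdt-extractor.c uses gelf_xlatetom to handle the general case), and the struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of one probe note descriptor.  */
struct sdt_probe
{
  uint64_t pc;          /* probe PC address */
  uint64_t base;        /* link-time address of .stapsdt.base */
  const char *provider; /* the strings point into the descriptor */
  const char *name;
  const char *args;
};

/* Return a pointer just past the NUL-terminated string at S, or NULL
   if there is no NUL before END.  */
static const char *
skip_string (const char *s, const char *end)
{
  const char *nul = memchr (s, '\0', (size_t) (end - s));
  return nul == NULL ? NULL : nul + 1;
}

/* Decode the LEN-byte descriptor at DESC into *OUT.
   Return 0 on success, -1 if the descriptor is too short or a
   string is not terminated.  */
int
parse_sdt_desc (const char *desc, size_t len, struct sdt_probe *out)
{
  assert (desc != NULL && out != NULL);
  const char *end = desc + len;

  /* Two addresses plus at least three empty strings.  */
  if (len < 2 * sizeof (uint64_t) + 3)
    return -1;

  memcpy (&out->pc, desc, sizeof out->pc);
  memcpy (&out->base, desc + sizeof out->pc, sizeof out->base);

  out->provider = desc + 2 * sizeof (uint64_t);
  out->name = skip_string (out->provider, end);
  if (out->name == NULL)
    return -1;
  out->args = skip_string (out->name, end);
  if (out->args == NULL)
    return -1;
  /* The argument string must be terminated too.  */
  return skip_string (out->args, end) == NULL ? -1 : 0;
}
```

The strings are not copied, so the descriptor buffer has to outlive the decoded struct.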
>
> Finally, I've made some changes to the v2 argument format string,
> some trivial and one substantive.
>
> * For no arguments, the string can be either "" or ":".
> * Arguments can be separated by commas, whitespace, or both.
>
> These differences are just for the convenience of writing the macro nest.
>
> * Sized arguments.
>
> In looking at the proposed libpthread probes using v2 sdt.h, I noticed
> that adding the probes introduced not only the nop instructions
> themselves, but some extra code before them to sign-extend or
> zero-extend int arguments (on the hot path, even adding register
> pressure!).  We really don't want that perturbation of the code
> generation just for the common situation of having int-typed probe
> parameters on a 64-bit machine.
>
> So, these macros do not cast a probe argument to size_t as the existing
> macros do. ?Instead, they just make it an rvalue of int or wider (by
> doing a plain + 0).  That coerces short (and bitfields and whatever) to
> int, and coerces array references to pointers.  So arguments will still
> wind up integers, and be either 32 or 64 bits (on a 32-bit machine,
> there could be a 64-bit probe argument, which would be forced into a
> memory operand).
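
The effect of the "+ 0" can be checked with sizeof alone. This is a small sketch rather than the macro itself: SDT_ARG_SIZE and the two functions are names made up here, and struct bar is borrowed from the example section of new-sdt.h.

```c
#include <stddef.h>

/* Hypothetical helper: the size recorded for an argument after the
   "+ 0" coercion.  sizeof does not evaluate its operand.  */
#define SDT_ARG_SIZE(x) (sizeof ((x) + 0))

struct bar { unsigned int baz; short int spaz; };

/* A short member is promoted to int by the addition.  */
size_t
short_arg_size (const struct bar *b)
{
  return SDT_ARG_SIZE (b->spaz);
}

/* An array decays to a pointer to its first element, so the result
   is the size of a pointer, not of the whole array.  */
size_t
array_arg_size (void)
{
  static int ugh[3] = { 1, 2, 3 };
  return SDT_ARG_SIZE (ugh);
}
```

On x86-64 the first is 4 and the second is 8, consistent with the 8@%rax operand shown for the ugh array in the asmfrobarg line of the listing.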
>
> This is encoded in the argument format string.  Each argument might
> still be a plain assembly operand (from hand-written assembly), in which
> case you should assume it's meant to be natural word size, or perhaps
> the word size indicated by the register syntax (e.g. %eax or %r11d on
> x86-64 mean the low 32 bits only).  But normally each argument will look
> like "N@OP" where OP is the actual assembly operand, and N is one of:
>
>         4       32 bits unsigned
>         -4      32 bits signed
>         8       64 bits unsigned
>         -8      64 bits signed
>
> The signedness doesn't really matter for 64 bits, though you could
> potentially still use it to choose %d vs %u formatting for $parms$
> and that sort of thing.  The -4@ notation tells you that you need to
> extract it as 32 bits (low 32 of a register, or only address 4 bytes
> if a memory access) and sign-extend it to 64 bits for a stap long.
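
A consumer of the format string could handle one argument token roughly as follows. This is a sketch under assumptions, not translator code: it takes an already-split token such as "-4@%edi", treats anything without a leading "N@" as a plain operand of unspecified size, and relies on GCC's wrapping behaviour for the narrowing casts. The names sdt_arg, parse_sdt_arg and widen_sdt_value are invented here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical decoded form of one argument token.  */
struct sdt_arg
{
  int size;            /* 4 or 8 bytes; 0 if the token had no N@ prefix */
  int is_signed;       /* nonzero if N was negative */
  const char *operand; /* assembly operand text, points into the token */
};

/* Parse one token of the form "N@OP" or plain "OP".  */
void
parse_sdt_arg (const char *token, struct sdt_arg *out)
{
  assert (token != NULL && out != NULL);
  char *end;
  long n = strtol (token, &end, 10);

  if (end == token || *end != '@' || n == 0)
    {
      /* No size prefix, e.g. "%rax" or "8(%rbp)".  */
      out->size = 0;
      out->is_signed = 0;
      out->operand = token;
      return;
    }
  out->is_signed = n < 0;
  out->size = (int) (n < 0 ? -n : n);
  out->operand = end + 1;
}

/* Widen the raw contents fetched for ARG to a 64-bit value.  Only the
   low 32 bits are used for 4-byte arguments; signed ones are
   sign-extended, unsigned ones zero-extended.  */
int64_t
widen_sdt_value (uint64_t raw, const struct sdt_arg *arg)
{
  if (arg->size == 4)
    return arg->is_signed ? (int64_t) (int32_t) (uint32_t) raw
                          : (int64_t) (uint32_t) raw;
  return (int64_t) raw;
}
```

For "-4@%edi" a raw register value whose low 32 bits are 0xffffffff widens to -1, while the same bits under "4@(%rsi)" widen to 4294967295. This is the work that the new format moves out of the probed code path and into the probe handler.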
>
> This shifts the work of sign extension (when you want it) to the
> translator/generated probe runtime code, rather than putting it into
> the probed hot path code to be run even when no probes are in use.
> With this, we can choose probe points and arguments carefully for
> libpthread/libc and reasonably expect not to perturb the generated
> code at all beyond the actual nop insertions.
>
>
> I think I've explained everything.
> The discussion draft for sdt.h glosses over some trivial nits,
> but I think I was pretty thorough about all the important nits.
>
> The one thing I didn't mention is the semaphore option.  There isn't
> one.  I can't tell what the story is with the semaphore these days, but
> it looks like we're not really doing that any more.  If we want it in,
> or even optionally in at compile time, then it is easy enough to add it
> to these macros, and use new n_type values to indicate with vs without
> variants of the note format.
>
>
> Thanks,
> Roland
>
>
>
> #ifdef __ASSEMBLER__
> # define _SDT_PROBE(provider, name, arg_format, ...) \
>   _SDT_ASM_BODY(provider, name, arg_format, __VA_ARGS__)
> # define _SDT_ASM_1(...)                __VA_ARGS__;
> # define _SDT_ASM_STRING_1(...)         .asciz #__VA_ARGS__;
> # define _SDT_ASM_ARGS(format, ...)     _SDT_ASM_STRING_1(__VA_ARGS__)
> # define _SDT_ARG(n, x)                 x
> #else
> # define _SDT_PROBE(provider, name, arg_format, ...) __asm__ __volatile__ \
>   (_SDT_ASM_BODY(provider, name, arg_format, :) :: __VA_ARGS__)
> # define _SDT_ASM_1(...)                #__VA_ARGS__ "\n"
> # define _SDT_ASM_STRING_1(...)         _SDT_ASM_1(.asciz #__VA_ARGS__)
> # define _SDT_ASM_ARGS(format, ...)     _SDT_ASM_STRING_1(format)
> # define _SDT_ARGFMT(n)                 %c[_SDT_S##n]@_SDT_ARGTMPL(_SDT_A##n)
> # define _SDT_ARG(n, x)                 \
>   [_SDT_S##n] "n" ((__builtin_constant_p ((x) + 0 < 0) ? 1 : -1) \
>                    * (int) sizeof ((x) + 0)), \
>   [_SDT_A##n] "nor" ((x) + 0)
> #endif
> #define _SDT_ASM(...)                   _SDT_ASM_1(__VA_ARGS__)
> #define _SDT_ASM_STRING(...)            _SDT_ASM_STRING_1(__VA_ARGS__)
>
> #if defined __powerpc__ || defined __powerpc64__
> # define _SDT_ARGTMPL(id)       %I[id]%[id]
> #else
> # define _SDT_ARGTMPL(id)       %[id]
> #endif
>
> #include <bits/wordsize.h>
> #if __WORDSIZE == 64
> # define _SDT_ASM_ADDR  .quad
> #else
> # define _SDT_ASM_ADDR  .long
> #endif
>
> #define _SDT_NOP        nop
>
> #define _SDT_NOTE_NAME  "stapsdt"
> #define _SDT_NOTE_TYPE  3
>
> #define _SDT_ASM_BODY(provider, name, arg_format, ...) \
>   _SDT_ASM(990: _SDT_NOP) \
>   _SDT_ASM(     .section .note.stapsdt,"","note") \
>   _SDT_ASM(     .balign 4) \
>   _SDT_ASM(     .int 992f-991f, 994f-993f, _SDT_NOTE_TYPE) \
>   _SDT_ASM(991: .asciz _SDT_NOTE_NAME) \
>   _SDT_ASM(992: .balign 4) \
>   _SDT_ASM(993: _SDT_ASM_ADDR 990b) \
>   _SDT_ASM(     _SDT_ASM_ADDR _.stapsdt.base) \
>   _SDT_ASM_STRING(provider) \
>   _SDT_ASM_STRING(name) \
>   _SDT_ASM_ARGS(arg_format, __VA_ARGS__) \
>   _SDT_ASM(994: .balign 4) \
>   _SDT_ASM(     .previous) \
>   _SDT_ASM(.ifndef _.stapsdt.base) \
>   _SDT_ASM(     .section .stapsdt.base,"aG","progbits",.stapsdt.base,comdat) \
>   _SDT_ASM(     .weak _.stapsdt.base) \
>   _SDT_ASM(     .hidden _.stapsdt.base) \
>   _SDT_ASM(_.stapsdt.base: .space 1) \
>   _SDT_ASM(     .size _.stapsdt.base, 1) \
>   _SDT_ASM(     .previous) \
>   _SDT_ASM(.endif)
>
> #define PROBE0(provider, name) \
>   _SDT_PROBE(provider, name, :, :)
> #define PROBE1(provider, name, arg1) \
>   _SDT_PROBE(provider, name, _SDT_ARGFMT(1), _SDT_ARG(1, arg1))
> #define PROBE2(provider, name, arg1, arg2) \
>   _SDT_PROBE(provider, name, _SDT_ARGFMT(1) _SDT_ARGFMT(2), \
>              _SDT_ARG(1, arg1), _SDT_ARG(2, arg2))
>
> #define PROBE_ASM(provider, name, ...) \
>   _SDT_ASM_BODY(provider, name, __VA_ARGS__, :)
> #define PROBE_ASM_TEMPLATE(n)           _SDT_ASM_TEMPLATE_##n
> #define PROBE_ASM_OPERANDS(n, ...)      _SDT_ASM_OPERANDS_##n(__VA_ARGS__)
> #define _SDT_ASM_TEMPLATE_0             :
> #define _SDT_ASM_TEMPLATE_1             _SDT_ARGFMT(1)
> #define _SDT_ASM_TEMPLATE_2             _SDT_ASM_TEMPLATE_1 _SDT_ARGFMT(2)
> #define _SDT_ASM_TEMPLATE_3             _SDT_ASM_TEMPLATE_2 _SDT_ARGFMT(3)
> #define _SDT_ASM_OPERANDS_0()           /* no operands */
> #define _SDT_ASM_OPERANDS_1(arg1)       _SDT_ARG(1, arg1)
> #define _SDT_ASM_OPERANDS_2(arg1, arg2) _SDT_ARG(1, arg1), _SDT_ARG(2, arg2)
> #define _SDT_ASM_OPERANDS_3(arg1, arg2, arg3) \
>   _SDT_ARG(1, arg1), _SDT_ARG(2, arg2), _SDT_ARG(3, arg3)
>
>
> /* Example uses */
>
> #define LIB libfoo    /* Probe macros do support indirecting the names.  */
>
> #ifdef __ASSEMBLER__
>
> #define ARG1 %rax
> #define ARG2 -20(%rbp)
>
> /* Here in an assembly source file, probes look just like in C source.
>    The arguments are assembly operands that the sdt decoder can grok;
>    e.g. constants might need to be marked, etc.  */
> PROBE0(LIB, noargs)
> PROBE2(LIB, frob, ARG1, ARG2)
> PROBE2(LIB, diddle, (%rdi), %rax)
>
> #else
>
> struct bar { unsigned int baz; short int spaz; };
>
> void frob (int foo, struct bar *bar)
> {
>   /* Plain C use is as before.  */
>   PROBE0(LIB, noargs);
>   PROBE2(LIB, frob, foo, bar->baz);
>   PROBE2(LIB, diddle, bar, bar->spaz);
>
>   /* Here's a use inside traditional inline asm.
>      Note that GCC does not do %format handling in this case.  */
>   __asm (PROBE_ASM(LIB, asm_noargs)
>          "# standalone asm: %0 et al not translated, no %% needed");
>
>   /* Here's a use inside a fancy GCC asm using operands from C.
>      Here the asm writer is choosing which assembly operands to
>      tell sdt, just like writing a probe in an assembly source file.
>      Note spaces with no commas between the operands.
>      Those might or might not be substituted GCC %format thingies.  */
>   __asm volatile ("# do something with %0\n"
>                   PROBE_ASM(LIB, asmfrob, %0 %%rax %1)
>                   "# do something with %1"
>                   : : "r" (foo), "m" (bar->baz));
>
>   /* Here's an asm use where the probe arguments are specified separately
>      in C, so they behave just like a plain C probe would.  The
>      PROBE_ASM_TEMPLATE(n) macro says we have n arguments from C.
>      Then PROBE_ASM_OPERANDS(n, ...) can appear anywhere in the
>      asm's list of input operands.  */
>   const int fold[3] = { 1, 2, 3 };
>   static int ugh[3] = { 1, 2, 3 }; /* array as arg demonstrates why + 0 */
>   __asm volatile (PROBE_ASM(LIB, asmfrobarg, PROBE_ASM_TEMPLATE(3))
>                   "# magic insn uses no operands"
>                   : : PROBE_ASM_OPERANDS(3, bar[foo].baz, ugh, fold[1]));
> }
>
> int main () {}
>
> #endif
>
> #define _SDT_NOTE_NAME  "stapsdt"
> #define _SDT_NOTE_TYPE  3
>
> #define _GNU_SOURCE
> #include <gelf.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <error.h>
> #include <errno.h>
> #include <string.h>
> #include <inttypes.h>
> #include <assert.h>
> #include <stdio.h>
>
> static void
> handle_probe (Elf *elf, GElf_Addr base, int type, const char *data, size_t len)
> {
>   if (type != _SDT_NOTE_TYPE)
>     {
>       error (0, 0, "unknown %s n_type %u", _SDT_NOTE_NAME, type);
>       return;
>     }
>
>   union
>   {
>     Elf64_Addr a64[2];
>     Elf32_Addr a32[2];
>   } buf;
>   Elf_Data dst =
>     {
>       .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
>       .d_buf = &buf, .d_size = gelf_fsize (elf, ELF_T_ADDR, 2, EV_CURRENT)
>     };
>   assert (dst.d_size <= sizeof buf);
>
>   if (len < dst.d_size + 3)
>     {
>       error (0, 0, "short note");
>       return;
>     }
>
>   Elf_Data src =
>     {
>       .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
>       .d_buf = (void *) data, .d_size = dst.d_size
>     };
>
>   if (gelf_xlatetom (elf, &dst, &src,
>                      elf_getident (elf, NULL)[EI_DATA]) == NULL)
>     {
>       /* Without the translated addresses, buf is not usable.  */
>       error (0, 0, "gelf_xlatetom: %s", elf_errmsg (-1));
>       return;
>     }
>
>   const char *provider = data + dst.d_size;
>   const char *name = memchr (provider, '\0', data + len - provider);
>   if (name == NULL)
>     {
>       error (0, 0, "corrupt probe");
>       return;
>     }
>
>   ++name;
>   const char *args = memchr (name, '\0', data + len - name);
>   /* The argument string's terminator must be the note's last byte.  */
>   if (args++ == NULL
>       || memchr (args, '\0', data + len - args) != data + len - 1)
>     {
>       error (0, 0, "corrupt probe");
>       return;
>     }
>
>   GElf_Addr pc;
>   GElf_Addr base_ref;
>   if (gelf_getclass (elf) == ELFCLASS32)
>     {
>       pc = buf.a32[0];
>       base_ref = buf.a32[1];
>     }
>   else
>     {
>       pc = buf.a64[0];
>       base_ref = buf.a64[1];
>     }
>
>   pc += base - base_ref;
>
>   printf ("%#" PRIx64 "\t%s.%-20s%s\n", pc, provider, name, args);
> }
>
> static void
> handle_notes (Elf *elf, Elf_Scn *scn, GElf_Addr base)
> {
>   if (base == (GElf_Addr) -1)
>     {
>       error (0, 0, "notes before base section");
>       base = 0;
>     }
>
>   Elf_Data *data = elf_getdata (scn, NULL);
>   size_t next;
>   GElf_Nhdr nhdr;
>   size_t name_off;
>   size_t desc_off;
>   for (size_t offset = 0;
>        (next = gelf_getnote (data, offset, &nhdr, &name_off, &desc_off)) > 0;
>        offset = next)
>     if (nhdr.n_namesz == sizeof _SDT_NOTE_NAME
>         && !memcmp (data->d_buf + name_off,
>                     _SDT_NOTE_NAME, sizeof _SDT_NOTE_NAME))
>       handle_probe (elf, base,
>                     nhdr.n_type, data->d_buf + desc_off, nhdr.n_descsz);
> }
>
> static void
> handle_elf (Elf *elf)
> {
>   size_t shstrndx;
>   if (elf_getshdrstrndx (elf, &shstrndx))
>     {
>       error (0, 0, "elf_getshdrstrndx: %s", elf_errmsg (-1));
>       return;
>     }
>
>   GElf_Addr base = -1;
>
>   Elf_Scn *scn = NULL;
>   while ((scn = elf_nextscn (elf, scn)) != NULL)
>     {
>       GElf_Shdr shdr;
>       if (gelf_getshdr (scn, &shdr) == NULL)
>         {
>           error (0, 0, "gelf_getshdr: %s", elf_errmsg (-1));
>           continue;
>         }
>       switch (shdr.sh_type)
>         {
>         case SHT_NOTE:
>           if (!(shdr.sh_flags & SHF_ALLOC))
>             handle_notes (elf, scn, base);
>           break;
>
>         case SHT_PROGBITS:
>           if (base == (GElf_Addr) -1
>               && (shdr.sh_flags & SHF_ALLOC) && shdr.sh_name != 0)
>             {
>               const char *scn_name = elf_strptr (elf, shstrndx, shdr.sh_name);
>               if (scn_name != NULL && !strcmp (scn_name, ".stapsdt.base"))
>                 base = shdr.sh_addr;
>             }
>           break;
>         }
>     }
> }
>
> static void
> handle_file (const char *file)
> {
>   int fd = open64 (file, O_RDONLY);
>   if (fd < 0)
>     error (0, errno, "%s", file);
>   else
>     {
>       Elf *elf = elf_begin (fd, ELF_C_READ_MMAP_PRIVATE, NULL);
>       if (elf == NULL)
>         error (0, 0, "elf_begin: %s: %s", file, elf_errmsg (-1));
>       else
>         {
>           handle_elf (elf);
>           elf_end (elf);
>         }
>       close (fd);
>     }
> }
>
> int
> main (int argc, char **argv)
> {
>   elf_version (EV_CURRENT);
>
>   for (int argi = 1; argi < argc; ++argi)
>     handle_file (argv[argi]);
>
>   return error_message_count > 0;
> }
>
>

