Add a function or extend printf to dump a kernel structure in a human-readable format, similar to what gdb can do. Does DWARF give us enough info to do this without requiring the script to specify the structure type?

For example, something like this:

    print_struct($sock)

where $sock is a pointer to a struct socket could output this:

    {
      state = 1,
      flags = 6,
      ops = 0x83337444,
      fasync_list = 0x84435355,
      file = 0x54338234,
      sk = 0x73322556,
      wait = {
        lock = 1,
        task_list = {
          next = 0x822334455,
          prev = 0x855443322
        },
      },
      type = 0
    }
*** Bug 5954 has been marked as a duplicate of this bug. ***
Another possible syntax for this is inspired by the $$vars introduced recently. Expanding struct contents could be represented like so:

    $var$ => a string representation of $var's fields: like 0xfoo or {.foo=0xbeef, .zoo=0xp00}

To control the depth of nesting expansion, we could add extra "$"s at the end:

    $var$$ => {.foo=0xbeef, .bar={.so=0x44, .po=0x848}}

This could compose with the $$ variables too:

    $$vars$ => var1=0xdead var2={.foo=0xbeef, .zoo=0xp00}
(In reply to comment #2)
> Another possible syntax for this is inspired by the $$vars introduced
> recently. Expanding struct contents could be represented like so:

How about a $$$var syntax? It could also compose with $$vars as $$$vars, or $$$parms, and the depth could also be increased by adding another $ ($$$$var).

> $var$ => a string representation of $var's fields: like
> 0xfoo
> or {.foo=0xbeef, .zoo=0xp00}

I like the latter format :-)

Thanks,
(I'm dumping some thoughts as I look at implementing this...)

I feel like we should have some token separation instead of a single token $foo$. My first idea is to list it as a trailing dereference, like so:

    $foo->$
    $foo->$$
    $$parms->$
    $foo->bar[i]->$
    @cast(foo, "foo_t")->$

Treating it as a new dereferencing component makes it clearer to me that it's digging into the structure, and also makes it clearer IMO to connect to arrays and @casts. Otherwise we have to do a token peek on things like $foo[i]$ or @cast(foo, "foo_t")$ to see that they're followed by a dollar sign. Maybe that's not so bad though...

Should we traverse pointers as well? Maybe that is what is meant by "depth of nesting expansion". So, $ would print the entire struct, including any nested structs. Then $$ would expand a single level of pointers beyond that, and so on.

Unions are another open question, especially if pointers are involved. I could see us quickly getting bad derefs by walking down a wrong union branch. My first inclination is to skip over unions and print them as a "{...}" black box. Or, maybe we can print them, but then skip them in pointer traversal as a special case.
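To make the pointer-depth and union ideas above concrete, here is a minimal sketch in Python rather than in the translator itself. Everything in it is hypothetical modelling: structs are plain dicts, and the Ptr and UnionBox classes and the render_ptr_depth name are invented for illustration, not taken from any patch.

```python
from dataclasses import dataclass

@dataclass
class Ptr:
    """A pointer member: its address plus the object it points at (None if unknown)."""
    addr: int
    target: object = None

@dataclass
class UnionBox:
    """A union member; its branches are deliberately never inspected."""
    branches: dict

def render_ptr_depth(value, ptr_depth):
    """Render nested structs in full; follow at most ptr_depth levels of pointers."""
    if isinstance(value, UnionBox):
        return "{...}"                      # black box: never walk a union branch
    if isinstance(value, Ptr):
        if ptr_depth > 0 and value.target is not None:
            return render_ptr_depth(value.target, ptr_depth - 1)
        return hex(value.addr)              # out of depth: print the raw pointer
    if isinstance(value, dict):             # a struct, modelled as name -> value
        inner = ", ".join(
            f".{name}={render_ptr_depth(member, ptr_depth)}"
            for name, member in value.items()
        )
        return "{" + inner + "}"
    return str(value)

wait = {"lock": 1, "u": UnionBox({"a": 1, "b": 2})}
sock = {"state": 1, "sk": Ptr(0x73322556, {"refcnt": 2}), "wait": wait}

print(render_ptr_depth(sock, 0))  # like $sock$: nested structs, raw pointers
print(render_ptr_depth(sock, 1))  # like $sock$$: one level of pointers expanded
```

In this model the union stays a "{...}" box at every depth, which corresponds to the "first inclination" option above; the alternative of printing unions but skipping them during pointer traversal would need a separate flag.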
Recursive structs (linked lists and the like) could also be quite tricky.
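One common way to keep a printer from looping on such structures is to remember which addresses have already been visited. The sketch below shows that idea in Python under stated assumptions: the Node class, its fake addresses, and the "<cycle>" marker are all made up for illustration and are not what the attached patch does.

```python
class Node:
    """A list element with a fake address, standing in for a struct in target memory."""
    def __init__(self, addr, value):
        self.addr = addr
        self.value = value
        self.next = None

def render_list(node, seen=None):
    """Follow ->next pointers, but stop at any address already printed."""
    seen = set() if seen is None else seen
    if node is None:
        return "0x0"
    if node.addr in seen:
        return f"{hex(node.addr)} <cycle>"
    seen.add(node.addr)
    return "{" + f".value={node.value}, .next={render_list(node.next, seen)}" + "}"

a, b = Node(0x1000, 1), Node(0x2000, 2)
a.next, b.next = b, a     # a two-element circular list
print(render_list(a))
```

A fixed depth limit also bounds the output, but only a visited set distinguishes a genuine cycle from a merely long list.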
Created attachment 4729
devel snapshot

I haven't worked on this in a bit, but here's the snapshot of where I stopped. The function dwarf_pretty_print_target_symbol lacks any working implementation, but this patch at least has comments there of what I was trying to do. I /think/ that function can glue the rest together fairly simply, but that's where I stalled out...
*** Bug 6837 has been marked as a duplicate of this bug. ***
commit 5f36109ef05d8399e6369c0487a0a17d40ad3267
Author: Josh Stone <jistone@redhat.com>
Date:   Thu May 27 15:54:01 2010 -0700

    PR3672: Add pretty-printing for compound types

    This adds a new syntax for pretty-printing variables as strings:
      $var$ $var$$ $var->$ $var->$$
      $@cast(...)$ $@cast(...)$$ $@cast(...)->$ $@cast(...)->$$
      $var->foo->$ $var[1]->$ $@cast(...)->foo$ $@cast(...)[2]$

    This is still a work in progress, but I deemed it now useful enough to
    share. See PR3672 for discussion of work remaining.

I'll follow up shortly with status & remaining issues...
I think it's basically in good shape, but here are the things that I know are lacking:

• Determine the size of arrays. I don't know how to read that from DWARF, so for now we just print the first element and "..." the rest. Even when we do know the full size, we will probably only want the first few anyway.
• Truncate huge types with "...". Right now I've tested that structs like the kernel's task_struct and stap's systemtap_session both generate reasonable-looking code for pass-2, but both are way too big to fit in a normal string. Pass-3 will actually reject these for having too many parameters for the stack anyway.
• Using base10 or base16 -- I've chosen to represent everything with %c, %i, %u, or %p, but in other parts of our code we tend to use just %#x. I think that decimal is generally more human-friendly for numbers that aren't pointers, although flag variables are nice in hex. There's no DW_ATE_flag though...
• Hide the dirty laundry of inheritance -- C++ types with virtual functions get members like "_vptr.foo" to resolve the functions, but that's not really useful for users to see. We could automatically skip members with that naming pattern.
• Test test test -- I haven't written any testcases yet; shame on me...

Enhancements TODO:

• Stringify char*/char[] with user_string or kernel_string, as merged here from PR6837.
• Add magic for STL types. For example, std::string currently looks like "{._M_dataplus={._M_p=%p}}", but we could hide the layout and turn it into user_string on that _M_p instead.

Future implementation cleanups:

• Refactor comp_pretty_print -- given that this "component" only ever makes sense at the tail of the target_symbol, it may be overkill to be a component at all. It might be cleaner to just have a pretty_print_depth member instead.
• Symbol/function referent tracking -- I had to directly assign the referents on my generated code, because later parts of the translator never touched it. I think it's just missing the step that walks over new functions to resolve referents.
• Merge various function generators -- tapsets.cxx has a few places now that do almost the same thing to create a new function for variable access. It would be nice to share more code among these.
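As a rough illustration of three of the items above (the %c/%i/%u/%p choice, skipping "_vptr." members, and truncating with "..."), here is a sketch in Python. It is not the code in tapsets.cxx: build_format, the (name, kind) member list, the "pointer" marker, and the max_members cutoff are all assumptions made for the example. Only the DW_ATE_* names come from the DWARF standard.

```python
# Map a member's kind to a printf-style conversion. The DW_ATE_* keys are
# DWARF base-type encodings; "pointer" is a made-up marker for pointer types.
CONVERSIONS = {
    "DW_ATE_signed_char": "%c",
    "DW_ATE_unsigned_char": "%c",
    "DW_ATE_signed": "%i",
    "DW_ATE_unsigned": "%u",
    "pointer": "%p",
}

def build_format(members, max_members=8):
    """members: list of (name, kind), where kind is a key of CONVERSIONS or a
    nested member list for a substructure. Returns the format string."""
    parts = []
    for name, kind in members:
        if name.startswith("_vptr."):        # hide C++ vtable pointers
            continue
        if len(parts) == max_members:        # truncate huge types member-wise
            parts.append("...")
            break
        if isinstance(kind, list):           # nested struct
            parts.append(f".{name}={build_format(kind, max_members)}")
        else:
            parts.append(f".{name}={CONVERSIONS[kind]}")
    return "{" + ", ".join(parts) + "}"

members = [
    ("_vptr.foo", "pointer"),
    ("c", "DW_ATE_signed_char"),
    ("n", "DW_ATE_signed"),
    ("u", "DW_ATE_unsigned"),
    ("p", "pointer"),
    ("inner", [("x", "DW_ATE_signed")]),
]
print(build_format(members))
print(build_format(members, max_members=2))
```

Truncating at member boundaries, rather than at a character count, keeps every remaining conversion intact so the result is still a valid format string.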
Here's some clarification on what is currently implemented, as it was a little confusing and controversial on IRC, and we may want to revise it.

$var$ and $var->$ are identical, and will print the entire flattened structure and all substructures. If $var happens to be a pointer to start with, that pointer is dereferenced for free. If any members happen to be pointers, they will be printed as %p and not traversed.

Using $var$$ or $var->$$ digs deeper, which currently means it will expand one level of pointers/references within the structure. $var$$$ and $var->$$$ will expand yet another level of pointers; continue ad nauseam. A current deficiency is that pointers are blindly attempted, even if NULL or otherwise bad, which will error out the script -- we should probably add try-catch for this.

Some proposed modifications I got from hecklers:

- Forget the $var->$ syntax and just go with $var$.
- Or give them separate meaning, e.g. $var$ on a pointer will just print the pointer value, $var->$ will dereference first.
- Change the "depth" to refer to substructures instead of pointers, and then never follow pointers at all. This might even be bimodal, so $ means no substructures, $$ means fully deep into all substructures, and then don't bother with $$$...
- If we keep the idea of controlled depth, then offer a more compressed form, perhaps $var$10$ instead of $var$$$$$$$$$$$. (The kookiness is apparently contagious.)
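For reference, the suffix forms discussed above can be told apart with two regular expressions. This is a sketch only, in Python, and makes several assumptions: it handles plain $-variables (not @cast), it treats the number in the proposed $var$10$ form as the count of trailing dollar signs, and split_suffix is a hypothetical name, not a function in the parser.

```python
import re

# "$var$10$": the proposed compressed form, tried first.
COMPRESSED = re.compile(r"^(\$.+?)(?:->)?\$(\d+)\$$")
# "$var$$" or "$var->$$": a trailing run of dollar signs.
REPEATED = re.compile(r"^(\$.+?)(?:->)?(\$+)$")

def split_suffix(token):
    """Return (base expression, depth); depth 0 means no pretty-printing."""
    m = COMPRESSED.match(token)
    if m:
        return m.group(1), int(m.group(2))
    m = REPEATED.match(token)
    if m:
        return m.group(1), len(m.group(2))
    return token, 0

for tok in ["$var", "$var$", "$var->$$", "$foo->bar[i]->$", "$var$10$"]:
    print(tok, split_suffix(tok))
```

Because $var$ and $var->$ are described as identical, the optional "->" is simply dropped here; giving the two forms separate meanings would require returning that distinction as well.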
(In reply to comment #9)
> Enhancements TODO:

Another idea is to print enums by name.
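A small sketch of what printing enums by name could look like, in Python. The enum_name helper and the SOCKET_STATE table are illustrative assumptions; in a real implementation the name/value pairs would presumably come from the DW_TAG_enumerator entries of the enum type in the debuginfo.

```python
def enum_name(enumerators, value):
    """Map an integer to its enumerator name, falling back to the number."""
    for name, number in enumerators.items():
        if number == value:
            return name
    return str(value)

# Hand-written table for illustration only.
SOCKET_STATE = {"SS_FREE": 0, "SS_UNCONNECTED": 1, "SS_CONNECTING": 2}

print(enum_name(SOCKET_STATE, 1))
print(enum_name(SOCKET_STATE, 9))
```

Falling back to the raw number keeps the output usable when the stored value matches no enumerator, which can happen with corrupted or bit-flag style values.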
(In reply to comment #10)
> - Change the "depth" to refer to substructures instead of pointers, and then
> never follow pointers at all. This might even be bimodal, so $ means no
> substructures, $$ means fully deep into all substructures, and then don't bother
> with $$$...

I've made this change in commit 7d11d8c9.
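The bimodal behaviour quoted above can be modelled in a few lines of Python. This is a sketch of the described semantics, not the code from the commit: structs are dicts, the Ptr class and render_bimodal name are hypothetical, and the exact output format is assumed from the examples earlier in this bug.

```python
class Ptr(int):
    """A pointer value; printed in hex and never followed."""

def render_bimodal(value, deep):
    """deep=False models '$' (substructures elided as {...});
    deep=True models '$$' (all substructures expanded)."""
    if isinstance(value, dict):              # a struct, modelled as name -> value
        parts = []
        for name, member in value.items():
            if isinstance(member, dict) and not deep:
                parts.append(f".{name}={{...}}")
            else:
                parts.append(f".{name}={render_bimodal(member, deep)}")
        return "{" + ", ".join(parts) + "}"
    if isinstance(value, Ptr):
        return hex(value)
    return str(value)

sock = {
    "state": 1,
    "sk": Ptr(0x73322556),
    "wait": {"lock": 1, "task_list": {"next": Ptr(0x82233445), "prev": Ptr(0x85544332)}},
}
print(render_bimodal(sock, deep=False))
print(render_bimodal(sock, deep=True))
```

Since pointers are never dereferenced in this mode, the NULL or bad pointer failures mentioned in comment #10 cannot occur while printing.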
Basic docs and tests are added in commit 34af38d. We can consider the other discussed enhancements as incremental efforts in the future.