Bug 14396 - Missing DW_ATE_UTF support (char16_t, char32_t)
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-24 22:25 UTC by Mark Wielaard
Modified: 2012-08-11 01:04 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Add kernel_string_utf16/32 (1.51 KB, patch)
2012-08-10 00:56 UTC, Josh Stone

Description Mark Wielaard 2012-07-24 22:25:51 UTC
elfutils dwarf.h was missing the new DWARF4 DW_ATE_UTF define.
As a result, systemtap doesn't support this data encoding either.

Example usage:

#include <string.h>

const char *foo = "cow";
const char16_t *bar = u"bear";

int
main ()
{
  if (foo == "bear" && bar == u"cow")
    return 42;

  return 0;
}

$ g++ -g -std=c++0x -o utf utf.cxx

$ stap -e 'probe process.function("main") { log($foo$$); log($bar$$) }' -c ./utf
"cow"
4195852

Would be nice to see the $bar value also decoded.
Comment 1 Josh Stone 2012-07-24 22:39:36 UTC
(In reply to comment #0)
> $ stap -e 'probe process.function("main") { log($foo$$); log($bar$$) }' -c ./utf
> "cow"
> 4195852
> 
> Would be nice to see the $bar value also decoded.

dwarf_pretty_print::print_chars() uses user_string2/kernel_string2 to read strings bytewise.  We could add similar functions for UTF-16 and UTF-32 which convert to UTF-8 to make stap strings.
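
For illustration only, here is a rough userspace sketch of the kind of UTF-16 to UTF-8 conversion described above. It is not the systemtap implementation: the names utf8_encode() and utf16_to_utf8() are hypothetical, and it reads from an ordinary in-memory buffer in host byte order rather than going through kread/uread-style safe accessors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8 into out (up to
 * 4 bytes). Returns the byte count, or 0 if cp cannot be encoded. */
static size_t utf8_encode(uint32_t cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;           /* unpaired surrogate */
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                   /* beyond the Unicode range */
}

/* Hypothetical sketch: convert a NUL-terminated UTF-16 string (host byte
 * order) to NUL-terminated UTF-8. Returns 0 on success, -1 on malformed
 * input or when dst is too small; dst is NUL-terminated either way
 * unless dstlen is 0. */
static int utf16_to_utf8(const uint16_t *src, char *dst, size_t dstlen)
{
    size_t used = 0;

    if (dstlen == 0)
        return -1;

    while (*src) {
        uint32_t cp = *src++;
        char buf[4];
        size_t n, i;

        /* A high surrogate must be followed by a low surrogate. */
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            uint32_t lo = *src;
            if (lo < 0xDC00 || lo > 0xDFFF) {
                dst[used] = '\0';
                return -1;
            }
            src++;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        }

        n = utf8_encode(cp, buf);
        if (n == 0 || used + n >= dstlen) {   /* keep room for the NUL */
            dst[used] = '\0';
            return -1;
        }
        for (i = 0; i < n; i++)
            dst[used++] = buf[i];
    }

    dst[used] = '\0';
    return 0;
}
```

A UTF-32 variant would skip the surrogate-pair step and pass each 32-bit unit straight to the encoder.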
Comment 2 Josh Stone 2012-08-10 00:56:21 UTC
Created attachment 6570
Add kernel_string_utf16/32

Here's a prototype of what those conversion functions might look like for kernel memory.  The user variants would be the same, just s/kread/uread/.

(And now I see that uread() doesn't exist, but it should...)
Comment 3 Josh Stone 2012-08-11 01:04:22 UTC
15ceae2 loc2c-runtime.h: Add uread() and uwrite()
8987b30 PR14396: Add UTF-16/32 conversion functions
6561d8d PR14396: Add pretty-printing support for UTF

$ stap -e 'probe process.function("main") { log($foo$); log($bar$) }' -c ./utf
"cow"
"bear"