Bug 12566 - usymbols.exp 64-bit test failing on ppc64
Status: RESOLVED WORKSFORME
Alias: None
Product: systemtap
Classification: Unclassified
Component: runtime
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
Reported: 2011-03-10 19:42 UTC by David Smith
Modified: 2023-12-06 15:48 UTC

Description David Smith 2011-03-10 19:42:47 UTC
After fixing the testcase problem outlined here:

BZ639344 (usymbols.exp fails on ppc and s390x)
<https://bugzilla.redhat.com/show_bug.cgi?id=639344>

usymbols.exp fails with a real problem on ppc64 (2.6.32-120.el6.ppc64 and 2.6.32.12-115.fc12.ppc64).

====
Snapshot: version 1.5 /0.152 commit release-1.4-143-gbd8504c + changes
GCC: 4.4.5 [gcc (GCC) 4.4.5 20110214 (Red Hat 4.4.5-6)]
Distro: Red Hat Enterprise Linux Server release 6.1 Beta (Santiago)

Running /root/src/testsuite/systemtap.context/usymbols.exp ...
FAIL: usymbols -m64
PASS: usymbols -m32

		=== systemtap Summary ===

# of expected passes		1
# of unexpected failures	1
====

The 64-bit test is failing because the script outputs:

====
handler: 0x10010c48 (<unknown>)
handler: lib_handler (/root/ppc64/testsuite/libusymbols-m64.so)
====

Instead of something like:

====
handler: main_handler (/root/ppc64/testsuite/usymbols-m64)
handler: lib_handler (/root/ppc64/testsuite/libusymbols-m64.so)
====

From looking at the test executable, 0x10010c48 is the correct address for main_handler so systemtap should have all the information necessary to do the lookup, but it somehow fails.

Note that this test completely passes on:

2.6.32-71.18.2.el6.x86_64
2.6.32-71.18.2.el6.i686
2.6.32-120.el6.s390x
Comment 1 David Smith 2011-03-11 21:02:13 UTC
Here's some additional information.

In the 32-bit case (which works):

# eu-readelf -s usymbols-m32 | fgrep main_handler
   62: 1000059c      4 FUNC    GLOBAL DEFAULT       12 main_handler
# fgrep main_handler /tmp/stapVlnORz/stap-symbols.h 
  { 0x1000059c, "main_handler" },

The above is good, stap's value of the symbol matches up with eu-readelf's value.

When run under gdb,
(gdb) p &main_handler
$2 = (void (*)(int)) 0x1000059c <main_handler>

That's good, gdb's idea of the symbol value also matches.

# fgrep /usymbols-m32 /proc/9161/maps
10000000-10010000 r-xp 00000000 fd:00 2239275                            /root/ppc64/testsuite/usymbols-m32
10010000-10020000 rw-p 00000000 fd:00 2239275                            /root/ppc64/testsuite/usymbols-m32

That's good, 0x1000059c exists within that first executable vma.

In the 64-bit case (which fails):

# eu-readelf -s usymbols-m64  | fgrep main_handler
   63: 0000000010010c48     16 FUNC    GLOBAL DEFAULT       22 main_handler
# fgrep main_handler /tmp/stapImPRwN/stap-symbols.h 
  { 0x10010c48, "main_handler" },

That's good, the eu-readelf and stap values match.

When run under gdb,
(gdb) p &main_handler
$1 = (void (*)(int)) 0x10000700 <main_handler>

That's bad - when run, somehow the address has changed.

# fgrep /usymbols-m64 /proc/9183/maps 
10000000-10010000 r-xp 00000000 fd:00 2239236                            /root/ppc64/testsuite/usymbols-m64
10010000-10020000 rw-p 00000000 fd:00 2239236                            /root/ppc64/testsuite/usymbols-m64

The 0x10010c48 address exists within that 2nd non-executable vma (the code in vma.c only looks at executable vmas), which is why the symbol lookup code fails.  The gdb address (0x10000700) does exist within the 1st executable vma.

Perhaps there is some relocation going on that systemtap needs to know about.
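
To make the vma check above concrete, here is a minimal sketch in Python. It is not the code in vma.c; parse_maps and find_vma are hypothetical helper names, and the only data used is the two /proc/9183/maps lines quoted above. It only illustrates why a lookup restricted to executable mappings misses 0x10010c48 but finds 0x10000700.

```python
# The two mappings quoted above for usymbols-m64 (from /proc/9183/maps).
MAPS = """\
10000000-10010000 r-xp 00000000 fd:00 2239236 /root/ppc64/testsuite/usymbols-m64
10010000-10020000 rw-p 00000000 fd:00 2239236 /root/ppc64/testsuite/usymbols-m64
"""

def parse_maps(text):
    """Return (start, end, perms, path) tuples from /proc/<pid>/maps text."""
    vmas = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5] if len(fields) > 5 else ""
        vmas.append((start, end, fields[1], path))
    return vmas

def find_vma(vmas, addr, executable_only=True):
    """Return the mapping containing addr, or None if there is no match.

    With executable_only=True, mappings without the 'x' permission are
    ignored, mimicking the behaviour described in this comment.
    """
    for start, end, perms, path in vmas:
        if start <= addr < end and (not executable_only or "x" in perms):
            return (start, end, perms, path)
    return None

vmas = parse_maps(MAPS)
symtab_hit = find_vma(vmas, 0x10010c48)                       # symbol table value
symtab_any = find_vma(vmas, 0x10010c48, executable_only=False)
gdb_hit = find_vma(vmas, 0x10000700)                          # address gdb reports
```

Here symtab_hit is None because the only mapping containing 0x10010c48 is the rw-p one, while gdb_hit is the r-xp mapping.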
Comment 2 Mark Wielaard 2011-03-11 21:29:16 UTC
(In reply to comment #1)
> In the 64-bit case (which fails):
> 
> # eu-readelf -s usymbols-m64  | fgrep main_handler
>    63: 0000000010010c48     16 FUNC    GLOBAL DEFAULT       22 main_handler
> # fgrep main_handler /tmp/stapImPRwN/stap-symbols.h 
>   { 0x10010c48, "main_handler" },
> 
> That's good, the eu-readelf and stap values match.
> 
> When run under gdb,
> (gdb) p &main_handler
> $1 = (void (*)(int)) 0x10000700 <main_handler>
> 
> That's bad - when run, somehow the address has changed.
> 
> # fgrep /usymbols-m64 /proc/9183/maps 
> 10000000-10010000 r-xp 00000000 fd:00 2239236                           
> /root/ppc64/testsuite/usymbols-m64
> 10010000-10020000 rw-p 00000000 fd:00 2239236                           
> /root/ppc64/testsuite/usymbols-m64
> 
> The 0x10010c48 address exists within that 2nd non-executable vma (the code in
> vma.c only looks at executable vmas), which is why the symbol lookup code
> fails.  The gdb address (0x10000700) does exist within the 1st executable vma.
> 
> Perhaps there is some relocation going on that systemtap needs to know about.

The 0x10010c48 address is a pointer into the .opd table, so it is an indirect
address to the real function address. We had a little discussion on the
mailing list about making systemtap deal with that better:
http://sourceware.org/ml/systemtap/2011-q1/threads.html#00082
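
As a rough illustration of that indirection (not systemtap code): in the 64-bit PowerPC ELFv1 ABI, a function symbol's value points at a function descriptor in the .opd section, and the first 8-byte big-endian doubleword of the descriptor is the code entry point. The sketch below builds a synthetic descriptor by hand; the TOC value and the assumption that the section starts exactly at the descriptor are made up for the example, and resolve_descriptor is a hypothetical helper name. Only the two addresses come from this bug.

```python
import struct

SYM_VALUE = 0x10010c48    # main_handler's value in the symbol table (from this bug)
CODE_ENTRY = 0x10000700   # address gdb reports for main_handler (from this bug)
FAKE_TOC = 0x10018000     # hypothetical TOC base, for illustration only

# Synthetic .opd contents holding one descriptor: entry point, TOC base,
# environment pointer, each a big-endian 64-bit doubleword.
OPD_ADDR = SYM_VALUE      # assumption: the section begins at this descriptor
opd_data = struct.pack(">QQQ", CODE_ENTRY, FAKE_TOC, 0)

def resolve_descriptor(section_addr, section_data, sym_value):
    """Return the code entry point stored in the descriptor at sym_value."""
    offset = sym_value - section_addr
    if offset < 0 or offset + 24 > len(section_data):
        raise ValueError("symbol value is not inside the descriptor section")
    entry, _toc, _env = struct.unpack_from(">QQQ", section_data, offset)
    return entry

resolved = resolve_descriptor(OPD_ADDR, opd_data, SYM_VALUE)
```

With this synthetic data, resolved equals CODE_ENTRY, which is the address that falls inside the executable vma.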
Comment 3 William Cohen 2023-12-06 15:48:40 UTC
This is currently working on RHEL8 and RHEL9 for both ppc64le and s390x.