Bug 11751

Summary: on s390x, externalvar.exp gets libdwfl failures
Product: systemtap Reporter: David Smith <dsmith>
Component: translatorAssignee: Unassigned <systemtap>
Status: RESOLVED FIXED    
Severity: normal CC: fche
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: s390x Target:
Build: Last reconfirmed:

Description David Smith 2010-06-24 18:07:16 UTC
Here's the output of the externalvar.exp testcase on a RHEL5 s390x machine
(running with elfutils-0.137-3.el5):

Host: Linux ibm-z10-03.rhts.eng.bos.redhat.com 2.6.18-194.el5 #1 SMP Tue Mar 16
22:05:06 EDT 2010 s390x s390x s390x GNU/Linux
Snapshot: version 1.2/0.137 non-git sources
GCC: 4.1.2 [gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48)]
Distro: Red Hat Enterprise Linux Server release 5.5 (Tikanga)

Running /root/src/testsuite/systemtap.base/externalvar.exp ...
Executing on host: gcc /root/src/testsuite/systemtap.base/externalvar_lib.c  -g
-O0 -fPIC -shared  -lm   -o /root/s390x/testsuite/libexternalvar.so    (timeout
= 300)
Executing on host: gcc /root/src/testsuite/systemtap.base/externalvar.c  -g -O0
-L/root/s390x/testsuite -lexternalvar -Wl,-rpath,/root/s390x/testsuite  -lm   -o
/root/s390x/testsuite/externalvar    (timeout = 300)
cmd: stap -d /root/s390x/testsuite/libexternalvar.so -d
/root/s390x/testsuite/externalvar -c /root/s390x/testsuite/externalvar
/root/src/testsuite/systemtap.base/externalvar.stp
cmd output: semantic error: libdwfl failure (dwfl_addrmodule): no error:
identifier '$libvar' at /root/src/testsuite/systemtap.base/externalvar.stp:24:27
        source:   printf("libvar = %d\n", $libvar);
                                          ^
semantic error: libdwfl failure (dwfl_addrmodule): no error: identifier '$lib_s'
at :26:29
        source:   printf("lib_s->i = %d\n", $lib_s->i);
                                            ^
semantic error: libdwfl failure (dwfl_addrmodule): no error: identifier '$lib_s'
at :27:29
        source:   printf("lib_s->l = %d\n", $lib_s->l);
                                            ^
semantic error: libdwfl failure (dwfl_addrmodule): no error: identifier '$lib_s'
at :28:29
        source:   printf("lib_s->c = %d\n", $lib_s->c);
                                            ^
semantic error: libdwfl failure (dwfl_addrmodule): no error: identifier '$lib_s'
at :30:7
        source:   if ($lib_s == $lib_s->s1)
                      ^
semantic error: libdwfl failure (dwfl_addrmodule): no error: identifier '$lib_s'
at :38:32
        source:   printf("lib_s->s2 = 0x%x\n", $lib_s->s2);
                                               ^
Pass 2: analysis failed.  Try again with another '--vp 01' option.
FAIL: externalvar
line 1: expected "exevar_c = 42", Got "semantic error: libdwfl failure
(dwfl_addrmodule): no error: identifier '$libvar' at
/root/src/testsuite/systemtap.base/externalvar.stp:24:27"
testcase /root/src/testsuite/systemtap.base/externalvar.exp completed in 1 seconds

(Note that this testcase doesn't pass on RHEL5 x86_64/ppc64 either, but because
systemtap can't find the variables in the shared library during pass 5, not
because of libdwfl errors in pass 2.)
Comment 1 David Smith 2010-06-28 14:02:33 UTC
I've retested this with elfutils-0.141 (the original report was against 0.137)
and I see the same problem.
Comment 2 Mark Wielaard 2010-06-28 14:08:11 UTC
(In reply to comment #1)
> I've retested this with elfutils-0.141 (the original report was against 0.137)
> and I see the same problem.

Could you try against a systemtap that includes:

commit e2d2c7b8ce287c62ca37c9a744341f7f5babd1ca
Author: Mark Wielaard <mjw@redhat.com>
Date:   Thu Jun 24 21:00:56 2010 +0200

    Improve dwflpp::emit_address internal error message.
    
    * dwflpp.cxx (dwflpp::emit_address): Be more explicit about context and
      what failed in case of internal error.

That should give us a bit better error message about what is going wrong here.
Comment 3 David Smith 2010-06-28 18:07:09 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > I've retested this with elfutils-0.141 (the original report was against 0.137)
> > and I see the same problem.
> 
> Could you try against a systemtap that includes:
> 
> commit e2d2c7b8ce287c62ca37c9a744341f7f5babd1ca

Here you go:

Running /root/src/testsuite/systemtap.base/externalvar.exp ...
Executing on host: gcc /root/src/testsuite/systemtap.base/externalvar_lib.c  -g
-O0 -fPIC -shared  -lm   -o /root/s390x/testsuite/libexternalvar.so    (timeout
= 300)
Executing on host: gcc /root/src/testsuite/systemtap.base/externalvar.c  -g -O0
-L/root/s390x/testsuite -lexternalvar -Wl,-rpath,/root/s390x/testsuite  -lm   -o
/root/s390x/testsuite/externalvar    (timeout = 300)
cmd: stap -d /root/s390x/testsuite/libexternalvar.so -d
/root/s390x/testsuite/externalvar -c /root/s390x/testsuite/externalvar
/root/src/testsuite/systemtap.base/externalvar.stp
cmd output: semantic error: emit_address internal error, dwfl_addrmodule failed,
address 0x1a70: identifier '$libvar' at
/root/src/testsuite/systemtap.base/externalvar.stp:24:27
        source:   printf("libvar = %d\n", $libvar);
                                          ^
semantic error: emit_address internal error, dwfl_addrmodule failed, address
0x1a68: identifier '$lib_s' at :26:29
        source:   printf("lib_s->i = %d\n", $lib_s->i);
                                            ^
semantic error: emit_address internal error, dwfl_addrmodule failed, address
0x1a68: identifier '$lib_s' at :27:29
        source:   printf("lib_s->l = %d\n", $lib_s->l);
                                            ^
semantic error: emit_address internal error, dwfl_addrmodule failed, address
0x1a68: identifier '$lib_s' at :28:29
        source:   printf("lib_s->c = %d\n", $lib_s->c);
                                            ^
semantic error: emit_address internal error, dwfl_addrmodule failed, address
0x1a68: identifier '$lib_s' at :30:7
        source:   if ($lib_s == $lib_s->s1)
                      ^
semantic error: emit_address internal error, dwfl_addrmodule failed, address
0x1a68: identifier '$lib_s' at :38:32
        source:   printf("lib_s->s2 = 0x%x\n", $lib_s->s2);
                                               ^
Pass 2: analysis failed.  Try again with another '--vp 01' option.
FAIL: externalvar
line 1: expected "exevar_c = 42", Got "semantic error: emit_address internal
error, dwfl_addrmodule failed, address 0x1a70: identifier '$libvar' at
/root/src/testsuite/systemtap.base/externalvar.stp:24:27"
testcase /root/src/testsuite/systemtap.base/externalvar.exp completed in 0 seconds
Comment 4 Mark Wielaard 2010-06-30 08:54:37 UTC
(In reply to comment #3)
> Here you go:
> 
> Running /root/src/testsuite/systemtap.base/externalvar.exp ...
> Executing on host: gcc /root/src/testsuite/systemtap.base/externalvar_lib.c  -g
> -O0 -fPIC -shared  -lm   -o /root/s390x/testsuite/libexternalvar.so    (timeout
> = 300)
> Executing on host: gcc /root/src/testsuite/systemtap.base/externalvar.c  -g -O0
> -L/root/s390x/testsuite -lexternalvar -Wl,-rpath,/root/s390x/testsuite  -lm   -o
> /root/s390x/testsuite/externalvar    (timeout = 300)
> cmd: stap -d /root/s390x/testsuite/libexternalvar.so -d
> /root/s390x/testsuite/externalvar -c /root/s390x/testsuite/externalvar
> /root/src/testsuite/systemtap.base/externalvar.stp
> cmd output: semantic error: emit_address internal error, dwfl_addrmodule failed,
> address 0x1a70: identifier '$libvar' at
> /root/src/testsuite/systemtap.base/externalvar.stp:24:27
>         source:   printf("libvar = %d\n", $libvar);
>                                           ^

OK, so this is for all the variables in the library. We think we found, libvar
in this case, at address 0x1a70, we try to translate it to a location
expression, so we ask libdwfl about it to determine which dwfl module exactly it
is in. But then dwfl_addrmodule fails for some reason (but without it setting an
error code).

Could you try to build the libexternalvar.so as above (or just comment out the
rm in the exp file) and see waht loc2c-test says about it?

$ ./loc2c-test testsuite/libexternalvar.so 0x1a70

$ ./loc2c-test testsuite/libexternalvar.so 0x1a70 libvar

and what does eu-readelf say about the lib_call function?

$ eu-readelf -s testsuite/libexternalvar.so | grep lib_call
Comment 5 David Smith 2010-06-30 14:01:16 UTC
> OK, so this is for all the variables in the library. We think we found, libvar
> in this case, at address 0x1a70, we try to translate it to a location
> expression, so we ask libdwfl about it to determine which dwfl module exactly it
> is in. But then dwfl_addrmodule fails for some reason (but without it setting an
> error code).
> 
> Could you try to build the libexternalvar.so as above (or just comment out the
> rm in the exp file) and see waht loc2c-test says about it?
> 
> $ ./loc2c-test testsuite/libexternalvar.so 0x1a70
> 
> $ ./loc2c-test testsuite/libexternalvar.so 0x1a70 libvar

After adding '-e', I get:

# ./loc2c-test -e testsuite/libexternalvar.so 0x1a70
./loc2c-test: dwfl_addrdie: address out of range
# ./loc2c-test -e testsuite/libexternalvar.so 0x1a70 libvar
./loc2c-test: dwfl_addrdie: address out of range

> and what does eu-readelf say about the lib_call function?
> 
> $ eu-readelf -s testsuite/libexternalvar.so | grep lib_call

# eu-readelf -s testsuite/libexternalvar.so | grep lib_call
   48: 000000000000062c     18 FUNC    LOCAL  DEFAULT       10 lib_call
Comment 6 Mark Wielaard 2010-06-30 14:08:48 UTC
(In reply to comment #5)
> # ./loc2c-test -e testsuite/libexternalvar.so 0x1a70
> ./loc2c-test: dwfl_addrdie: address out of range
> # ./loc2c-test -e testsuite/libexternalvar.so 0x1a70 libvar
> ./loc2c-test: dwfl_addrdie: address out of range
> 
> > and what does eu-readelf say about the lib_call function?
> > 
> > $ eu-readelf -s testsuite/libexternalvar.so | grep lib_call
> 
> # eu-readelf -s testsuite/libexternalvar.so | grep lib_call
>    48: 000000000000062c     18 FUNC    LOCAL  DEFAULT       10 lib_call

thanks. so we "guessed" the address of the lib_call function wrongly. how odd. I
assume ./loc2c-test -e testsuite/libexternalvar.so 0x62c libvar does spit out
the location information for libvar?

Could you run stap -vvvvv -p4 -e 'probe
process("testsuite/libexternalvar.so").function("lib_call") { printf("libvar =
%d\n", $libvar); }' to see how/what/where we get that 0x1a70 address in the
first place?
Comment 7 David Smith 2010-06-30 14:19:56 UTC
(In reply to comment #6)
> > > $ eu-readelf -s testsuite/libexternalvar.so | grep lib_call
> > 
> > # eu-readelf -s testsuite/libexternalvar.so | grep lib_call
> >    48: 000000000000062c     18 FUNC    LOCAL  DEFAULT       10 lib_call
> 
> thanks. so we "guessed" the address of the lib_call function wrongly. how odd. I
> assume ./loc2c-test -e testsuite/libexternalvar.so 0x62c libvar does spit out
> the location information for libvar?

# ./loc2c-test -e testsuite/libexternalvar.so 0x62c libvar
#define PROBEADDR 0x62cULL
static void print_value(struct pt_regs *regs)
{
  intptr_t value;
  {
    uintptr_t addr;
    { // DWARF expression: 0x3(6768)
      {
        uintptr_t s0;
        s0 = 0x1a70UL;
        addr = s0;
      }
    }
    { int32_t value = deref (4, addr); value = value; }
  }
  /* max expression stack depth 1 */
  printk (" ---> %ld\n", (unsigned long) value);
  return;

 deref_fault:
  printk (" => BAD ACCESS\n");
}
 
> Could you run stap -vvvvv -p4 -e 'probe
> process("testsuite/libexternalvar.so").function("lib_call") { printf("libvar =
> %d\n", $libvar); }' to see how/what/where we get that 0x1a70 address in the
> first place?

# stap -vvvvv -p4 -e 'probe
process("testsuite/libexternalvar.so").function("lib_call") { printf("libvar =
%d\n", $libvar); }'
SystemTap translator/driver (version 1.2/0.141 non-git sources)
Copyright (C) 2005-2010 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
Session arch: s390 release: 2.6.18-194.el5
Created temporary directory "/tmp/stapwJ00Va"
Parsed kernel "/lib/modules/2.6.18-194.el5/build/.config", number of tuples: 744
Parsed kernel "/lib/modules/2.6.18-194.el5/build/Module.symvers", number of
vmlinux exports: 2613
Searched "/usr/local/share/systemtap/tapset/s390/*.stp", found 3
Searched "/usr/local/share/systemtap/tapset/*.stp", found 68
Pass 1: parsed user script and 71 library script(s) using
29372virt/20904res/2628shr kb, in 190usr/10sys/204real ms.
dwarf_builder::build for /root/s390x/testsuite/libexternalvar.so
parsed 'lib_call', func 'lib_call'
pattern '/root/s390x/testsuite/libexternalvar.so' matches module
'/root/s390x/testsuite/libexternalvar.so'
focused on module '/root/s390x/testsuite/libexternalvar.so = [0x10000-0x11a78,
bias 0x0] file /root/s390x/testsuite/libexternalvar.so ELF machine s390| (code 22)
focused on module '/root/s390x/testsuite/libexternalvar.so'
module function cache /root/s390x/testsuite/libexternalvar.so size 2
module function cache /root/s390x/testsuite/libexternalvar.so hit lib_call
selected function lib_call
function cache
/root/s390x/testsuite/libexternalvar.so:/root/src/testsuite/systemtap.base/externalvar_lib.c
size 2
function cache
/root/s390x/testsuite/libexternalvar.so:/root/src/testsuite/systemtap.base/externalvar_lib.c
hit lib_call
selected function lib_call
prologue searching function 'lib_call'
0x62c-0x63e@/root/src/testsuite/systemtap.base/externalvar_lib.c:28
checking line record 0x62c@/root/src/testsuite/systemtap.base/externalvar_lib.c:28
checking line record 0x636@/root/src/testsuite/systemtap.base/externalvar_lib.c:30
prologue found function 'lib_call' = 0x636
probe lib_call@/root/src/testsuite/systemtap.base/externalvar_lib.c:28
process=/root/s390x/testsuite/libexternalvar.so reloc=.dynamic pc=0x636
entry-pc lookup (dwarf_entrypc dieoffset: 0x17c) = 0x62c (rc 0)
finding location for local 'libvar' near address 0x636, module bias 0x10000
get_cfa_ops @0x10636, module_start @0x10000
not found cfa
dwarf_builder releasing user dwflpp /root/s390x/testsuite/libexternalvar.so
Eliding side-effect-free singleton block operator '{' at <input>:1:67
Resolution problem with probe
process("/root/s390x/testsuite/libexternalvar.so").function("lib_call@/root/src/testsuite/systemtap.base/externalvar_lib.c:28")
printf("libvar = %d\\n", $libvar)
semantic error: emit_address internal error, dwfl_addrmodule failed, address
0x1a70: identifier '$libvar' at <input>:1:93
        source: probe
process("testsuite/libexternalvar.so").function("lib_call") { printf("libvar =
%d\n", $libvar); }
                                                                               
                            ^
Pass 2: analyzed script: 1 probe(s), 0 function(s), 0 embed(s), 0 global(s)
using 29912virt/22076res/3304shr kb, in 10usr/0sys/16real ms.
Pass 2: analysis failed.  Try again with another '--vp 01' option.
Running rm -rf /tmp/stapwJ00Va
Spawn waitpid result (0x802): 0
Comment 8 Frank Ch. Eigler 2011-09-23 17:56:37 UTC
Still reproducible with elfutils 0.152, stap 1.6.
Comment 9 David Smith 2013-09-10 16:50:28 UTC
This is no longer reproducible with elfutils-0.152 and HEAD systemtap (post 2.3). Tested on s390x 2.6.18-348.el5.