This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.
Re: request for help debugging a segfault in _dl_relocate_object
- From: James Washer <washer at trlp dot com>
- To: Ryan Arnold <ryan dot arnold at gmail dot com>, libc-help at sourceware dot org
- Date: Fri, 02 May 2008 13:03:39 -0700
- Subject: Re: request for help debugging a segfault in _dl_relocate_object
- References: <1209407182.5184.78.camel@p6.trlp1.com> <ff4da150804291310u66d1a919w6bdf453d24735189@mail.gmail.com> <1209507555.5184.116.camel@p6.trlp1.com>
On Tue, 2008-04-29 at 15:19 -0700, James Washer wrote:
> Here's the stack, and the hexdump of the map_info (I cannot get gdb to
> give it to me symbolically)
> #10 <signal handler called>
> #11 0x00348c0f in _dl_relocate_object () from /lib/ld-linux.so.2
> #12 0x0045667a in dl_open_worker () from /lib/tls/libc.so.6
> #13 0x0034b5ee in _dl_catch_error () from /lib/ld-linux.so.2
> #14 0x00456d48 in _dl_open () from /lib/tls/libc.so.6
> #15 0x00487cb8 in dlopen_doit () from /lib/libdl.so.2
> #16 0x0034b5ee in _dl_catch_error () from /lib/ld-linux.so.2
> #17 0x004882bb in _dlerror_run () from /lib/libdl.so.2
> #18 0x00487d11 in dlopen@@GLIBC_2.1 () from /lib/libdl.so.2
> #19 0x00211395 in j9sl_open_shared_library ()
> from /pmrdata/67180,070,724/core1/opt/netcool/platform/linux2x86/jre_1.5.4/jre/bin/libj9prt23.so
> #20 0x00b25ff2 in classLoaderRegisterLibrary ()
> from /pmrdata/67180,070,724/core1/opt/netcool/platform/linux2x86/jre_1.5.4/jre/bin/libj9vm23.so
> #21 0x00b572f8 in openNativeLibrary ()
> from /pmrdata/67180,070,724/core1/opt/netcool/platform/linux2x86/jre_1.5.4/jre/bin/libj9vm23.so
> #22 0x00b57627 in registerNativeLibrary ()
> from /pmrdata/67180,070,724/core1/opt/netcool/platform/linux2x86/jre_1.5.4/jre/bin/libj9vm23.so
> #23 0x00ef557c in java_lang_ClassLoader_loadLibraryWithPath ()
> from /pmrdata/67180,070,724/core1/opt/netcool/platform/linux2x86/jre_1.5.4/jre/bin/libjclscar_23.so
> #24 0x09479000 in ?? ()
> #25 0x09490f20 in ?? ()
> #26 0x096f3880 in ?? ()
> #27 0x096e3d30 in ?? ()
> #28 0x00000000 in ?? ()
> (gdb) x/40x 0x9627d60
> 0x9627d60: 0x00f31000 0x09620ff0 0x0106d228 0x00000000
> 0x9627d70: 0x09593428 0x09627d60 0x00000000 0x09627fa0
> 0x9627d80: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x9627d90: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x9627da0: 0x0106d2c8 0x00000000 0x00000000 0x00000000
> 0x9627db0: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x9627dc0: 0x0106d2a0 0x0106d2a8 0x00000000 0x00000000
> 0x9627dd0: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x9627de0: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x9627df0: 0x00000000 0x00000000 0x00000000 0x00000000
>
> And just to show that it "looks" reasonable based on the pointer to the
> file name:
> (gdb) x/1s 0x09620ff0
> 0x9620ff0: "/opt/netcool/platform/linux2x86/jre_1.5.4/jre/bin/libawt.so"
> (gdb)
>
>
>
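For readers following along, the raw dump above is easier to reason about once the non-zero words are listed with their offsets. The following is a minimal Python sketch, not part of the original message, that parses the `x/40x` output quoted above (only the first seven rows are copied in; the remaining rows are all zero). The function name `parse_words` is made up for illustration. If the dump really does start at a `struct link_map`, the public fields from <link.h> (l_addr, l_name, l_ld, l_next, l_prev) would occupy the first five words, which is consistent with the second word being the file-name pointer shown with `x/1s`.

```python
import re

# First seven rows of the gdb dump quoted above; later rows are all zero.
DUMP = """\
0x9627d60: 0x00f31000 0x09620ff0 0x0106d228 0x00000000
0x9627d70: 0x09593428 0x09627d60 0x00000000 0x09627fa0
0x9627d80: 0x00000000 0x00000000 0x00000000 0x00000000
0x9627d90: 0x00000000 0x00000000 0x00000000 0x00000000
0x9627da0: 0x0106d2c8 0x00000000 0x00000000 0x00000000
0x9627db0: 0x00000000 0x00000000 0x00000000 0x00000000
0x9627dc0: 0x0106d2a0 0x0106d2a8 0x00000000 0x00000000
"""

BASE = 0x9627D60  # address passed to x/40x in the quoted session


def parse_words(dump, word_size=4):
    """Map each word's address to its value (hypothetical helper)."""
    words = {}
    for line in dump.splitlines():
        m = re.match(r"\s*(0x[0-9a-fA-F]+):\s*(.*)", line)
        if not m:
            continue
        row_base = int(m.group(1), 16)
        for i, tok in enumerate(m.group(2).split()):
            words[row_base + i * word_size] = int(tok, 16)
    return words


words = parse_words(DUMP)
nonzero = {addr: val for addr, val in words.items() if val}

# Print every non-zero word with its offset from the start of the dump.
for addr in sorted(nonzero):
    print(f"{addr:#010x} (+{addr - BASE:#04x}): {nonzero[addr]:#010x}")
```

Only a handful of words are non-zero, and one of them (at offset 0x14) equals the dump's own base address. Which internal field each later word corresponds to depends on the exact glibc build, so mapping offsets to `l_info[]` indexes needs the matching headers or debuginfo.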
> On Tue, 2008-04-29 at 15:10 -0500, Ryan Arnold wrote:
> > On Mon, Apr 28, 2008 at 1:26 PM, James Washer <washer@trlp.com> wrote:
> > > I'm fairly new at digging through libc code and have been trying to
> > > determine the cause of a segfault in _dl_relocate_object.
> > >
> > > On a system running RHEL4U6 we have an application core dump from a
> > > segfault. glibc-2.3.4-2.38 provides the ld-linux.so that I'm looking at.
> > >
> > >
> > >
> > > The program in question has done a dlopen on
> > > "/opt/netcool/platform/linux2x86/jre_1.5.4/jre/bin/libawt.so" with flags
> > > of RTLD_NOW.
> > >
> > > Further up the stack (or down the stack if you prefer) we end up in
> > > _dl_relocate_object and segfault on a null pointer. The pointer in
> > > question came from
> > >
> > >
> > > const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
> > >
> > > See near the bottom of this disassembly the comment "BANG" showing the
> > > instruction that faulted.
> > >
> > > Any pointers (no pun intended) as to why l_info[DT_STRTAB] might be null
> > > would be appreciated.
> >
> > Hi James,
> >
> > A backtrace would be useful so that we can determine which invocation of
> > _dl_relocate_object() segfaulted. I'm interested in where the
> > link_map pointer came from.
> >
> > Have you tried reproducing this using a more recent GLIBC? Glibc 2.3
> > is quite old.
> >
> > Regarding this:
> >
> > const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
> >
> > What is NULL: the pointer `l' or the value coming out of l_info[DT_STRTAB]?
> >
> > Unfortunately x86 asm isn't my strong point but I'll help if I can.
> >
> > Ryan S. Arnold
> >
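As background to the question of what is NULL, here is a simplified Python model, not glibc's actual code, of how a per-object table indexed by dynamic tag ends up with an empty DT_STRTAB slot. It assumes 32-bit little-endian Elf32_Dyn entries (a signed tag followed by a 32-bit value) and uses the standard tag values DT_NULL = 0, DT_STRTAB = 5, DT_SYMTAB = 6. The names `build_info`, `strtab_address`, and `TABLE_SIZE`, and all addresses, are invented for illustration. In glibc itself, as far as I understand it, `l_info[]` holds pointers to the dynamic entries and `D_PTR` reads `d_un.d_ptr` through that pointer, so an empty slot becomes a null-pointer dereference rather than a Python exception.

```python
import struct

# Standard ELF dynamic tag values.
DT_NULL, DT_STRTAB, DT_SYMTAB = 0, 5, 6
TABLE_SIZE = 34  # hypothetical table size, for this model only


def build_info(dynamic_bytes):
    """Index Elf32_Dyn entries by tag, stopping at the DT_NULL terminator."""
    info = [None] * TABLE_SIZE
    for off in range(0, len(dynamic_bytes) - 7, 8):
        d_tag, d_val = struct.unpack_from("<iI", dynamic_bytes, off)
        if d_tag == DT_NULL:
            break
        if 0 <= d_tag < TABLE_SIZE:
            info[d_tag] = d_val
    return info


def strtab_address(info):
    """Model of reading the string table address; fails if the slot is empty."""
    if info[DT_STRTAB] is None:
        raise ValueError("no DT_STRTAB entry was recorded for this object")
    return info[DT_STRTAB]


def dyn(tag, val):
    """Pack one synthetic Elf32_Dyn entry."""
    return struct.pack("<iI", tag, val)


# A well-formed synthetic dynamic section (addresses are made up).
good = dyn(DT_STRTAB, 0x1000) + dyn(DT_SYMTAB, 0x2000) + dyn(DT_NULL, 0)

# The same bytes preceded by a terminator, as if the start had been zeroed.
zeroed = dyn(DT_NULL, 0) + good

print(hex(strtab_address(build_info(good))))
try:
    strtab_address(build_info(zeroed))
except ValueError as exc:
    print("lookup failed:", exc)
```

The model only shows that the slot stays empty whenever the walk never reaches a DT_STRTAB entry. It says nothing about why that would happen for libawt.so in this core dump.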