What exactly does "info sharedlibrary" command show?

Tue Aug 29 20:06:00 GMT 2006

> Date: Tue, 29 Aug 2006 15:27:58 -0400
> From: Daniel Jacobowitz <drow@false.org>
> 
> On Tue, Aug 29, 2006 at 09:14:24PM +0200, Mark Kettenis wrote:
> > > Date: Tue, 29 Aug 2006 08:39:54 -0400
> > > From: Daniel Jacobowitz <drow@false.org>
> > > 
> > > On Tue, Aug 29, 2006 at 08:03:11PM +0800, chen free wrote:
> > > > I believe they are not the physical address, right?
> > > > 
> > > > If they are virtual memory addresses, why are they different from
> > > > the info in /proc/{PID}/maps, where {PID} is the process ID of the
> > > > program?
> > > 
> > > They are the beginning and end of ".text" in those loaded libraries.
> > > 
> > > I've been thinking about changing them to be segment addresses...
> > 
> > What do you mean by this?
> 
> Lots of parts of GDB are section based.  On modern ELF systems, this is
> rarely right.  Instead, things should often be segment-based.  A shared
> library is mapped according to the PT_LOAD headers, and they describe
> what addresses it really occupies - and what value it's relocated by,
> not coincidentally.

OK, that's what I was afraid of ;-).

> Suppose we have this mapping:
> 2aaaaabc3000-2aaaaace4000 r-xp 00000000 fd:00 57048 /lib/libc-2.3.6.so
> 2aaaaace4000-2aaaaade3000 ---p 00121000 fd:00 57048 /lib/libc-2.3.6.so
> 2aaaaade3000-2aaaaadf8000 r--p 00120000 fd:00 57048 /lib/libc-2.3.6.so
> 2aaaaadf8000-2aaaaadfb000 rw-p 00135000 fd:00 57048 /lib/libc-2.3.6.so
> 
> That's the PT_LOAD text segment, followed by some unmapped space
> between segments (for alignment), followed by the PT_LOAD data
> segment (part of which has been marked read only, according to
> PT_GNU_RELRO, readonly after relocation).

Ah, but there can be more than two PT_LOAD segments.  On OpenBSD/i386
we have for example:

  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x81e06 0x81e06 R E 0x1000
  LOAD           0x082000 0x20000000 0x20000000 0x0c3e2 0x0c3e2 R   0x1000
  LOAD           0x08f000 0x2000d000 0x2000d000 0x02c04 0x02c04 RW  0x1000
  LOAD           0x091c04 0x20010c04 0x20010c04 0x00c84 0x00c84 RW  0x1000
  LOAD           0x0928a0 0x200128a0 0x200128a0 0x00000 0x1e660 RW  0x1000
  DYNAMIC        0x091b64 0x2000fb64 0x2000fb64 0x000a0 0x000a0 RW  0x4

Note the big gap between the executable segment and the other
segments.  This is true for all shared libraries on OpenBSD/i386, and
the idea is to map all executable code below a certain virtual address
and all non-executable segments above that address.  That makes it
possible to guarantee that writable pages will never be executable,
something we call W^X, which is similar to PaX (but works instead).
The result is that the overall address ranges of shared libraries will
sort of overlap: the code of one library ends up below the data of
another.
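
As a rough illustration of how large that gap is, here is a minimal
Python sketch that takes the VirtAddr, MemSiz and Flg columns of the
PT_LOAD entries listed above and computes the extent of the whole
library versus the extent of the executable segment alone.  The names
(PT_LOAD, extent) are made up for this example; nothing here is GDB
code.

```python
# PT_LOAD entries copied from the listing above: (VirtAddr, MemSiz, Flg).
PT_LOAD = [
    (0x00000000, 0x81e06, "R E"),
    (0x20000000, 0x0c3e2, "R"),
    (0x2000d000, 0x02c04, "RW"),
    (0x20010c04, 0x00c84, "RW"),
    (0x200128a0, 0x1e660, "RW"),
]


def extent(segments):
    """Return (lowest start, highest end) over the given segments."""
    start = min(vaddr for vaddr, _, _ in segments)
    end = max(vaddr + memsz for vaddr, memsz, _ in segments)
    return start, end


# Segments whose flags include E (executable) versus all the others.
exec_segs = [s for s in PT_LOAD if "E" in s[2]]
other_segs = [s for s in PT_LOAD if "E" not in s[2]]

whole = extent(PT_LOAD)   # everything the library occupies
code = extent(exec_segs)  # executable segment only
gap = extent(other_segs)[0] - code[1]

print("whole library: %#x - %#x" % whole)
print("code only:     %#x - %#x" % code)
print("gap:           %#x bytes" % gap)
```

Before relocation the whole-library extent is roughly half a gigabyte
wide, almost all of it gap, which is why several libraries laid out
this way end up interleaved.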

> It seems to me that "info shared" ought to say this library is mapped
> at 0x2aaaaabc3000 - 0x2aaaaadfb000, or at least 0x2aaaaabc3000
> to 0x2aaaaace4000.  The latter is more portably reliable; some
> platforms separate the segments at relocation time, but SysV ones
> of course do not.

The first approach would cause much confusion, since shared libraries
on OpenBSD/i386 would then seem to overlap.  The latter approach is
better in that sense.
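
To make the two candidate ranges concrete, here is a small Python
sketch that derives both of them from the /proc/{PID}/maps lines
quoted above.  The helper name parse_maps is invented for this
example, and the sketch assumes that every line belongs to the same
library and that the first executable mapping is the text segment.

```python
# The four mapping lines for libc quoted earlier in this message.
MAPS = """\
2aaaaabc3000-2aaaaace4000 r-xp 00000000 fd:00 57048 /lib/libc-2.3.6.so
2aaaaace4000-2aaaaade3000 ---p 00121000 fd:00 57048 /lib/libc-2.3.6.so
2aaaaade3000-2aaaaadf8000 r--p 00120000 fd:00 57048 /lib/libc-2.3.6.so
2aaaaadf8000-2aaaaadfb000 rw-p 00135000 fd:00 57048 /lib/libc-2.3.6.so
"""


def parse_maps(text):
    """Parse maps lines into (start, end, perms, file_offset) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        start, end = (int(x, 16) for x in fields[0].split("-"))
        rows.append((start, end, fields[1], int(fields[2], 16)))
    return rows


rows = parse_maps(MAPS)

# First approach: everything from the first mapping to the last.
whole_range = (min(r[0] for r in rows), max(r[1] for r in rows))

# Second approach: only the first executable mapping (assumed to be text).
text_range = next((s, e) for s, e, perms, _ in rows if "x" in perms)

print("whole: %#x - %#x" % whole_range)
print("text:  %#x - %#x" % text_range)
```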

> Right now we say it occupies 0x00002aaaaabdf2d0 to 0x00002aaaaacc1a10.

I like this though, since I mostly search the list for code addresses.

> It always bugs me that we show the "wrong" offsets, especially when
> I need to do relocation calculations by hand for some reason; I do a
> lot of adding and subtracting the start of .text.

True.  Somehow we should make the load address of a shared library
available.
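
For what it's worth, the hand calculation mentioned above looks like
this as a Python sketch.  It assumes that the library was linked at
base address 0, so that the load address is simply the start of the
mapping with file offset 0 in the example; that is typical for shared
libraries but not guaranteed.  The function name to_link_time is made
up for this example.

```python
# Assumption: the load address is the start of the mapping whose file
# offset is 0 in the /proc/{PID}/maps example quoted in this message.
LOAD_ADDRESS = 0x2aaaaabc3000

# The range that "info shared" currently prints (start and end of .text).
TEXT_START = 0x00002aaaaabdf2d0
TEXT_END = 0x00002aaaaacc1a10


def to_link_time(runtime_addr, load_address=LOAD_ADDRESS):
    """Convert a run-time address to the address seen in the file on disk."""
    return runtime_addr - load_address


print(".text in the file: %#x - %#x"
      % (to_link_time(TEXT_START), to_link_time(TEXT_END)))
```

With the load address shown directly, this subtraction would no
longer have to start from the beginning of .text.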

Mark