[PATCH] Use mmap for symbol tables

Jim Blandy jimb@red-bean.com
Mon Jan 30 22:11:00 GMT 2006


On 1/30/06, Eirik Fuller <eirik@hackrat.com> wrote:
> > However, doesn't that mean that changes to the data by other processes
> > (say) would become visible to GDB?  What happens when you recompile
> > the program while GDB's running?
>
> In our environment that's very rare (once a release ships, we really
> need the symbol table to match what shipped), enough that I hadn't even
> given it much thought, but I have a couple of offhand reactions.
>
> One, it's roughly the same problem as NFS server outages (yes, the
> details will vary; in our environment NFS outages are far more common
> than symbol table clobberage).
>
> Two, if the build process unlinks the symbol table before commencing the
> link, gdb won't see the changes (at least not in mmap'd data).  Renaming
> the symbol table is better than unlinking, all in all, but unlinking it
> is sufficient to rely on the unlinked open files feature (which, with
> NFS, isn't enough; the conventional .nfs turd file hack is useless when
> a different client unlinks the file).  I just double checked; our
> Makefiles do remove the old symbol table first.

For inclusion in the public GDB sources, I'd want GDB's sensitivity to
the files being changed out from underneath it to be unaffected,
regardless of the details of people's build processes.  Right now, GDB
effectively has its own copy of the symbol data it will use until it
notices that the file has changed and re-reads it.  GDB shouldn't
crash just because someone does '> a.out' on it.  It's better to get
an error from bfd_seek or bfd_bread than a segfault from trying to
access file blocks that aren't there.
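
To make the "own copy" behavior concrete, here is a minimal sketch of
reading a file into a malloc'd buffer.  It is not GDB or BFD code;
slurp_file is a hypothetical helper named only for illustration.  The
point is that once the copy exists, truncating the file cannot
invalidate it, and a file that shrinks during the read shows up as an
ordinary error return rather than a fault.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of GDB or BFD): read the whole file at
   PATH into a malloc'd buffer and store its length in *SIZE_OUT.
   Returns NULL on any error, including the file shrinking mid-read.  */
static char *
slurp_file (const char *path, size_t *size_out)
{
  assert (path != NULL && size_out != NULL);

  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return NULL;

  struct stat st;
  if (fstat (fd, &st) != 0 || st.st_size < 0)
    {
      close (fd);
      return NULL;
    }

  size_t size = (size_t) st.st_size;
  char *buf = malloc (size ? size : 1);   /* avoid malloc (0) */
  if (buf == NULL)
    {
      close (fd);
      return NULL;
    }

  size_t done = 0;
  while (done < size)
    {
      ssize_t n = read (fd, buf + done, size - done);
      if (n < 0 && errno == EINTR)
        continue;                         /* interrupted, retry */
      if (n <= 0)
        {
          /* Read error, or EOF before SIZE bytes: the file changed
             underneath us.  Report failure instead of crashing.  */
          free (buf);
          close (fd);
          return NULL;
        }
      done += (size_t) n;
    }

  close (fd);
  *size_out = size;
  return buf;
}
```

After slurp_file returns, someone doing '> a.out' leaves the buffer
contents intact; only a later re-read would see the empty file.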

Have you explained the disadvantages of simply using MAP_PRIVATE?  If
you have, I missed it.  Does MAP_PRIVATE require the whole file to be
read before the mmap system call can return, like 'read' does?
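
For concreteness, this is roughly the kind of call I have in mind; a
sketch only, with map_private being a hypothetical helper rather than
anything in the patch.  It shows that stores through a MAP_PRIVATE
mapping stay in the process and never reach the file.  Two caveats, as
far as I know: POSIX leaves it unspecified whether later changes to the
file by others are visible through such a mapping, and touching pages
past a truncated end of file still raises a signal.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch under discussion): map the
   whole file at PATH copy-on-write and store its length in *SIZE_OUT.
   Returns NULL on failure or if the file is empty.  */
static char *
map_private (const char *path, size_t *size_out)
{
  assert (path != NULL && size_out != NULL);

  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return NULL;

  struct stat st;
  if (fstat (fd, &st) != 0 || st.st_size <= 0)
    {
      close (fd);
      return NULL;
    }

  size_t size = (size_t) st.st_size;

  /* MAP_PRIVATE with PROT_WRITE only needs a readable descriptor:
     writes go to private copies of the pages, not to the file.  */
  void *p = mmap (NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

  /* The mapping holds its own reference to the file, so the
     descriptor can be closed right away.  */
  close (fd);

  if (p == MAP_FAILED)
    return NULL;

  *size_out = size;
  return p;
}
```

The caller releases the mapping with munmap (p, size) when it is done
with the data.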

> In an earlier message I commented about problems which might occur while
> fleshing out partial symbols, if a symbol table becomes unavailable.  In
> that commentary I assumed that gdb reads from the symbol table when it
> promotes partial symbols to symbols; a quick glance at the code suggests
> I assumed correctly.  Does gdb handle a symbol table which changes out
> from under it when it fleshes out partial symbols?  If not, then the
> mmap patch doesn't make things fundamentally worse; it's just a matter
> of degree.  If gdb does handle changes to a symbol table gracefully,
> then I wonder if the way it does that can somehow be extended to mmap.

At least with DWARF 2, which is the format used by default these days,
GDB doesn't re-read anything at psymtab-to-symtab conversion time.  It
keeps the whole thing in memory.

> I've already mentioned that the wasted address space isn't all that big,
> at least not in the symbol tables I'm accustomed to.  Anyone who is
> crowding the limits of virtual address space will run out soon enough
> whether they use malloc/seek/read or mmap; the best long term answer,
> short of a completely different symbol table format (one which doesn't
> require slurping the entire file to build an index that belongs in the
> file format to begin with), is to buy amd64 processors.  I've already
> bought myself a dual Opteron system for similar reasons, but that has
> more to do with multi-gigabyte core files than big symbol tables.

I'm glad it's not a problem for you, but I'm not sure that's the best
answer for all of GDB's users.

> One concern I have about extra complication to mmap pieces of the file
> is that it could conceivably use more address space rather than less
> address space.  If different parts of gdb use overlapping regions of the
> symbol table

They don't, I'm pretty sure.

> I really should gather some timing information and pass it along.  I'll
> try to do that today.

Great --- I'd love to see some timings.


