Re: Non-uniform address spaces
Jim Blandy wrote:
The compiler certainly can identify that an array or other data
is shared, to use UPC's terminology. From there, the target code
would need to perform some magic to figure out where the address
actually points.
Certainly, an ABI informs the interpretation of the debugging info.
Do you have specific ideas yet on how to convey this information?
A hook (specified in the gdbarch) would name a target routine
to do the translation. When GDB sees a shared pointer, it
would call this target-dependent translation routine.
What isn't clear to me is where to call the hook. Suggestions
about where to look would be welcome.
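To make that concrete, here is a minimal standalone C sketch. None
of these names exist in GDB; the struct and the shared_to_real hook
just model a gdbarch-style function pointer, and the 16/16/32 bit
layout is invented for illustration:

  #include <stdint.h>
  #include <stdio.h>

  typedef uint64_t CORE_ADDR;

  struct my_gdbarch
  {
    /* Target-dependent translation: shared pointer -> real address,
       reporting which processor/thread owns the storage.  */
    CORE_ADDR (*shared_to_real) (CORE_ADDR shared, int *processor,
                                 int *thread);
  };

  /* Example target implementation: top 16 bits = processor, next
     16 = thread, low 32 = offset (the layout is made up).  */
  static CORE_ADDR
  example_shared_to_real (CORE_ADDR shared, int *processor, int *thread)
  {
    *processor = (int) (shared >> 48);
    *thread = (int) ((shared >> 32) & 0xffff);
    return shared & 0xffffffffu;  /* Offset within that memory.  */
  }

  int
  main (void)
  {
    struct my_gdbarch arch = { example_shared_to_real };
    int cpu, thr;
    CORE_ADDR off = arch.shared_to_real (0x0002000100004000ULL, &cpu, &thr);
    printf ("cpu=%d thread=%d offset=0x%llx\n",
            cpu, thr, (unsigned long long) off);
    return 0;
  }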
There are other places where an address is incremented, such as
in displaying memory contents. I doubt that the code knows
what it is displaying, only that it should display n words starting
at address x in format z. This would probably produce incorrect
results if the data spanned from one processor/thread to another.
(At least as a first approximation, this may well be an acceptable
restriction.)
Certainly code for printing distributed objects will need to
understand how to traverse them properly; I see this as parallel to
the indexing/pointer arithmetic requirements. Hopefully we can design
one interface that serves both purposes nicely.
Perhaps. I haven't looked in this code for a long time, but
my impression is that knowledge about what is being printed
gets discarded pretty early, leaving only a pointer, a count,
and a format.
Symtab code would need a hook which converted the ELF
<section,offset> into a <processor,thread,offset> for shared
objects. Again, that would require target-dependent magic.
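A rough sketch of what that conversion might look like, assuming
the target agent can report a per-thread base for each shared
section; the table, section name, and addresses below are all
invented:

  #include <stdint.h>
  #include <stdio.h>

  typedef uint64_t CORE_ADDR;

  #define MAX_THREADS 4

  /* Per-section load bases, one per thread, as the target agent
     reported them (values are made up).  */
  struct shared_section_map
  {
    const char *section_name;
    CORE_ADDR thread_base[MAX_THREADS];
  };

  static struct shared_section_map upc_shared = {
    ".upc_shared",
    { 0x10000000, 0x20000000, 0x30000000, 0x40000000 }
  };

  /* Convert <section, offset> to the real address in THREAD's copy.  */
  static CORE_ADDR
  section_offset_to_addr (const struct shared_section_map *map,
                          int thread, CORE_ADDR offset)
  {
    return map->thread_base[thread] + offset;
  }

  int
  main (void)
  {
    printf ("thread 2, offset 0x40 -> 0x%llx\n",
            (unsigned long long) section_offset_to_addr (&upc_shared,
                                                         2, 0x40));
    return 0;
  }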
Hmm. GDB's internal representation for debugging information stores
actual addresses, not <section, offset> pairs. After reading the
information, we call objfile_relocate to turn the values read from the
debugging information into real addresses. It seems to me that that
code should be doing this job already.
Perhaps. I'll look at that. How does this work for TLS now?
How does code get loaded in your system? Does a single module get
loaded multiple times?
On a system which has shared memory (not UPC 'shared' but memory
which is accessed by all processors/threads), the code image is
simply loaded. Data areas are duplicated for thread-specific data,
similar to TLS. On multi-processor systems which have independent
memories, a target agent loads the processors with the executable.
In GDB, each objfile represents a specific loading of a library or
executable. The information is riddled with real addresses. If a
single file is loaded N times, you'll need N objfiles, and the
debugging information will be duplicated.
Likely not a real problem. The code image is linear and addresses
don't need to be translated. Addresses in the code are relative to
either global data or thread-specific data. They aren't NUMA
addresses.
In the long run, I think GDB should change to represent debugging
information in a loading-independent way, so that multiple instances
of the same library can share the same data. In a sense, you'd have a
big structure that just holds data parsed from the file, and then a
bunch of little structures saying, "I'm an instance of THAT, loaded at
THIS address."
This would enable multi-process debugging, and might also allow us to
avoid re-reading debugging info for shared libraries every time they
get loaded.
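A toy model of that split, with invented names: debug_file holds
the once-parsed, offset-based data, and load_instance is the small
per-load structure.

  #include <stdint.h>
  #include <stdio.h>

  typedef uint64_t CORE_ADDR;

  /* Parsed once per file, never per load: symbols keep
     section-relative offsets instead of real addresses.  */
  struct debug_file
  {
    const char *filename;
    /* ... symbol tables, types, line tables, all offset-based ...  */
  };

  /* One per loading of the file; N loads share one debug_file.  */
  struct load_instance
  {
    struct debug_file *file;   /* "I'm an instance of THAT ...  */
    CORE_ADDR load_base;       /* ... loaded at THIS address."  */
  };

  static CORE_ADDR
  symbol_addr (const struct load_instance *inst, CORE_ADDR sym_offset)
  {
    return inst->load_base + sym_offset;
  }

  int
  main (void)
  {
    struct debug_file libm = { "libm.so" };
    struct load_instance in_proc1 = { &libm, 0x7f0000000000ULL };
    struct load_instance in_proc2 = { &libm, 0x7f8000000000ULL };
    printf ("same symbol: 0x%llx vs 0x%llx\n",
            (unsigned long long) symbol_addr (&in_proc1, 0x1234),
            (unsigned long long) symbol_addr (&in_proc2, 0x1234));
    return 0;
  }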
This would address my comment above, that GDB converts from a
symbolic form to an address too early.
One problem is that it may not be clear whether one has a
pointer to a linear code space or to a distributed NUMA data space.
It might be reasonable to model the linear code space as a 64-bit
CORE_ADDR with the top half zero, while a NUMA address has non-zero
values in the top half. (I don't know whether there might be aliasing
problems, where zero could be a valid top half for a NUMA address.)
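A sketch of that test, again assuming an invented 16/16/32 layout
for the tag:

  #include <stdint.h>
  #include <stdio.h>

  typedef uint64_t CORE_ADDR;

  /* Top 32 bits zero -> linear code address; anything else carries
     a <processor, thread> tag.  The layout is illustrative only.  */
  static int
  is_linear (CORE_ADDR a)
  {
    return (a >> 32) == 0;
  }

  int
  main (void)
  {
    CORE_ADDR code = 0x0000000000401000ULL;  /* top half zero */
    CORE_ADDR numa = 0x0002000100004000ULL;  /* cpu 2, thread 1 */
    printf ("code: %s\n", is_linear (code) ? "linear" : "NUMA");
    printf ("numa: %s\n", is_linear (numa) ? "linear" : "NUMA");
    /* The aliasing caveat above: if <processor 0, thread 0> is a
       legal tag, its encoding collides with linear addresses, so
       the tag would need a bias or a reserved bit.  */
    return 0;
  }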
I think this isn't going to be a problem, but it's hard to tell. Can
you think of a specific case where we wouldn't be able to tell which
we have?
Only if the <processor,thread> component of a NUMA address can be
zero, so that it looks like a linear address.
I'd be very happy figuring out where to put a hook which allowed me
to translate a NUMA CORE_ADDR into a physical address, setting the
thread appropriately. A bit of a kludge, but probably workable.
CORE_ADDR should be capable of addressing all memory on the system. I
think you'll make a lot of trouble for yourself if you don't follow
that rule.
The NUMA address has to be translated into a physical address somewhere.
Perhaps lower down, in the access routines, is the better place.
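For example, a wrapper at the bottom of the memory-access path could
peel the tag off before calling the raw reader. Everything below is
invented; raw_read_memory stands in for whatever talks to the target
agent:

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  typedef uint64_t CORE_ADDR;

  /* Stand-in for the target's raw reader.  */
  static int
  raw_read_memory (int processor, int thread, CORE_ADDR phys,
                   void *buf, size_t len)
  {
    /* Real code would talk to the target here; just zero-fill.  */
    memset (buf, 0, len);
    (void) processor; (void) thread; (void) phys;
    return 0;
  }

  /* Every caller keeps using full CORE_ADDRs; only this layer
     knows how to split them.  */
  static int
  numa_read_memory (CORE_ADDR addr, void *buf, size_t len)
  {
    int processor = (int) (addr >> 48);
    int thread = (int) ((addr >> 32) & 0xffff);
    CORE_ADDR phys = addr & 0xffffffffu;
    return raw_read_memory (processor, thread, phys, buf, len);
  }

  int
  main (void)
  {
    uint32_t word;
    numa_read_memory (0x0002000100004000ULL, &word, sizeof word);
    printf ("read ok, word=%u\n", (unsigned) word);
    return 0;
  }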
--
Michael Eager eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306 650-325-8077