This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Re: MIPS sign extension of addresses



Most of the problems I fixed had to do with the fact that BFD takes
the 32 bit unsigned addresses from object and executable files, sign
extends them, and then stores the result as a bfd_vma, which is an
unsigned 64 bit type (unsigned long long).  For example, the unsigned
32 bit address 0x80020004 becomes an unsigned 64 bit bfd_vma/CORE_ADDR
of 0xffffffff80020004.  The bfd_vma type is used to define gdb's
CORE_ADDR types.
For MIPS, any 32 bit address is signed and should be sign extended. BFD and GDB should both mimic this behaviour.

If, for some reason, you encounter a CORE_ADDR that looks like it wasn't sign extended, then failing to sign extend the value will be the most likely source of the bug.
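As a rough illustration of the sign extension described above, here is a minimal sketch in C. The names vma_t and sign_extend_32 are hypothetical stand-ins chosen for this example; they are not BFD or GDB identifiers, and this is not claimed to be how BFD implements it.

```c
#include <stdint.h>

/* Hypothetical stand-in for bfd_vma / CORE_ADDR: an unsigned 64 bit type. */
typedef uint64_t vma_t;

/* Sign extend a 32 bit address into the 64 bit type.  If bit 31 is set,
   bits 32..63 become ones; otherwise they stay zero.  The arithmetic is
   done entirely on unsigned values, so it is well defined in C. */
vma_t
sign_extend_32 (uint32_t addr)
{
  const vma_t sign_bit = (vma_t) 1 << 31;
  return ((vma_t) addr ^ sign_bit) - sign_bit;
}
```

With this sketch, sign_extend_32 (0x80020004) gives 0xffffffff80020004, while an address below 0x80000000 such as 0x00400000 comes back unchanged.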

I suppose this might make more sense to me if I either had more
historical knowledge about why it does this, or if bfd_vma/CORE_ADDR
were signed types.

The trigger for this behavior in BFD is the field called
sign_extend_vma in the elf backend data, which apparently is only set
by elf32-mips.c, elf32-sh64.c, and elfn32-mips.c.  This special
treatment for mips seems to also be why we need the regcache to
support a read_signed_register call that is only used in mips-tdep.c.

There are also some ADDR_BITS_REMOVE macros used in mips-tdep.c that
one might think would be there to strip the high bits but really don't
since that macro invokes mips_mask_address_p() which tests
mask_address_var which defaults to AUTO_BOOLEAN_AUTO which uses
MIPS_DEFAULT_MASK_ADDRESS_P which checks the multiarch value of
default_mask_address_p which seems to be set to zero everywhere.

Is all of this just internal gdb handwaving that can be changed/fixed
or are there external reasons for all this sign extending followed by
selective discarding/ignoring of the extended bits?  Are there files
that will break or hardware that will misbehave if BFD stops doing
this sign extension?
In the ``bad old days'', GDB/BFD tried to pretend that MIPS addresses were not signed and did not bother to sign extend them. As a consequence, many things simply didn't work. For instance, given an o32 executable being run on a 64 bit target, the register would contain 0xffffffff80020004 yet the symbol table would contain 0x0000000080020004. Most of the ``gdb handwaving'' you see came about because, instead of fixing the GDB/BFD design flaw (by sign-extending the address), people kept adding yet another wafer thin fix.... To make matters worse, GDB et al. were also trying to debug 64 bit ABIs (o64) while using 32 bit debug info.

Anyway, internally the addresses are now always sign extended. All external addresses are converted to/from sign-extended values before they get to GDB's core: when displaying, where the address is trimmed back to TARGET_ADDR_BIT bits; on the target side (remote.c), where protocol restrictions may require the address to be trimmed; and when converting between pointers (void*) and addresses (CORE_ADDR).
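The trimming step at those boundaries might look something like the sketch below. The names core_addr_t and trim_to_addr_bits are hypothetical, and the addr_bits parameter merely plays the role of TARGET_ADDR_BIT; this is an illustration of the idea, not GDB's actual code.

```c
#include <stdint.h>

/* Hypothetical stand-in for CORE_ADDR. */
typedef uint64_t core_addr_t;

/* Keep only the low addr_bits bits of an internally sign-extended
   address, e.g. before displaying it or sending it to a target.
   addr_bits plays the role of TARGET_ADDR_BIT and should be 1..64. */
core_addr_t
trim_to_addr_bits (core_addr_t addr, unsigned addr_bits)
{
  if (addr_bits >= 64)
    return addr;                /* a shift by 64 would be undefined */
  return addr & (((core_addr_t) 1 << addr_bits) - 1);
}
```

For a 32 bit target, trim_to_addr_bits (0xffffffff80020004, 32) yields 0x80020004, which is the value the user or the remote protocol expects to see.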

As a consequence, a single GDB executable can, on IRIX 6.5, debug -64, -n32 and -o32 binaries!

After getting a little feedback from some private email exchanges
containing substantially the same info as above, I've modified my
mental picture of this process to think of it as a simple address
translation scheme.  I.E. when running a 32 bit binary in a 64 bit
address space, effectively the 32 bit address space is split in half,
with the lower half (0x00000000-0x7FFFFFFF) mapped to the bottom of
the 64 bit space and the upper half (0x80000000-0xFFFFFFFF) mapped to
the top of the 64 bit space.
Yes.  That is a good way of describing it.
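Seen that way, a 64 bit value is a legitimate 32 bit address only if it falls in one of those two windows. A small sketch of such a check follows; is_sign_extended_32 is a hypothetical helper written for this example, not something found in GDB or BFD.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a 64 bit address is the sign extension of some 32 bit
   address, i.e. it lies in the bottom 2 GiB or the top 2 GiB of the
   64 bit space. */
bool
is_sign_extended_32 (uint64_t addr)
{
  return addr <= UINT64_C (0x000000007fffffff)
         || addr >= UINT64_C (0xffffffff80000000);
}
```

Here 0xffffffff80020004 passes the check, whereas 0x0000000080020004 (the zero-extended form) does not, which is one way to spot a value that missed its sign extension.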

I suppose this makes sense in a toolchain where you want the same gdb
to be able to handle both 64 and 32 bit mips.  I'm fairly unfamiliar
with all the different possible mips configurations, but perhaps it
would be less confusing if there was one that was strictly 32 bit
internally, much like when configuring and building a native
i686-pc-linux-gnu toolchain uses an "unsigned long int" type for
bfd_vma instead of an "unsigned long long int".
Both bfd_vma and CORE_ADDR are at least as large as TARGET_ADDR_BIT. That is, the assertion:
sizeof (CORE_ADDR) * HOST_CHAR_BIT >= TARGET_ADDR_BIT
holds. Even for i386, an assumption such as sizeof (CORE_ADDR) * HOST_CHAR_BIT == 32 is invalid.
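If one wanted the compiler to check that kind of invariant, a C11 static assertion is one option. The sketch below uses hypothetical stand-ins: core_addr_t for CORE_ADDR, EXAMPLE_TARGET_ADDR_BIT for TARGET_ADDR_BIT, and the standard CHAR_BIT in place of HOST_CHAR_BIT. It is not taken from the GDB sources.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>

/* Hypothetical stand-in for CORE_ADDR. */
typedef uint64_t core_addr_t;

/* Hypothetical stand-in for TARGET_ADDR_BIT: a 32 bit target. */
#define EXAMPLE_TARGET_ADDR_BIT 32

/* The address type must be at least as wide as the target's addresses;
   it is allowed to be wider, so nothing may assume an exact width. */
static_assert (sizeof (core_addr_t) * CHAR_BIT >= EXAMPLE_TARGET_ADDR_BIT,
               "address type narrower than the target address width");
```

Note that the check uses >= rather than ==, so a 64 bit address type on a 32 bit target compiles without complaint.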

enjoy,
Andrew


