[PATCH/RFC] Parsing auxv entries

Mon Feb 10 16:51:00 GMT 2014

On 02/06/2014 09:33 PM, Mark Kettenis wrote:
>> Date: Thu, 06 Feb 2014 19:03:53 +0000
>> From: Pedro Alves <palves@redhat.com>
>>
>> On 02/04/2014 02:41 PM, Mark Kettenis wrote:
>>> The diff below adds a gdbarch method to parse auxv entries.  The
>>> parsing code currently lives in the target_ops vector, and this poses
>>> a problem.  Because default_auxv_parse() implements parsing of the
>>> non-standard auxv entries used by Linux (where the type is stored in a
>>> 'long' instead of an 'int'), it doesn't work on 64-bit big-endian
>>> targets that do follow the SVR4 layout.  I've worked around this by
>>> overriding to_auxv_parse in inf_ptrace_trace() for native BSD targets,
>>> but that means the core_ops target is still broken.  And as we build
>>> binaries as PIE by default on OpenBSD now, where auxv parsing is
>>> essential to find out the load address of the executable, reading core
>>> dumps on our 64-bit big-endian platforms is pretty much broken right
>>> now.  And overriding to_auxv_parse for the core_ops target is painful.
>>>
>>> I believe gdbarch is the right place for this functionality.  On many
>>> platforms the memory layout of the entries is consistent across the
>>> various sources.  
>>
>>> But there may be some issues with running 32-bit
>>> binaries on 64-bit systems in relation to things like /proc/*/auxv.  I
>>> suppose to_auxv_parse was designed to solve such problems, although I
>>> fail to see how the current implementation would work in the scenario
>>> of running 32-bit binaries on a platform where /proc/*/auxv provides
>>> the auxv entries in 64-bit format.
>>
>> Yeah.  See the Solaris version:
>>
>> #if defined (PR_MODEL_NATIVE) && (PR_MODEL_NATIVE == PR_MODEL_LP64)
>> /* When GDB is built as 64-bit application on Solaris, the auxv data
>>    is presented in 64-bit format.  We need to provide a custom parser
>>    to handle that.  */
>> static int
>> procfs_auxv_parse (struct target_ops *ops, gdb_byte **readptr,
>> 		   gdb_byte *endptr, CORE_ADDR *typep, CORE_ADDR *valp)
>> {
>>
>>>
>>> Thoughts?  OK?
>>
>> Hmm, thoughts then.
>>
>> gdbarch does seem the right place, in principle.  Though
>> things in practice don't look that simple.
>>
>> Consider for example a 32-bit Solaris gdb, connected to a
>> 64-bit gdbserver.  In that case, the auxv data is presented
>> to gdbserver in a 64-bit format, and then in turn that's what
>> gdbserver sends back to GDB in response to TARGET_OBJECT_AUXV.
>> GDB won't be able to figure out which layout of auxv it's
>> looking at, unless perhaps it looks at the auxv block
>> size (ewwwww), or explicitly asks the target/server which
>> variant it sends.
>>
>> Not sure what the real proper fix for this would be.  Several
>> options I see.  There might be more or even better ones.
>>
>>   #1 - Install a solaris-specific gdbarch parse auxv hook
>>     that has gdb ask the target which variant of auxv is handed
>>     to gdb to work with.
>>   #2 - Hide the fact that the auxv data is presented differently
>>     depending on the bitness of the superior, by making the target
>>     do layout translation when returning the TARGET_OBJECT_AUXV
>>     object (like done with TARGET_OBJECT_SIGNAL_INFO on Linux).
>>   #3 - Hide the fact that the auxv data is presented differently
>>     depending on the bitness of the superior, by making the target
>>     always translate the auxv block to a host and target
>>     independent format that the core consumes (xml?).
>>
>> #2 seems tempting; though so does #3, a little.  Dunno, #1 does
>> too, just a little, perhaps not.
>>
>> And PowerPC has a similar issue:
>>
>>  https://sourceware.org/ml/gdb-patches/2009-01/msg00440.html
>>
>> And that shows that we can't move the auxv parsing to
>> gdbarch by default on Linux either.  At least, not if we don't
>> consult the target before the gdbarch hook.  But then, it
>> sounds like 32-bit gdb against 64-bit gdbserver on ppc might
>> be similarly broken in some scenarios.  #3 above starts
>> sounding a little better than #2.
>>
>> So I swing back -- thoughts?  :-)
> 
> Hmm, gdbserver would be aware of whether it is running a 32-bit or
> 64-bit binary, wouldn't it?  

One would hope.

> So having it translate the auxv entries into
> native format should work just fine.

Yes, that's #2 above.  I'm now wondering what the Solaris kernel
puts in the NT_AUXV note in cores generated for 32-bit programs running
on a 64-bit kernel.  One would hope that that would end up with
a 32-bit layout...

GDB's gcore does fill in an NT_AUXV, but it does no layout
translation:

procfs.c:procfs_make_note_section
...
  auxv_len = target_read_alloc (&current_target, TARGET_OBJECT_AUXV,
				NULL, &auxv);
  if (auxv_len > 0)
    {
      note_data = elfcore_write_note (obfd, note_data, note_size,
				      "CORE", NT_AUXV, auxv, auxv_len);
      xfree (auxv);
    }

so with a 64-bit gdb, that'll always end up with 64-bit layout...

If the kernel does 64-bit -> 32-bit translation when generating
cores there (I can't imagine otherwise, but who knows), that too
argues for GDB doing that (i.e., translate at TARGET_OBJECT_AUXV
time) on the Solaris port.
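
To make "translate at TARGET_OBJECT_AUXV time" a bit more concrete,
here is a rough sketch of narrowing a block of 8+8 byte entries into
4+4 byte entries.  This is plain C, not GDB code: read_uint,
write_uint and auxv_narrow_64_to_32 are made-up names, and the
assumption that every type and value of a 32-bit inferior fits in 32
bits is mine, not something the Solaris documentation was checked for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a LEN-byte unsigned integer at P.  */
static uint64_t
read_uint (const unsigned char *p, size_t len, int big_endian)
{
  uint64_t v = 0;
  size_t i;

  assert (len <= 8);
  for (i = 0; i < len; i++)
    v = (v << 8) | p[big_endian ? i : len - 1 - i];
  return v;
}

/* Hypothetical helper: store V as a LEN-byte unsigned integer at P.  */
static void
write_uint (unsigned char *p, size_t len, int big_endian, uint64_t v)
{
  size_t i;

  assert (len <= 8);
  for (i = 0; i < len; i++)
    {
      p[big_endian ? len - 1 - i : i] = (unsigned char) (v & 0xff);
      v >>= 8;
    }
}

/* Narrow an auxv block made of 8-byte type + 8-byte value entries
   into 4-byte type + 4-byte value entries.  Returns the number of
   bytes written to OUT, or -1 if OUT is too small or a field does
   not fit in 32 bits.  Trailing bytes that do not form a whole
   16-byte entry are ignored.  */
static long
auxv_narrow_64_to_32 (const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_len,
                      int big_endian)
{
  size_t n = in_len / 16;
  size_t i;

  if (out_len < n * 8)
    return -1;

  for (i = 0; i < n; i++)
    {
      uint64_t type = read_uint (in + i * 16, 8, big_endian);
      uint64_t val = read_uint (in + i * 16 + 8, 8, big_endian);

      if (type > UINT32_MAX || val > UINT32_MAX)
        return -1;

      write_uint (out + i * 8, 4, big_endian, type);
      write_uint (out + i * 8 + 4, 4, big_endian, val);
    }

  return (long) (n * 8);
}
```

With something along these lines in the target, both the core of GDB
and gcore would only ever see the layout that matches the inferior.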

> Given the constraints, we'll probably have to live with having
> both the target_ops and the gdbarch methods.  I thought about this a
> little bit more yesterday and it seems that we want to move to the
> situation where we try things in the following order:
> 
> 1. The target_ops method if it has been set.
> 
> 2. The gdbarch method if it has been set.
> 
> 3. default_auxv_parse().
> 
> Targets that always see the auxv entries in the "native" format like
> OpenBSD would only set the gdbarch method.  Targets that fetch entries
> from /proc in a format that doesn't necessarily match the bitness of the
> binary that's running (Linux, Solaris) would set the target_ops
> method.  

The Linux/PPC issue is not that /proc's format doesn't match
the bitness of the binary.  It's that the auxv needs to be
parsed _before_ the core of GDB even knows the target's
architecture, exactly while trying to figure out the
architecture of the program, as Ulrich said:

> However, there are some scenarios where determining the wordsize
> actually matters (e.g. because the exec file cannot be determined).
> In those cases, the attempts in ppc_linux_read_description would be
> somewhat futile ...

I.e., e.g., no binary at all.   A bit of chicken and egg.
But, that issue can in principle be hidden from core GDB, by
making auxv parsing done by the target backend not go through
GDB core auxv parsing at all.

> Solaris would probably need to set the gdbarch method as well
> to avoid using default_auxv_parse() which doesn't do the right thing
> for SPARC.
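
For reference, a small sketch of why reading the type as a full word
goes wrong on a 64-bit big-endian target that follows the SVR4 layout
(4-byte int type, 4 bytes of padding, 8-byte value).  The helper names
are made up, and the padding is assumed to be zero here, which nothing
guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a LEN-byte big-endian unsigned integer.  */
static uint64_t
read_be (const unsigned char *p, size_t len)
{
  uint64_t v = 0;
  size_t i;

  assert (len <= 8);
  for (i = 0; i < len; i++)
    v = (v << 8) | p[i];
  return v;
}

/* Read the entry type as a word-sized field, i.e. 8 bytes on a
   64-bit target.  */
static uint64_t
type_read_as_long (const unsigned char *entry)
{
  return read_be (entry, 8);
}

/* Read the entry type as a 4-byte int, as the SVR4 layout has it.  */
static uint64_t
type_read_as_int (const unsigned char *entry)
{
  return read_be (entry, 4);
}
```

For an entry whose first eight bytes are 00 00 00 03 00 00 00 00, the
int read gives 3, while the word-sized read gives 3 << 32, which
matches no known AT_* type.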

> 
> My diff currently has the order of 1 and 2 reversed.  But I think I
> can just switch those around.

I believe so.  I'm trying to figure out whether we actually need
auxv parsing on the target vector at all, and setting up direction
for the future.  Seems very much like we don't, though I'm curious
on the Solaris cores issue.
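
In the meantime, the lookup order proposed above (target_ops first,
then gdbarch, then default_auxv_parse) could look roughly like the
sketch below.  The sketch_* types and functions are stand-ins, not
GDB's real struct target_ops or struct gdbarch, and the two hooks
exist only to show which one gets picked.  The return values follow
the parser convention as I understand it: 1 for an entry read, 0 for
end of data, -1 for a short buffer.

```c
#include <stddef.h>

typedef unsigned long long addr_t;  /* stand-in for CORE_ADDR */
typedef unsigned char byte_t;       /* stand-in for gdb_byte */

typedef int (*auxv_parse_fn) (byte_t **readptr, byte_t *endptr,
                              addr_t *typep, addr_t *valp);

/* Stand-ins for the target vector and the gdbarch.  A NULL hook
   means "not set".  */
struct sketch_target { auxv_parse_fn auxv_parse; };
struct sketch_arch { auxv_parse_fn auxv_parse; };

/* Placeholder for default_auxv_parse: always reports end of data.  */
static int
sketch_default_parse (byte_t **readptr, byte_t *endptr,
                      addr_t *typep, addr_t *valp)
{
  (void) readptr; (void) endptr; (void) typep; (void) valp;
  return 0;
}

/* Example target hook: marks itself by storing 1 in *TYPEP.  */
static int
sketch_target_hook (byte_t **readptr, byte_t *endptr,
                    addr_t *typep, addr_t *valp)
{
  (void) readptr; (void) endptr;
  *typep = 1;
  *valp = 0;
  return 1;
}

/* Example gdbarch hook: marks itself by storing 2 in *TYPEP.  */
static int
sketch_arch_hook (byte_t **readptr, byte_t *endptr,
                  addr_t *typep, addr_t *valp)
{
  (void) readptr; (void) endptr;
  *typep = 2;
  *valp = 0;
  return 1;
}

/* Try the target method, then the gdbarch method, then the
   default parser.  */
static int
sketch_auxv_parse (const struct sketch_target *t,
                   const struct sketch_arch *a,
                   byte_t **readptr, byte_t *endptr,
                   addr_t *typep, addr_t *valp)
{
  if (t != NULL && t->auxv_parse != NULL)
    return t->auxv_parse (readptr, endptr, typep, valp);

  if (a != NULL && a->auxv_parse != NULL)
    return a->auxv_parse (readptr, endptr, typep, valp);

  return sketch_default_parse (readptr, endptr, typep, valp);
}
```

If the target method turns out not to be needed, the first test simply
goes away and the rest stays as is.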

-- 
Pedro Alves