This is the mail archive of the
mailing list for the GDB project.
Re: [RFC 2/3] Record function descriptor address instead of function address in value
- From: "Ulrich Weigand" <uweigand at de dot ibm dot com>
- To: qiyaoltc at gmail dot com (Yao Qi)
- Cc: gdb-patches at sourceware dot org
- Date: Mon, 17 Oct 2016 17:51:33 +0200 (CEST)
- Subject: Re: [RFC 2/3] Record function descriptor address instead of function address in value
- Authentication-results: sourceware.org; auth=none
Yao Qi wrote:
> GCC vs GDB divergence on the meaning of "incr" brings the confusion.
> We should reduce such divergence as much as we can. However, this
> divergence was added in https://sourceware.org/ml/gdb-patches/2001-11/msg00001.html
> I agree with Jim, but I'd like use function descriptor address in value,
> which makes the whole expression evaluation look more reasonable and
> more close to compiler's behavior.
I agree that this is a problem. However, the reasons Jim mentions in
that long comment are still valid: it is in general *not possible* to
convert an arbitrary function address back to a function descriptor ...
> In this patch, I add a new gdbarch method convert_from_func_addr, which
> converts function address back to function descriptor address or
> function pointer address. It is the reversed operation of
... and therefore this function cannot really be implemented on ppc64
in a fully-general way. In particular, when running stripped binaries
or code that is otherwise without symbols (e.g. JITted code etc.), this
conversion may not be possible.
B.t.w. note that there is already a similar function that attempts this
conversion (ppc-sysv-tdep.c:convert_code_addr_to_desc_addr), but it is
currently only used during inferior function calls, so it is not so
critical if it fails. (With your change, that function may actually
no longer be necessary at that place since values should already point
to function descriptors ...)
Note that this checks for presence of an msymbol at the code address,
while your code checks for presence of a symbol. Maybe the best way
would be to check for either.
> We convert function address to function
> descriptor address when,
> - we create value for a function,
> - we generate ax code for a function,
Since the conversion is problematic in general, I think the best way
would be to keep descriptor addresses wherever possible, and only
convert to code addresses at the last minute when necessary.
Your code already *mostly* does that, with the exception of symbol
addresses. I'm wondering whether we shouldn't push the change one
step further back and already store descriptor addresses in
SYMBOL_VALUE_ADDRESS. Note that this is currently already handled
somewhat inconsistenly on ppc64: msymbols already point to the
descriptor, and so do symbols read from stabs debug info; only
symbols read from DWARF debug info actually point to the code
> I don't change the meaning of return value of value_as_address, because
> it is widely used, so value_as_address still return the function
> address if type is TYPE_CODE_FUNC or TYPE_CODE_METHOD.
I'm wondering whether this is a good idea; maybe it would be better to
return descriptor addresses and update those callers that actually need
code addresses. This would keep the principle that we should keep
descriptor addresses as long as possible. Also, this might actually
fix more bugs; if you look at e.g. this code in value_equal:
else if (code1 == TYPE_CODE_PTR && is_int2)
return value_as_address (arg1) == (CORE_ADDR) value_as_long (arg2);
else if (code2 == TYPE_CODE_PTR && is_int1)
return (CORE_ADDR) value_as_long (arg1) == value_as_address (arg2);
this might actually cause invalid evaluation of expressions like
incr == 0x12345678
(as compared to the native C evaluation).
> This patch brings several user visible changes, which look more
> accurate, shown by this table below,
> COMMAND BEFORE AFTER
> p main main function address main function descriptor
> disass main disassembly function main not changed
> disass main+4,+4 disassembly 4 bytes from disassembly 4 bytes from
> function main address + 4 main's function descriptor + 4
> x/i main show one instruction on show one instruction on main's
> function main address function descriptor
> Although the latter looks inconvenient, that is consistent to the
> meaning on C language level. Due to these changes, test cases are
> adjusted accordingly.
Those are a bit annoying. I think the underlying problem is that operations
like "disass" or "x/i" really don't work on "values" in the original sense
but rather on "PC addresses". Maybe it would make sense to have those
function enable a special "mode" in the expression evaluator that would
change the conversion of functions to function pointers to use the code
address instead of the descriptor address?
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain