This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Remove gdb_assert in dw2_find_pc_sect_compunit_symtab.


I've looked into the problem more deeply and it appears that the linker
(lld in our case) produces a corrupt (at least inconsistent) gdb_index. I
think gdb should not be expected to do a good job under the circumstances,
(also removing the assert does not produce a flawless backtrace anyway), so
I will not pursue this patch. Thanks for the review.

On Thu, Dec 19, 2019 at 8:21 PM Ali Tamur <tamur@google.com> wrote:

> Hi Tom,
>
> Thank you for your reply. Sure, I don't want to cover gdb bugs and I want
> to get to the bottom of this.
> In dw2_find_pc_sect_compunit_symtab, the line:
>
>   data = (struct dwarf2_per_cu_data *) addrmap_find    (objfile->partial_symtabs->psymtabs_addrmap, pc - baseaddr);assigns a wrong cu, so recursively_find_pc_sect_compunit_symtab cannot find the address and fails.
>
> The root cause is that either create_addrmap_from_index does a poor job constructing the index,or the .gdb_index section of the binary is hopelessly corrupted, and I don't know which. So, maybeyou can help.
> The Address area offset part of the index looks suspicious to me:1. There are pairs like [0x100, 0x101) which are empty.My teammates say this might be caused by unused & removed functions, but we are not sure.
> 2. There are address pairs like [0x100, 0x200) that are assigned to multiple cu's.Teammates say this might be caused by inlined functions, or icf=safe etc. build options.
> 3. There are address pairs like [0x100, 0x200) ==> cu_A, where I cannot find such an addressrange in the related  DW_TAG_compile_unit. But I need to verify this more carefully.
> 3. There are overlapping address pairs like [0x100, 0x400) ==> cu_A, [0x200, 0x300) ==>cu_B.Teammates have a harder time explaining this.
> I look at the way how addrmap is constructed; there is this splay tree underneath, and it looks likegdb does not assume overlapping address pairs. My current hypothesis is that, a sequence like:[0x300, 0x400)  ==> cu_A
> [0x100, 0x500)  ==> cu_A
> [0x200, 0x600)  ==> cu_B
> results in something like:0x100(A) -- 0x200(B) -- 0x600(end) so 0x301 is assigned to cu_B instead of cu_A.
> I am sorry that I cannot reduce the problem case and cannot share the binary. (I cannot even compileand reproduce the binary, so I am somewhat in the dark too). So, what do you think? Does the gdb_indexlook well formed to you? If not, is it reasonable for gdb to still try its best under the circumstances? If gdb isto handle this, it seems like addrmap implementation needs a non-trivial upgrade so that every interval canmap to multiple cu's. (Also, splay tree seems like not the most efficient data structure for the problem, sinceso much code looks for predecessor and successor nodes, but that may be my ignorance). I can invest sometime to write a new implementation, but I don't even know whether it is a good idea to support multiplemappings in the first place?
> Thank you.Ali
>
>
> On Thu, Dec 19, 2019 at 10:10 AM Tom Tromey <tom@tromey.com> wrote:
>
>> >>>>> "Ali" == Ali Tamur via gdb-patches <gdb-patches@sourceware.org>
>> writes:
>>
>> Ali> All callers (such as find_pc_sect_compunit_symtab in symtab.c) check
>> Ali> the return value and take meaningful action if it is null.
>>
>> Ali> The problem with the gdb_assert is that, although the function
>> should never
>> Ali> return null with well formed debug info, it does when for example
>> .gdb_index
>> Ali> section of the object file is faulty. I am debugging a rare and not
>> easy to
>> Ali> reproduce crash in our code base, and this assertion prevents the
>> user seeing
>> Ali> the full stack trace. When this assertion is removed, gdb falls back
>> to not
>> Ali> using gdb_index information and prints a useful stack trace. Apart
>> from the
>> Ali> separate issue of finding out why gdb_index is corrupted, I am
>> guessing that
>> Ali> gdb should be resilient when presented with bad debug info.
>>
>> I think the idea of this assert is that
>> recursively_find_pc_sect_compunit_symtab
>> "shouldn't" be able to fail.
>>
>> dw2_find_pc_sect_compunit_symtab first checks the addrmap, and if the
>> address
>> isn't found there, it returns early.
>>
>> But, if the address is in the addrmpa, then it should be covered by some
>> symtab.
>> If this doesn't happen, it means there is a disagreement between the
>> psymtab
>> reader and the symtab reader -- in other words, a bug elsewhere in
>> dwarf2read.c.
>>
>> So, I think it would be better to track down this other bug.
>>
>> What do you think of that?
>>
>> thanks,
>> Tom
>>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]