Bug 28294 - dwarf_aggregate_size fails on some array types
Status: RESOLVED FIXED
Alias: None
Product: elfutils
Classification: Unclassified
Component: libdw
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Mark Wielaard
 
Reported: 2021-08-31 11:51 UTC by Eli Boling
Modified: 2021-10-18 11:43 UTC

Last reconfirmed: 2021-09-11 00:00:00


Attachments
Use type of subrange (if any) to determine signedness of upper/lower values (759 bytes, patch)
2021-09-11 23:39 UTC, Mark Wielaard

Description Eli Boling 2021-08-31 11:51:02 UTC
In dwarf_aggregate_size.c, the helper function array_size unconditionally uses dwarf_formsdata to obtain the value of the DW_AT_upper_bound attribute for array types.  In many cases, this will return a negative value for C arrays that have positive upper bounds, causing the function to return a failure value, which propagates up through dwarf_aggregate_size.

This is an example type (via readelf -w):
 <1><90e>: Abbrev Number: 37 (DW_TAG_array_type)
    <90f>   DW_AT_type        : <0x118>
 <2><913>: Abbrev Number: 11 (DW_TAG_subrange_type)
    <914>   DW_AT_type        : <0x2c>
    <918>   DW_AT_upper_bound : 249

And the same type, via eu-readelf --debug-dump=info:
 [   90e]    array_type           abbrev: 37
             type                 (ref4) [   118]
 [   913]      subrange_type        abbrev: 11
               type                 (ref4) [    2c]
               upper_bound          (data1) 249

If dwarf_aggregate_size is called on this type, it reads the upper_bound attribute as -7 (249 interpreted as a signed 8-bit value) and fails.  Array sizes whose encoded upper bound does not have its top bit set are unaffected.
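
To make the arithmetic concrete, here is a minimal sketch of the two readings of the data1 byte 249 and of what each does to the element count. The helper names are hypothetical and are not libdw functions, and the "upper below lower means failure" check is an assumption about how array_size rejects the bound.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not libdw functions. */

/* Read one DW_FORM_data1 byte as a signed value (sign-extend bit 7). */
static int64_t data1_as_signed(uint8_t raw)
{
  return raw >= 0x80 ? (int64_t) raw - 256 : (int64_t) raw;
}

/* Read the same byte as an unsigned value. */
static int64_t data1_as_unsigned(uint8_t raw)
{
  return (int64_t) raw;
}

/* Element count for one dimension.  -1 stands in for the failure value
   produced when the upper bound is below the lower bound (assumed check). */
static int64_t element_count(int64_t lower, int64_t upper)
{
  return upper < lower ? -1 : upper - lower + 1;
}
```

With a lower bound of 0, the signed reading gives element_count(0, -7), which is rejected, while the unsigned reading gives element_count(0, 249), which is the 250 elements of buff.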

Looking around a bit, the closest discussion I could find on the topic was this one about signed vs unsigned interpretation of array bounds back in 2005:
http://www.dwarfstd.org/ShowIssue.php?issue=020702.1

I exchanged emails with Mark Wielaard on this, and he indicated that this did appear to be a bug, but he wasn't sure yet where the correct fix would be.

I've tried this with a RISCV compiler (version 8.3.0), an ARM compiler (version 7.3.1) and an x86 gcc (version 7.5.0).  I've not tried it with later versions.

Here's the sample code I compiled to get the output above.  The output in the report is from the ARM compiler.

#include <string.h>
int foofunc(int v, char *s) {
  char buff[250];
  strcpy(buff, s);
  return buff[v];
}

int main() {
  return foofunc(4, "fdjkfd");
}
Comment 1 Mark Wielaard 2021-09-11 23:39:12 UTC
Created attachment 13662
Use type of subrange (if any) to determine signedness of upper/lower values

Check whether the subrange has an associated type; if it does, check that type to determine whether the upper and lower values need to be interpreted as signed or unsigned values. We default to signed because that is what run-aggregate-size.sh testfile-size4 expects (but it is a handwritten testcase; we can flip the default if that makes more sense).
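
A rough sketch of the idea, not the attached patch itself: pick the interpretation of a bound from the base type encoding of the subrange's type, and fall back to signed when there is none. The enum names, the ATE_NONE placeholder and read_bound are hypothetical; the numeric values mirror the DW_ATE_signed, DW_ATE_signed_char, DW_ATE_unsigned and DW_ATE_unsigned_char codes from the DWARF specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the DWARF base type encodings (DW_ATE_*).
   ATE_NONE is a hypothetical marker for "subrange has no usable type". */
enum
{
  ATE_NONE = 0,
  ATE_SIGNED = 0x5,
  ATE_SIGNED_CHAR = 0x6,
  ATE_UNSIGNED = 0x7,
  ATE_UNSIGNED_CHAR = 0x8
};

/* Only explicitly unsigned encodings switch to the unsigned reading;
   everything else keeps the signed default described above. */
static bool bound_is_signed(int encoding)
{
  switch (encoding)
    {
    case ATE_UNSIGNED:
    case ATE_UNSIGNED_CHAR:
      return false;
    default:
      return true;
    }
}

/* Interpret a raw bound stored in width_bytes (1, 2, 4 or 8) bytes.
   Assumes raw has no bits set above the given width. */
static int64_t read_bound(uint64_t raw, unsigned width_bytes, int encoding)
{
  if (!bound_is_signed(encoding) || width_bytes == 0 || width_bytes >= 8)
    return (int64_t) raw;

  uint64_t sign_bit = UINT64_C(1) << (width_bytes * 8 - 1);
  if (raw & sign_bit)
    return (int64_t) raw - (int64_t) (sign_bit << 1);
  return (int64_t) raw;
}
```

Under this sketch, read_bound(249, 1, ATE_UNSIGNED) yields 249, while a subrange with no type or a signed type still yields -7, so whether the reported array is fixed depends on the encoding of the type at [2c].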
Comment 2 Mark Wielaard 2021-09-11 23:43:13 UTC
(In reply to Eli Boling from comment #0)
> In dwarf_aggregate_size.c, the helper function array_size unconditionally
> uses dwarf_formsdata to obtain the value of the DW_AT_upper_bound attribute
> for array types.  In many cases, this will return a negative value for C
> arrays that have positive upper bounds, causing the function to return a
> failure value, which propagates up through dwarf_aggregate_size.
> 
> This is an example type (via readelf -w):
>  <1><90e>: Abbrev Number: 37 (DW_TAG_array_type)
>     <90f>   DW_AT_type        : <0x118>
>  <2><913>: Abbrev Number: 11 (DW_TAG_subrange_type)
>     <914>   DW_AT_type        : <0x2c>
>     <918>   DW_AT_upper_bound : 249
> 
> And the same type, via eu-readelf --debug-dump=info:
>  [   90e]    array_type           abbrev: 37
>              type                 (ref4) [   118]
>  [   913]      subrange_type        abbrev: 11
>                type                 (ref4) [    2c]
>                upper_bound          (data1) 249
> 
> If dwarf_aggregate_size is called on this type, when it gets the upper_bound
> attribute, it will get a value of -7, and fail.  For other array sizes, this
> will work.
> 
> Looking around a bit, the closest discussion I could find on the topic was
> this one about signed vs unsigned interpretation of array bounds back in
> 2005:
> http://www.dwarfstd.org/ShowIssue.php?issue=020702.1
> 
> I exchanged emails with Mark Wielaard on this, and he indicated that this
> did appear to be a bug, but he wasn't sure yet where the correct fix would
> be.

Could you try the attached patch?
I don't know if it works; it depends on the type the subrange_type refers to, at [2c].
If it doesn't work, could you post the full debug-dump or attach a test binary?
Comment 3 Mark Wielaard 2021-10-06 20:42:20 UTC
Patch posted:
https://sourceware.org/pipermail/elfutils-devel/2021q4/004248.html
Comment 4 Mark Wielaard 2021-10-18 11:39:18 UTC
Pushed as:

commit c3a6a9dfc6ed0c24ab2d11b2d71f425b479575c9
Author: Mark Wielaard <mark@klomp.org>
Date:   Wed Oct 6 22:41:29 2021 +0200

    libdw: Use signedness of subrange type to determine array bounds
    
    When calculating the array size check if the subrange has an associated
    type, if it does then check the type to determine whether the upper
    and lower values need to be interpreted as signed or unsigned
    values. We default to signed because that is what the testcase
    run-aggregate-size.sh testfile-size4 expects (this is a handwritten
    testcase, we could have chosen a different default).
    
    https://sourceware.org/bugzilla/show_bug.cgi?id=28294
    
    Signed-off-by: Mark Wielaard <mark@klomp.org>

Please reopen or file a new bug if this didn't fully resolve your issue.
Comment 5 Mark Wielaard 2021-10-18 11:43:07 UTC
Patch pushed