Bug 22546 - dwarf_aggregate_size() doesn't work for multi-dimensional arrays
Summary: dwarf_aggregate_size() doesn't work for multi-dimensional arrays
Status: RESOLVED FIXED
Alias: None
Product: elfutils
Classification: Unclassified
Component: libdw (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-04 23:05 UTC by dima kogan
Modified: 2017-12-12 09:52 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Prototype fix (642 bytes, patch)
2017-12-04 23:06 UTC, dima kogan
Details | Diff
Demo program (1.19 KB, text/x-csrc)
2017-12-04 23:18 UTC, dima kogan
Details
Update to the test suite to show this problem (3.13 KB, patch)
2017-12-08 09:47 UTC, dima kogan
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description dima kogan 2017-12-04 23:05:39 UTC
Hi. I'm observing that dwarf_aggregate_size() returns bogus results when looking at double-dimensional arrays. For instance, looking at

  double dd[3][5];

It says the aggregate is 64-bytes long instead of 120. The bug is that it ends up computing (3+5)*8 instead of 3*5*8.

I'm attaching a simple test case. It loads the current process's debug information, and prints out the size of dd, defined as above. It shows the failure:

  dima@fatty:/tmp$ gcc -g -ldw -lelf dwarf_aggregate_size_bug.c -o dwarf_aggregate_size_bug

  dima@fatty:/tmp$ ./dwarf_aggregate_size_bug                                              
  Found DIE for 'dd'. size: 64

I'm using libelf1 0.168-1 on Debian/sid, although at least version 0.170 has the bug too.

I'm attaching a proof-of-concept patch to solve this. Note that this patch isn't final: previously we would compute a separate stride for each dimension, while the patch only computes the stride once. I don't know what case specifically that is meant to handle. Tests pass before and after.

If for some reason dwarf_aggregate_size() isn't meant to be used this way, then it should still be changed to fail in a more obvious way. Answering 64 here is NEVER the right answer.
Comment 1 dima kogan 2017-12-04 23:06:38 UTC
Created attachment 10662 [details]
Prototype fix
Comment 2 dima kogan 2017-12-04 23:18:50 UTC
Created attachment 10663 [details]
Demo program
Comment 3 dima kogan 2017-12-08 09:47:17 UTC
Created attachment 10672 [details]
Update to the test suite to show this problem
Comment 4 dima kogan 2017-12-08 09:49:02 UTC
Here's a patch to add the failing case to the test suite. This test update fails in the stock sources, but succeeds with my patch applied. Note that this patch contains a diff to a binary file (that's how the test suite works), and this binary piece will be recognized by 'git am' only.
Comment 5 Mark Wielaard 2017-12-11 16:31:06 UTC
Thanks I think you are right. I am looking at how to correctly handle the stride. I believe this only happens with Fortran code.

While reviewing the code I noticed something else that looks wrong.
Independent from what you discovered.

We do the following:

dwarf_aggregate_size (Dwarf_Die *die, Dwarf_Word *size)
{
  Dwarf_Die type_mem;

  if (INTUSE (dwarf_peel_type) (die, die) != 0)
    return -1;

  return aggregate_size (die, size, &type_mem);
}

The caller probably doesn't really care, but it isn't really nice to change the die the caller gave us. So I propose to change it like so:

int
dwarf_aggregate_size (Dwarf_Die *die, Dwarf_Word *size)
{
  Dwarf_Die die_mem, type_mem;

  if (INTUSE (dwarf_peel_type) (die, &die_mem) != 0)
    return -1;

  return aggregate_size (&die_mem, size, &type_mem);
}

Just wanted to note that before I forgot investigating the real issue.
Comment 6 Mark Wielaard 2017-12-11 22:54:55 UTC
I believe your fix is correct. And we really do want to calculate the stride only once based on the array element type, not for each dimension. I am not sure what the original code tried to do. It probably just was never tested against multi-dimensional arrays. So the testcase addition is great too.

Would you mind sending a complete patch plus the testcase addition to the mailinglist as describe in the CONTRIBUTING document?
https://sourceware.org/git/?p=elfutils.git;a=blob_plain;f=CONTRIBUTING;hb=HEAD
Comment 7 dima kogan 2017-12-12 08:10:37 UTC
Thanks for looking at it. I submitted the patch.
Comment 8 Mark Wielaard 2017-12-12 09:52:49 UTC
Fix has been pushed to master:
https://sourceware.org/ml/elfutils-devel/2017-q4/msg00103.html

A patch for the separate issue mentioned in comment #5 has been submitted:
https://sourceware.org/ml/elfutils-devel/2017-q4/msg00106.html