22546 – dwarf_aggregate_size() doesn't work for multi-dimensional arrays

Bug 22546 - dwarf_aggregate_size() doesn't work for multi-dimensional arrays

Summary: dwarf_aggregate_size() doesn't work for multi-dimensional arrays

Status:	RESOLVED FIXED

Alias:	None

Product:	elfutils
Classification:	Unclassified
Component:	libdw (show other bugs)
Version:	unspecified

Importance:	P2 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2017-12-04 23:05 UTC by dima kogan
Modified:	2017-12-12 09:52 UTC (History)
CC List:	2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:

Attachments
Prototype fix (642 bytes, patch) 2017-12-04 23:06 UTC, dima kogan	Details \| Diff
Demo program (1.19 KB, text/x-csrc) 2017-12-04 23:18 UTC, dima kogan	Details
Update to the test suite to show this problem (3.13 KB, patch) 2017-12-08 09:47 UTC, dima kogan	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description dima kogan 2017-12-04 23:05:39 UTC

Hi. I'm observing that dwarf_aggregate_size() returns bogus results when looking at double-dimensional arrays. For instance, looking at

  double dd[3][5];

It says the aggregate is 64-bytes long instead of 120. The bug is that it ends up computing (3+5)*8 instead of 3*5*8.

I'm attaching a simple test case. It loads the current process's debug information, and prints out the size of dd, defined as above. It shows the failure:

  dima@fatty:/tmp$ gcc -g -ldw -lelf dwarf_aggregate_size_bug.c -o dwarf_aggregate_size_bug

  dima@fatty:/tmp$ ./dwarf_aggregate_size_bug                                              
  Found DIE for 'dd'. size: 64

I'm using libelf1 0.168-1 on Debian/sid, although at least version 0.170 has the bug too.

I'm attaching a proof-of-concept patch to solve this. Note that this patch isn't final: previously we would compute a separate stride for each dimension, while the patch only computes the stride once. I don't know what case specifically that is meant to handle. Tests pass before and after.

If for some reason dwarf_aggregate_size() isn't meant to be used this way, then it should still be changed to fail in a more obvious way. Answering 64 here is NEVER the right answer.

Comment 1 dima kogan 2017-12-04 23:06:38 UTC

Created attachment 10662 [details]
Prototype fix

Comment 2 dima kogan 2017-12-04 23:18:50 UTC

Created attachment 10663 [details]
Demo program

Comment 3 dima kogan 2017-12-08 09:47:17 UTC

Created attachment 10672 [details]
Update to the test suite to show this problem

Comment 4 dima kogan 2017-12-08 09:49:02 UTC

Here's a patch to add the failing case to the test suite. This test update fails in the stock sources, but succeeds with my patch applied. Note that this patch contains a diff to a binary file (that's how the test suite works), and this binary piece will be recognized by 'git am' only.

Comment 5 Mark Wielaard 2017-12-11 16:31:06 UTC

Thanks I think you are right. I am looking at how to correctly handle the stride. I believe this only happens with Fortran code.

While reviewing the code I noticed something else that looks wrong.
Independent from what you discovered.

We do the following:

dwarf_aggregate_size (Dwarf_Die *die, Dwarf_Word *size)
{
  Dwarf_Die type_mem;

  if (INTUSE (dwarf_peel_type) (die, die) != 0)
    return -1;

  return aggregate_size (die, size, &type_mem);
}

The caller probably doesn't really care, but it isn't really nice to change the die the caller gave us. So I propose to change it like so:

int
dwarf_aggregate_size (Dwarf_Die *die, Dwarf_Word *size)
{
  Dwarf_Die die_mem, type_mem;

  if (INTUSE (dwarf_peel_type) (die, &die_mem) != 0)
    return -1;

  return aggregate_size (&die_mem, size, &type_mem);
}

Just wanted to note that before I forgot investigating the real issue.

Comment 6 Mark Wielaard 2017-12-11 22:54:55 UTC

I believe your fix is correct. And we really do want to calculate the stride only once based on the array element type, not for each dimension. I am not sure what the original code tried to do. It probably just was never tested against multi-dimensional arrays. So the testcase addition is great too.

Would you mind sending a complete patch plus the testcase addition to the mailinglist as describe in the CONTRIBUTING document?
https://sourceware.org/git/?p=elfutils.git;a=blob_plain;f=CONTRIBUTING;hb=HEAD

Comment 7 dima kogan 2017-12-12 08:10:37 UTC

Thanks for looking at it. I submitted the patch.

Comment 8 Mark Wielaard 2017-12-12 09:52:49 UTC

Fix has been pushed to master:
https://sourceware.org/ml/elfutils-devel/2017-q4/msg00103.html

A patch for the separate issue mentioned in comment #5 has been submitted:
https://sourceware.org/ml/elfutils-devel/2017-q4/msg00106.html