Hi. I'm observing that dwarf_aggregate_size() returns bogus results when looking at double-dimensional arrays. For instance, looking at double dd[3][5]; It says the aggregate is 64-bytes long instead of 120. The bug is that it ends up computing (3+5)*8 instead of 3*5*8. I'm attaching a simple test case. It loads the current process's debug information, and prints out the size of dd, defined as above. It shows the failure: dima@fatty:/tmp$ gcc -g -ldw -lelf dwarf_aggregate_size_bug.c -o dwarf_aggregate_size_bug dima@fatty:/tmp$ ./dwarf_aggregate_size_bug Found DIE for 'dd'. size: 64 I'm using libelf1 0.168-1 on Debian/sid, although at least version 0.170 has the bug too. I'm attaching a proof-of-concept patch to solve this. Note that this patch isn't final: previously we would compute a separate stride for each dimension, while the patch only computes the stride once. I don't know what case specifically that is meant to handle. Tests pass before and after. If for some reason dwarf_aggregate_size() isn't meant to be used this way, then it should still be changed to fail in a more obvious way. Answering 64 here is NEVER the right answer.
Created attachment 10662 [details] Prototype fix
Created attachment 10663 [details] Demo program
Created attachment 10672 [details] Update to the test suite to show this problem
Here's a patch to add the failing case to the test suite. This test update fails in the stock sources, but succeeds with my patch applied. Note that this patch contains a diff to a binary file (that's how the test suite works), and this binary piece will be recognized by 'git am' only.
Thanks I think you are right. I am looking at how to correctly handle the stride. I believe this only happens with Fortran code. While reviewing the code I noticed something else that looks wrong. Independent from what you discovered. We do the following: dwarf_aggregate_size (Dwarf_Die *die, Dwarf_Word *size) { Dwarf_Die type_mem; if (INTUSE (dwarf_peel_type) (die, die) != 0) return -1; return aggregate_size (die, size, &type_mem); } The caller probably doesn't really care, but it isn't really nice to change the die the caller gave us. So I propose to change it like so: int dwarf_aggregate_size (Dwarf_Die *die, Dwarf_Word *size) { Dwarf_Die die_mem, type_mem; if (INTUSE (dwarf_peel_type) (die, &die_mem) != 0) return -1; return aggregate_size (&die_mem, size, &type_mem); } Just wanted to note that before I forgot investigating the real issue.
I believe your fix is correct. And we really do want to calculate the stride only once based on the array element type, not for each dimension. I am not sure what the original code tried to do. It probably just was never tested against multi-dimensional arrays. So the testcase addition is great too. Would you mind sending a complete patch plus the testcase addition to the mailinglist as describe in the CONTRIBUTING document? https://sourceware.org/git/?p=elfutils.git;a=blob_plain;f=CONTRIBUTING;hb=HEAD
Thanks for looking at it. I submitted the patch.
Fix has been pushed to master: https://sourceware.org/ml/elfutils-devel/2017-q4/msg00103.html A patch for the separate issue mentioned in comment #5 has been submitted: https://sourceware.org/ml/elfutils-devel/2017-q4/msg00106.html