Bug 24274

Summary: low-mem files processed in multifile mode
Product: dwz
Component: default
Status: NEW
Severity: enhancement
Priority: P2
Version: unspecified
Target Milestone: ---
Reporter: Tom de Vries <vries>
Assignee: Nobody <nobody>
CC: dwz, jakub, mark

Description Tom de Vries 2019-02-27 09:09:32 UTC
Consider four executables: a.out and b.out, built from a hello world example, and c.out and d.out, copied from the dwz executable itself:
...
$ gcc hello.c -g ; cp a.out b.out
$ cp dwz c.out ; cp c.out d.out
...

This gives us two executables with 130 DIEs, and two executables with 5356 DIEs:
...
$ readelf -w a.out | grep '(DW_TAG' | wc -l
130
$ readelf -w b.out | grep '(DW_TAG' | wc -l
130
$ readelf -w c.out | grep '(DW_TAG' | wc -l
5356
$ readelf -w d.out | grep '(DW_TAG' | wc -l
5356
...

Now consider a gdb script that traces the dwz invocations:
...
$ cat gdb.script
b dwz
commands
continue
end
run
...

We run dwz in multifile mode, with a low-mem limit of 1000 DIEs, and trace into LOG:
...
$ gdb \
    -batch \
    -x gdb.script \
    --args dwz -m3 -l1000 a.out b.out c.out d.out \
    > LOG 2>&1
...

which we then summarize as follows:
...
$ grep 'dwz (' LOG | awk '{print $5}'
"a.out",
"b.out",
"c.out",
"c.out",
"d.out",
"d.out",
"a.out",
"b.out",
"c.out",
"d.out",
...

The first 6 invocations are according to plan:
- a.out and b.out have fewer DIEs than 1000, and are processed.
- c.out and d.out have more DIEs than 1000, and are each processed twice:
  - once in regular mode (where dwz stops at 1000 processed DIEs and returns
    2), and
  - once in low-mem mode.

However, all 4 files are then processed once more in fi_multifile mode, whereas the intention is that c.out and d.out (being bigger than the low-mem limit) are not processed again.
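
To make the expected invocation count concrete, here is a minimal sketch of the intended behaviour as described above. It is a model only, not dwz code: struct input, first_pass and multifile_pass are hypothetical names, and treating a DIE count exactly equal to the limit as "fits" is an assumption.

```c
#include <stddef.h>

/* Hypothetical model of the intended behaviour; none of these names
   come from dwz.  */
struct input
{
  const char *name;
  unsigned die_count;
  int used_low_mem;	/* Set once the file needed the low-mem rerun.  */
};

/* First pass over one file.  Returns how many times the file is
   processed: once if it fits within the limit, twice if the regular
   attempt gives up and a low-mem rerun follows.  Treating a count equal
   to the limit as "fits" is an assumption of this sketch.  */
int
first_pass (struct input *in, unsigned limit)
{
  in->used_low_mem = in->die_count > limit;
  return in->used_low_mem ? 2 : 1;
}

/* Intended multifile pass: revisit only the files that did not need
   low-mem mode.  Returns the number of files revisited.  */
size_t
multifile_pass (const struct input *files, size_t n)
{
  size_t revisited = 0;
  for (size_t i = 0; i < n; i++)
    if (!files[i].used_low_mem)
      revisited++;
  return revisited;
}
```

For the four files above (130, 130, 5356 and 5356 DIEs) and a limit of 1000, this model gives 6 first-pass invocations followed by 2 multifile invocations, whereas the trace shows 4 multifile invocations.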
Comment 1 Tom de Vries 2019-02-27 09:16:49 UTC
AFAIU, there is code intended to prevent this scenario from happening at the end of dwz:
...
  free (dso);
  if (ret == 0 && !low_mem)
    res->res = 0;
  return ret;
...

But low_mem is never true at this point, because cleanup runs before this code is reached and resets multifile_mode to 0.
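
The ordering problem can be reproduced in isolation. The sketch below is a simplified stand-in, not the dwz source: it assumes low_mem is derived from multifile_mode, that res->res == 0 marks a file as eligible for the later multifile pass, and that -1 is the "not eligible" initial value. MODE_NONE, MODE_LOW_MEM, struct file_result, process_buggy and process_saved are hypothetical names. Saving the flag before cleanup is only one possible remedy and is not necessarily what the posted patch does.

```c
/* Hypothetical stand-ins; the real dwz definitions may differ.  */
enum { MODE_NONE = 0, MODE_LOW_MEM = 1 };

static int multifile_mode = MODE_NONE;

/* Assumption: low_mem is derived from multifile_mode.  */
#define low_mem (multifile_mode == MODE_LOW_MEM)

struct file_result
{
  int res;	/* Assumption: 0 means "eligible for the multifile pass".  */
};

/* Models the part of cleanup that matters here: it resets the mode.  */
static void
cleanup (void)
{
  multifile_mode = MODE_NONE;
}

/* Mirrors the reported ordering: cleanup runs before the check, so
   !low_mem is always true and res->res is always set to 0.  */
int
process_buggy (struct file_result *res, int mode)
{
  int ret = 0;
  multifile_mode = mode;
  cleanup ();
  if (ret == 0 && !low_mem)
    res->res = 0;
  return ret;
}

/* One possible remedy: record the flag before cleanup resets it.  */
int
process_saved (struct file_result *res, int mode)
{
  int ret = 0;
  int was_low_mem;
  multifile_mode = mode;
  was_low_mem = low_mem;
  cleanup ();
  if (ret == 0 && !was_low_mem)
    res->res = 0;
  return ret;
}
```

Starting from res->res == -1 and calling both functions with MODE_LOW_MEM, process_buggy leaves res->res at 0 (the file is picked up again later), while process_saved leaves it at -1.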
Comment 2 Tom de Vries 2019-03-07 06:19:35 UTC
posted patch: https://sourceware.org/ml/dwz/2019-q1/msg00058.html
Comment 3 Tom de Vries 2019-11-28 14:29:55 UTC
Reclassifying as enhancement.
Comment 4 Mark Wielaard 2021-02-19 00:08:42 UTC
Has an unreviewed patch, see Comment #2