Hi,
I've been working on performance improvements for dwz, using a cc1
binary as my optimization vehicle.
Comparing dwz:
- before (commit 04a676d "Add --devel-partition-dups-opt"), and
- after (current master, commit e405c62 "Add --devel-die-count-method
  {none,estimate}"),
I get the following results.
When we avoid running into the low-mem die-limit by using -lnone, we get
a ~25% reduction in execution time, due to an improved hash function and
an improved hash table allocation strategy (without increasing peak
memory usage):
...
real: mean:  7378.10  100.00%  stddev: 45.31
      mean:  5558.80   75.34%  stddev: 35.18
user: mean:  7106.30  100.00%  stddev: 41.53
      mean:  5328.10   74.98%  stddev: 22.33
sys:  mean:   271.60  100.00%  stddev: 39.57
      mean:   230.00   84.68%  stddev: 40.45
...
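For reference, the percentage column can be reproduced from the means.
This is only a sketch in Python, not the script used to produce the
numbers above; the helper name relative_percent is made up for
illustration, and the values are copied from the table as reported.

```python
# Mean times copied from the -lnone table above (units as reported there).
before = {"real": 7378.10, "user": 7106.30, "sys": 271.60}
after = {"real": 5558.80, "user": 5328.10, "sys": 230.00}


def relative_percent(before_mean, after_mean):
    """Return the 'after' mean as a percentage of the 'before' mean."""
    return 100.0 * after_mean / before_mean


for kind in ("real", "user", "sys"):
    pct = relative_percent(before[kind], after[kind])
    # 100 - pct is the reduction in time relative to the 'before' run.
    print(f"{kind}: {pct:.2f}% of before, {100.0 - pct:.2f}% less time")
```

The ~25% figure corresponds to the real and user rows; the sys row
improves less, but it is a small part of the total.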
And if we don't avoid running into the low-mem die-limit (that is,
without -lnone), we get a ~38% reduction in execution time:
...
real: mean: 15084.80  100.00%  stddev: 44.53
      mean:  9232.90   61.21%  stddev: 41.80
user: mean: 14759.40  100.00%  stddev: 30.62
      mean:  9100.10   61.66%  stddev: 41.75
sys:  mean:   324.00  100.00%  stddev: 39.51
      mean:   132.00   40.74%  stddev: 27.26
...
This is also paired with a ~34% reduction in peak memory usage, from
0.95GB to 0.63GB, because we now run into the low-mem die-limit in a
more efficient manner.
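
As a quick sanity check of the figures for the die-limit case, the
reductions can be recomputed from the before/after values. Again just a
Python sketch; reduction_percent is a hypothetical helper, and the
inputs are the numbers quoted above.

```python
def reduction_percent(before, after):
    """Return how much smaller 'after' is than 'before', in percent."""
    return 100.0 * (before - after) / before


# Peak memory usage in GB, as quoted in the text.
peak_mem = reduction_percent(0.95, 0.63)

# Mean real and user times from the table for the die-limit case.
real_time = reduction_percent(15084.80, 9232.90)
user_time = reduction_percent(14759.40, 9100.10)

print(f"peak memory: {peak_mem:.1f}% lower")
print(f"real time:   {real_time:.1f}% lower")
print(f"user time:   {user_time:.1f}% lower")
```

The real and user reductions both land between 38% and 39%, and the
memory reduction rounds to 34%.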