[committed] Avoid htab_expand for off_htab
Hi,
Every time the number of elements in a hash table from hashtab.c grows beyond
75% occupancy, the hash table is expanded.
Using a debug patch to print this growth for off_htab, we get:
...
$ dwz cc1 -o 1 -lnone
Allocating htab with size 131071
Growing htab from size 131071 to 524287
Growing htab from size 524287 to 2097143
Growing htab from size 2097143 to 4194301
Growing htab from size 4194301 to 16777213
$ dwz vmlinux -o 1 -lnone
Allocating htab with size 131071
Growing htab from size 131071 to 524287
Growing htab from size 524287 to 2097143
Growing htab from size 2097143 to 4194301
Growing htab from size 4194301 to 16777213
Growing htab from size 16777213 to 67108859
...
There is a cost associated with this growth.
Every time the hash table grows:
- a new array is allocated, after which the old array is freed. This can
cause memory fragmentation.
- the elements need to be copied from the old to the new array.
Also, the collision rate will be higher, because the hash table operates
at a higher average occupancy.
In the case of off_htab, we can estimate the final size and use that as
initial size, which skips the growth costs.
I've tested this on cc1 and vmlinux, and the gain is ~6%:
...
cc1:
real: mean: 6985.50 100.00% stddev: 34.55
mean: 6531.40 93.50% stddev: 18.82
user: mean: 6816.70 100.00% stddev: 30.84
mean: 6374.00 93.51% stddev: 28.43
sys: mean: 167.20 100.00% stddev: 17.77
mean: 156.80 93.78% stddev: 17.05
vmlinux:
real: mean: 30748.60 100.00% stddev: 377.31
mean: 28773.90 93.58% stddev: 61.33
user: mean: 30338.40 100.00% stddev: 139.17
mean: 28499.70 93.94% stddev: 49.44
sys: mean: 336.50 100.00% stddev: 102.51
mean: 272.00 80.83% stddev: 23.55
...
As for the collisions (measured using another debug patch), there's a clear
reduction in the collision rate, both for cc1 (-0.13):
...
$ dwz cc1 -o 1 -lnone 2>&1 | sort -V | tail -n 1
-searches: 34256212, collisions: 0.684419, occupancy: 0.607308, line: 11031
+searches: 34256212, collisions: 0.551075, occupancy: 0.607308, line: 11086
...
as well as for vmlinux (-0.57):
...
$ dwz vmlinux -o 1 -lnone 2>&1 | sort -V | tail -n 1
-searches: 83198411, collisions: 0.672008, occupancy: 0.260028, line: 11031
+searches: 83198411, collisions: 0.100454, occupancy: 0.260028, line: 11086
...
To confirm, I've tried a yet larger example, clang (with cc1 at 10.2 million
DIEs, vmlinux at 17.5 million DIEs, and clang at 111.4 million DIEs):
...
real: mean: 127261.67 100.00% stddev: 422.04
mean: 112483.33 88.39% stddev: 443.20
user: mean: 116865.67 100.00% stddev: 101.59
mean: 102581.33 87.78% stddev: 344.42
sys: mean: 5881.33 100.00% stddev: 261.95
mean: 5218.00 88.72% stddev: 144.81
...
Here we see an even higher performance gain, of ~11%.
We don't do this optimization when operating in low-mem or optimize_multifile
mode. In either mode, not all DIEs will be present in off_htab after parsing
the entire .debug_info, so it's hard to estimate what the final size will be.
Also, we don't do this optimization if we suspect that we'll run into
the low-mem limit (again using the estimated number of DIEs). There's no
need to allocate that much memory if we don't expect to end up using all
of it.
Likewise, we don't do this optimization if we suspect that we'll run into
the max-die limit (again using the estimated number of DIEs).
Committed to trunk.
Thanks,
- Tom
Avoid htab_expand for off_htab
2019-11-20 Tom de Vries <tdevries@suse.de>
* dwz.c (off_htab_add_die): Allocate off_htab with estimated final
size.
---
dwz.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/dwz.c b/dwz.c
index db134ab..84a7080 100644
--- a/dwz.c
+++ b/dwz.c
@@ -1279,7 +1279,21 @@ off_htab_add_die (dw_cu_ref cu, dw_die_ref die)
if (off_htab == NULL)
{
- off_htab = htab_try_create (100000, off_hash, off_eq, NULL);
+ unsigned int estimated_nr_dies = estimate_nr_dies ();
+ size_t default_initial_size = 100000;
+ size_t initial_size;
+ if (low_mem
+ || op_multifile
+ || estimated_nr_dies >= low_mem_die_limit
+ || estimated_nr_dies >= max_die_limit)
+ initial_size = default_initial_size;
+ else
+ {
+ size_t estimated_final_hashtab_size
+ = emulate_htab (default_initial_size, estimated_nr_dies);
+ initial_size = estimated_final_hashtab_size;
+ }
+ off_htab = htab_try_create (initial_size, off_hash, off_eq, NULL);
if (off_htab == NULL)
dwz_oom ();
if (rd_multifile)