Bug 25231

Summary: Reuse checksums
Product: dwz
Component: default
Status: NEW
Severity: enhancement
Priority: P2
Version: unspecified
Target Milestone: ---
Reporter: Tom de Vries <vries>
Assignee: Nobody <nobody>
CC: dwz
Host:
Target:
Build:
Last reconfirmed:

Description Tom de Vries 2019-11-29 09:22:34 UTC
For a dwz invocation for files 1 and 2 with multifile 3:
...
$ dwz -m 3 1 2
...
dwz goes through the following phases:
- regular-mode 1
- write-multifile 1
- regular-mode 2
- write-multifile 2
- optimize-multifile
- read-multifile
- finalize-multifile 1
- finalize-multifile 2
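
The ordering above can be written down as a small helper. This is only an illustrative sketch: the function name list_phases and the generalization from two input files to N are assumptions, not dwz code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of dwz: fill OUT with the phase labels
   for NFILES input files, in the order listed above, and return the
   number of phases (3 * NFILES + 2).  */
static size_t
list_phases (size_t nfiles, char out[][32], size_t cap)
{
  size_t n = 0;

  assert (cap >= 3 * nfiles + 2);

  /* Each input file is first processed on its own.  */
  for (size_t i = 1; i <= nfiles; i++)
    {
      snprintf (out[n++], 32, "regular-mode %zu", i);
      snprintf (out[n++], 32, "write-multifile %zu", i);
    }

  /* The multifile is then optimized and read back once.  */
  snprintf (out[n++], 32, "optimize-multifile");
  snprintf (out[n++], 32, "read-multifile");

  /* Finally, each input file is revisited.  */
  for (size_t i = 1; i <= nfiles; i++)
    snprintf (out[n++], 32, "finalize-multifile %zu", i);

  return n;
}
```

The point of the sketch is that every input file is visited twice (regular-mode and finalize-multifile), which is where reuse of earlier work could pay off.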

It would be nice if we could speed things up, for instance in finalize-multifile mode, by reusing work already done in regular mode.

Let's focus for the moment on the handling of long unsigned int and size_t (which is a typedef of long unsigned int):
...
 <1><2d>: Abbrev Number: 2 (DW_TAG_typedef)
    <2e>   DW_AT_name        : (indirect string, offset: 0x38): size_t
    <32>   DW_AT_decl_file   : 2
    <33>   DW_AT_decl_line   : 216
    <34>   DW_AT_type        : <0x38>
 <1><38>: Abbrev Number: 3 (DW_TAG_base_type)
    <39>   DW_AT_byte_size   : 8
    <3a>   DW_AT_encoding    : 7        (unsigned)
    <3b>   DW_AT_name        : (indirect string, offset: 0x1a0): long unsigned int
...

If we look at the checksums in the various phases, we get:
...
$ cp hello 1
$ cp 1 2
$ dwz -m 3 1 2 --devel-dump-dies --devel-trace 2>&1 \
  | egrep 'multifile|size_t|long unsigned int'
  2d O 1b0ea5ae b19e6191 size_t
  38 O 7c5b2022 7c5b2022 long unsigned int
Write-multifile 1
  2d O 1b0ea5ae b19e6191 size_t
  38 O 7c5b2022 7c5b2022 long unsigned int
Write-multifile 2
Optimize-multifile
  14 O ab012a69 bf55db67 size_t
  1c O 4bb43633 4bb43633 long unsigned int
  267 O ab012a69 bf55db67 size_t
  26f O 4bb43633 4bb43633 long unsigned int
Read-multifile
  14 O ab012a69 bf55db67 size_t
  1c O 4bb43633 4bb43633 long unsigned int
Compressing 1 in finalize-multifile mode
  26 O ab012a69 bf55db67 size_t
  2e O 4bb43633 4bb43633 long unsigned int
Compressing 2 in finalize-multifile mode
  26 O ab012a69 bf55db67 size_t
  2e O 4bb43633 4bb43633 long unsigned int
...
We can see that the checksum for long unsigned int differs between regular mode:
...
  38 O 7c5b2022 7c5b2022 long unsigned int
...
and finalize-multifile mode:
...
  2e O 4bb43633 4bb43633 long unsigned int
...

That difference is caused by the handling of DW_FORM_strp. In regular mode, the attribute is encoded into the checksum using its offset into the string table rather than the string contents. This is a speed optimization, but it assumes that the input file is optimally encoded, in the sense that each unique string is encoded using either DW_FORM_strp or DW_FORM_string, but not both.
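
To illustrate why the two encodings disagree, here is a minimal sketch. It does not use dwz's actual hash function: hash_bytes is a stand-in (FNV-1a), and checksum_strp_by_offset and checksum_strp_by_contents are hypothetical names. The same string placed at different offsets in two string tables gets different checksums when the offset is hashed, and identical checksums when the contents are hashed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in hash (FNV-1a step); dwz uses its own hash function.  */
static uint32_t
hash_bytes (uint32_t h, const void *data, size_t len)
{
  const unsigned char *p = data;

  for (size_t i = 0; i < len; i++)
    {
      h ^= p[i];
      h *= 16777619u;
    }
  return h;
}

/* Hypothetical: fold a DW_FORM_strp attribute into the checksum using
   only its offset into the string table.  Cheap, but the result
   depends on where the string happens to be stored.  */
static uint32_t
checksum_strp_by_offset (uint32_t h, uint32_t offset)
{
  return hash_bytes (h, &offset, sizeof offset);
}

/* Hypothetical: fold the string contents (including the terminating
   NUL) into the checksum.  Slower, but independent of the offset.  */
static uint32_t
checksum_strp_by_contents (uint32_t h, const char *debug_str,
                           size_t size, uint32_t offset)
{
  assert (offset < size);
  const char *s = debug_str + offset;
  return hash_bytes (h, s, strlen (s) + 1);
}
```

With one table holding "long unsigned int" at offset 7 and another holding it at offset 0, the by-offset variant yields two different values while the by-contents variant yields the same value for both, which mirrors the regular-mode versus finalize-multifile difference shown above.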

If we force dwz to pick up the string contents for DW_FORM_strp using this patch:
...
diff --git a/dwz.c b/dwz.c
index 3c886d6..9ab3e33 100644
--- a/dwz.c
+++ b/dwz.c
@@ -2623,8 +2623,7 @@ checksum_die (DSO *dso, dw_cu_ref cu, dw_die_ref top_die, dw_die_ref die)
            }
          break;
        case DW_FORM_strp:
-         if (unlikely (op_multifile || rd_multifile || fi_multifile)
-             && die->die_ck_state != CK_BAD)
+         if (die->die_ck_state != CK_BAD)
            {
              value = read_32 (ptr);
              if (value >= debug_sections[DEBUG_STR].size)
...
we get:
...
  2d O ab012a69 bf55db67 size_t
  38 O 4bb43633 4bb43633 long unsigned int
Write-multifile 1
  2d O ab012a69 bf55db67 size_t
  38 O 4bb43633 4bb43633 long unsigned int
Write-multifile 2
Optimize-multifile
  14 O ab012a69 bf55db67 size_t
  1c O 4bb43633 4bb43633 long unsigned int
  267 O ab012a69 bf55db67 size_t
  26f O 4bb43633 4bb43633 long unsigned int
Read-multifile
  14 O ab012a69 bf55db67 size_t
  1c O 4bb43633 4bb43633 long unsigned int
Compressing 1 in finalize-multifile mode
  26 O ab012a69 bf55db67 size_t
  2e O 4bb43633 4bb43633 long unsigned int
Compressing 2 in finalize-multifile mode
  26 O ab012a69 bf55db67 size_t
  2e O 4bb43633 4bb43633 long unsigned int
...

So now we have in regular mode:
...
  38 O 4bb43633 4bb43633 long unsigned int
...
and in finalize-multifile mode:
...
  2e O 4bb43633 4bb43633 long unsigned int
...

And for size_t in regular mode:
...
  2d O ab012a69 bf55db67 size_t
...
and in finalize-multifile mode:
...
  26 O ab012a69 bf55db67 size_t
...

Also, we can see that the checksums are actually the same in optimize-multifile and read-multifile mode as well, so we could reuse them there too.

There will be DIEs for which we can't reuse the checksums, due to differences in the handling of references. I'm not sure what percentage of DIEs this would apply to.

Comment 1 Tom de Vries 2019-11-29 12:15:46 UTC
(In reply to Tom de Vries from comment #0)
> That difference is caused by handling of DW_FORM_strp, which is encoded into
> the checksum using the index to the string table rather than the string
> contents, which is a speed optimization for regular-mode (but which assumes
> that the input file is optimally encoded, in the sense that each unique
> string is encoded either using DW_FORM_strp, or DW_FORM_string, but not
> both).

For the record, naively disabling this optimization makes no difference in the result for cc1, but comes at a 5% execution-time penalty:
...
series:  5630 5502 5510 5502 5521 5517 5521 5517 5486 5525
mean:  5523.10 (100%)
stddev:  39.37
series:  5877 5794 5832 5780 5811 5802 5769 5812 5845 5788
mean:  5811.00 (105.21%)
stddev:  32.62
user:
series:  5321 5326 5274 5258 5305 5319 5292 5300 5285 5299
mean:  5297.90 (100%)
stddev:  21.57
series:  5560 5546 5655 5531 5595 5554 5564 5568 5616 5555
mean:  5574.40 (105.22%)
stddev:  37.25
sys:
series:  308 176 236 244 216 196 228 216 200 224
mean:  224.40 (100%)
stddev:  35.60
series:  316 248 176 248 216 248 204 244 228 232
mean:  236.00 (105.17%)
stddev:  36.51
...
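
For readers who want to check the summary figures, the following sketch recomputes them from a series. The function names are hypothetical and this is not the script that produced the numbers above. The reported stddev of 39.37 for the first series is consistent with the sample (n - 1) form, so that is what the sketch uses; it returns the variance, whose square root is the stddev.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of N values.  */
static double
mean (const double *v, size_t n)
{
  double sum = 0.0;

  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    sum += v[i];
  return sum / (double) n;
}

/* Sample variance (divides by N - 1); the stddev is its square root.  */
static double
sample_variance (const double *v, size_t n)
{
  double m, acc = 0.0;

  assert (n > 1);
  m = mean (v, n);
  for (size_t i = 0; i < n; i++)
    acc += (v[i] - m) * (v[i] - m);
  return acc / (double) (n - 1);
}

/* OTHER expressed as a percentage of BASE, e.g. 105.21 for a 5.21%
   slowdown.  */
static double
relative_percent (double base, double other)
{
  return 100.0 * other / base;
}
```

Feeding in the two unlabeled series gives means of 5523.10 and 5811.00, and relative_percent of those means gives roughly 105.21.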