This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: [gold][patch] Reduce heap usage for string merge sections


>>> The cost of overestimating is not that high, so I would just assume that
>>> each string has, say, 8 characters.
>>
>> It turns out that the cost is actually quite substantial. For
>> .debug_str, if the average string is, say, 64 bytes, an estimate based
>> on an average of 8 would reserve 8 times as much memory as we actually
>> need. On my benchmark, adding the reserve call made gold use about 3
>> times as much memory for the Merged_string structures.
>>
>> Instead of using a fixed average string size, I could sample the first
>> N bytes of the section to come up with the estimate. How does that
>> sound?
>
> Hmmm, clearly my intuition here is wrong. I'm not sure what is best.

The average string size for .rodata string merge sections (in my
benchmark) is 13.5 chars; for .debug_str sections, it's 38.3 chars. I
went with a fixed estimate of (len / 32 + 8) for the number of strings,
and that improved heap usage over not reserving at all by about 7% (90
MB out of 1.3 GB).
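
For illustration, here is a minimal sketch of that fixed estimate applied
before scanning a section. It is not gold's actual code:
Merged_string_sketch, estimate_string_count, and collect_strings are
hypothetical names, and the section is assumed to hold NUL-terminated
single-byte strings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for gold's per-string bookkeeping record.
struct Merged_string_sketch {
  std::size_t offset;  // start of the string within the section
  std::size_t length;  // length in bytes, excluding the terminator
};

// The fixed estimate from the message: one string per 32 bytes, plus 8.
inline std::size_t estimate_string_count(std::size_t section_len) {
  return section_len / 32 + 8;
}

// Reserve once up front, then record every NUL-terminated string.
inline std::vector<Merged_string_sketch>
collect_strings(const char* data, std::size_t len) {
  assert(data != nullptr || len == 0);
  std::vector<Merged_string_sketch> out;
  out.reserve(estimate_string_count(len));
  std::size_t start = 0;
  for (std::size_t i = 0; i < len; ++i) {
    if (data[i] == '\0') {
      out.push_back({start, i - start});
      start = i + 1;
    }
  }
  return out;
}
```

The "+ 8" term keeps very small sections from reserving next to nothing;
the divisor of 32 sits between the two averages measured above.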

The dangers here are:

- If we underestimate the count of strings slightly, we risk always
ending up using almost 2x as much memory as we need, because
std::vector typically doubles its capacity when it runs out of room,
which may happen just before we add the last entry.

- If we overestimate the count wildly, we risk using many times as
much memory as we'd need.
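
Both risks can be measured with a small sketch like the one below. The
helper names capacity_after and overhead are made up for this example,
and the exact growth factor depends on the standard library in use, so
treat the numbers it produces as implementation-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Capacity a vector ends up with after reserving `estimated` slots
// and then appending `actual` elements one at a time.
inline std::size_t capacity_after(std::size_t estimated, std::size_t actual) {
  std::vector<int> v;
  v.reserve(estimated);
  for (std::size_t i = 0; i < actual; ++i)
    v.push_back(0);
  return v.capacity();
}

// Ratio of allocated slots to slots actually in use.
inline double overhead(std::size_t estimated, std::size_t actual) {
  assert(actual > 0);
  return static_cast<double>(capacity_after(estimated, actual)) /
         static_cast<double>(actual);
}
```

On a library that doubles on growth, overhead(1000, 1001) should come out
close to 2 (a slight underestimate), while overhead(8000, 1000) is at
least 8 (a wild overestimate).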

I'm worried that any fixed number I choose would be too benchmark
specific, so I'm trying something more adaptive. I'll let you know how
that works out.

By the way, changing the clear() to delete made a big difference --
clear() destroys the elements but keeps the vector's capacity, so the
vector's own storage is never freed. So I think we can free up
even more memory by changing the merged_strings_lists_ field to a
pointer, and deleting that as well when we're done with it. (I took a
quick scan for other places where gold calls vector::clear(), but
didn't see any other obvious places where we'd want to delete the
vector. I still need to take a closer look, though.)
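
A rough sketch of the difference, assuming nothing about gold's real
classes: Section_sketch and its lists_ member are hypothetical stand-ins
for the object that owns merged_strings_lists_, and the swap-with-empty
idiom is shown only as an alternative way to release the storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// clear() destroys the elements but leaves the capacity unchanged.
inline std::size_t capacity_after_clear(std::size_t n) {
  std::vector<int> v(n);
  v.clear();
  return v.capacity();
}

// Swapping with an empty temporary hands the storage to the temporary,
// which frees it when it is destroyed at the end of the statement.
inline std::size_t capacity_after_swap(std::size_t n) {
  std::vector<int> v(n);
  std::vector<int>().swap(v);
  return v.capacity();
}

// Hypothetical owner that holds the vector through a pointer, so the
// whole vector can be deleted as soon as it is no longer needed.
struct Section_sketch {
  std::vector<int>* lists_ = nullptr;

  Section_sketch() = default;
  Section_sketch(const Section_sketch&) = delete;
  Section_sketch& operator=(const Section_sketch&) = delete;
  ~Section_sketch() { release(); }

  void build(std::size_t n) {
    release();
    lists_ = new std::vector<int>(n);
  }

  // Frees both the elements and the vector object itself.
  void release() {
    delete lists_;
    lists_ = nullptr;
  }
};
```

With the pointer version, release() can be called right after the last
use instead of waiting for the owning object to be destroyed.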

-cary

