
[Bug default/19427] Intern the strings used in Libabigail



https://sourceware.org/bugzilla/show_bug.cgi?id=19427

dodji at redhat dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #1 from dodji at redhat dot com ---
So I started to work on this and I have a working branch (named
'str-intern') with the necessary changes, though they are not documented yet.
You can browse it at
https://sourceware.org/git/gitweb.cgi?p=libabigail.git;a=shortlog;h=refs/heads/dodji/str-intern

I am going to post a time and memory consumption comparison, using a code
base that is built with optimization (-O2).

So here is the resource usage of abidw --abidiff on r300_dri.so, for the master
branch:

real => 5:03.31
user => 300.24
sys => 2.86
max mem => 4959036KB

And the resource usage for the str-intern branch:

real => 4:56.98
user => 294.12
sys => 2.65
max mem => 4617328KB

So, as you can see, it slightly improves the speed of this test (by about 6
seconds), and significantly improves memory usage (saving more than 300
megabytes of memory).

The problem is that, on smaller tests and on tests that don't involve emitting
abixml, things are actually a little bit slower.  In other words, abidiff, for
instance, becomes slightly slower.  The memory consumption savings are still
there, though.

That is, the cost of looking up strings in a hash table to ensure that each
string exists in only one copy in the environment (this is string interning)
makes the loading of abi corpora slower.  But then, comparing *strings* later
becomes faster, as comparing two strings amounts to just comparing two
pointers.  But we need to compare a lot of strings to make up for the cost of
interning them in the first place.  And the place where we compare strings the
most at the moment is when we emit abixml (i.e., in abidw).
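
To make the trade-off concrete, here is a minimal sketch of string interning
in C++.  It is not the code from the str-intern branch; the names
interned_string and interned_string_pool are made up for illustration, and the
pool is assumed to outlive every handle it gives out.  The hash lookup is paid
once in intern(), and equality afterwards is a single pointer comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical handle to a pooled string; not libabigail's actual API.
class interned_string
{
  const std::string* raw_;

  friend class interned_string_pool;
  explicit interned_string(const std::string* s) : raw_(s) {}

public:
  const std::string& str() const { return *raw_; }

  // Two handles from the same pool are equal iff they point to the same
  // pooled copy, so no character-by-character comparison is needed.
  bool operator==(const interned_string& o) const { return raw_ == o.raw_; }
  bool operator!=(const interned_string& o) const { return raw_ != o.raw_; }
};

// Hypothetical pool that keeps exactly one copy of each distinct string.
class interned_string_pool
{
  // References to elements of std::unordered_set stay valid across
  // rehashing, so handing out pointers to them is safe while the pool lives.
  std::unordered_set<std::string> pool_;

public:
  // This is the up-front cost: one hash table lookup (and possibly one
  // insertion) per string that is loaded.
  interned_string intern(const std::string& s)
  {
    return interned_string(&*pool_.insert(s).first);
  }

  std::size_t size() const { return pool_.size(); }
};
```

Interning the same text twice yields handles that compare equal and adds only
one entry to the pool, which is where the memory saving comes from.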

During decl comparison, it turns out we don't compare strings that much
because we compare the types of the decls first.  And thanks to type
canonicalization, comparing two types is very fast.  And as the majority of
comparisons yield a negative result, we don't even get to compare the names of
the decls.
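
As a rough illustration of that ordering (again a sketch, not libabigail's
real comparison code; type_sketch, decl_sketch and decls_equal are
hypothetical names, and canonical types are assumed to be comparable by
pointer), the name comparison is only reached when the types already match:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a canonicalized type: equal types are assumed
// to share one canonical object, so they can be compared by address.
struct type_sketch
{
  std::string description;
};

// Hypothetical stand-in for a declaration.
struct decl_sketch
{
  const type_sketch* canonical_type;
  std::string name;
};

// Compare the cheap canonical type pointers first.  name_comparisons counts
// how often the string comparison is actually reached.
inline bool
decls_equal(const decl_sketch& l, const decl_sketch& r,
            unsigned& name_comparisons)
{
  if (l.canonical_type != r.canonical_type)
    return false;            // most comparisons are assumed to stop here
  ++name_comparisons;
  return l.name == r.name;
}
```

With this ordering, decls of different types never touch their names, which
is why faster string comparison helps so little in that code path.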

So I am still not sure if I am going to incorporate this optimization in the
end.  I *am* inclined to merge it, because it makes the library consume less
memory, and it speeds up abixml writing, especially for big libraries.  In
other words, it makes libabigail scale better.  But then it slows it down
slightly on small workloads (which are quite fast anyway).

I'll give this a little bit more thought.

But in the meantime, if you have some thoughts, please share them :-)
