This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.



Re: Why two heap segments?


Mikael Djurfeldt <mdj@mdj.nada.kth.se> writes:

> Hi,
> 
> I just had a look at how the current CVS Guile allocates heap.
> 
> It actually starts up with two heap segments.
> 
> Why is that?

First heap isn't quite big enough? >:'). 

> Starting up with one heap segment which is sufficiently large for most
> common tasks must make Guile more efficient, or...?

It depends on what you mean by more efficient (or common ;). Testing with
my current version (gc'ified 1.3.4), starting up with a 524228 cell
heap (minus chunk headers, which eat up a little of that), I end up with
no gc's during startup (this includes syncase, so it's larger than just a
bare guile). If all you needed was to start up, do a little stuff and
exit, this would be perfect, provided you can handle a ~9 meg
process easily (only slightly less than my all-day emacs,
actually). There are a few cons:

1) Of course, the size (personally, I'm rather appalled that perl
   requires almost 2 megabytes to run my simple connect script, but
   that's just me); I really don't think you'd want that as the
   default, since it is a bit much to have to throw at it when
   you only want to do something small.
        
2) Primitive, permanent objects are going to be scattered all over the
   heap; there currently isn't any way to totally avoid this, but a
   small initial heap segment can help tremendously in keeping system
   stuff (particularly boot-9 defined stuff) in a smaller region of
   the heap (though not really in a decent chronological order, but
   this is something you can't ever get completely away from with the
   conservative gc); without it, any garbage created in defining all
   those bits ensures that stuff is really scattered about, and given
   that the working set is pretty large, it's probably not going to
   play all that nicely with the cache. 

Some completely informal tests with the project page generator (which
conses way too much, it's true):

; For both, the subsequent segment sizes are 65536; 
> GUILEGC_INIT_SEG_SIZE=4096 time guile -e main -s genpage tmp.html greg
1.76user 0.05system 0:01.80elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (226major+406minor)pagefaults 0swaps

> GUILEGC_INIT_SEG_SIZE=524228 time guile -e main -s genpage tmp.html greg
1.56user 0.16system 0:01.71elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (226major+2236minor)pagefaults 0swaps

So, in overall `real' time, there isn't that big an improvement, and
the first is infinitely less likely to start swapping (page faults are
also going to town in the second; this'll be worse in cvs guile, since
I'm running with the lazy sweeper and the mark stack stuff). Playing
around with scwm (a little while back) showed pretty much the same
kind of behaviour; while there's a definite benefit if everything you
need can fit into the available heap (provided you aren't talking on
the order of 100 megabytes or so, where you are going to be absolutely
swallowed alive by accessing it all), it's exceedingly rare that you
can figure out, before some testing, how much you really need, so any
increased default value isn't all that likely to produce generally
better performance (since, once you hit that collection, it's all
going to catch up with you; it is going to take more effort to scan 10
megabytes once than to scan 1 megabyte 10 times, provided that the one
megabyte isn't nearly full all the time).
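
To put rough numbers on the two runs above, here is a minimal sketch in C
that computes the relative change in each figure. The struct and function
names are made up for illustration; only the timings and page fault counts
come from the `time` output quoted earlier.

```c
#include <assert.h>
#include <stdio.h>

/* Figures copied from the two `time` runs quoted above. */
struct run {
    long   init_seg_size;   /* GUILEGC_INIT_SEG_SIZE used for the run */
    double user;            /* user CPU seconds */
    double sys;             /* system CPU seconds */
    double elapsed;         /* wall-clock seconds */
    long   minor_faults;    /* minor page faults */
};

const struct run small_heap = { 4096,   1.76, 0.05, 1.80, 406 };
const struct run big_heap   = { 524228, 1.56, 0.16, 1.71, 2236 };

/* Percentage change going from `from` to `to` (negative means a drop). */
double percent_change(double from, double to)
{
    assert(from != 0.0);
    return (to - from) / from * 100.0;
}

/* Hypothetical helper: print how the big-heap run compares to the small one. */
void print_comparison(void)
{
    printf("user:         %+.1f%%\n",
           percent_change(small_heap.user, big_heap.user));
    printf("system:       %+.1f%%\n",
           percent_change(small_heap.sys, big_heap.sys));
    printf("elapsed:      %+.1f%%\n",
           percent_change(small_heap.elapsed, big_heap.elapsed));
    printf("minor faults: %+.1f%%\n",
           percent_change((double) small_heap.minor_faults,
                          (double) big_heap.minor_faults));
}
```

Calling print_comparison() from a main() shows elapsed time dropping by
about 5% while minor page faults go up by roughly 5.5 times, which is the
trade-off described above.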

Of course, the best solution is to make it tunable ;). There's an
older patch at
http://home.thezone.net/~gharvey/guile/gc-alloc-fix.diff; I don't know
if it'll apply against the current cvs cleanly (I really haven't had any
time to keep up over the past few months; actually, I'm supposed to be
doing an assignment right now ;'); I'm pretty sure that patch doesn't
include the ability to tune the threshold for more heap allocation,
though, which is also very useful.
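
For illustration only, a tunable like this could be read from the
environment along the following lines. This is a sketch, not what the
patch does: GUILEGC_INIT_SEG_SIZE is the variable used in the runs above,
but the function names, the fallback value and the validation rules here
are all assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical fallback, not Guile's actual default. */
#define DEFAULT_INIT_SEG_SIZE 4096UL

/* Parse a segment size (in cells) from text; return fallback on bad input. */
unsigned long parse_seg_size(const char *text, unsigned long fallback)
{
    char *end;
    unsigned long value;

    assert(fallback > 0);

    /* Reject unset, empty, signed or otherwise non-numeric values. */
    if (text == NULL || !isdigit((unsigned char) *text))
        return fallback;

    errno = 0;
    value = strtoul(text, &end, 10);

    /* Reject overflow, trailing junk and a zero-sized segment. */
    if (errno != 0 || *end != '\0' || value == 0)
        return fallback;

    return value;
}

/* Look up the initial segment size in the environment. */
unsigned long init_seg_size_from_env(void)
{
    return parse_seg_size(getenv("GUILEGC_INIT_SEG_SIZE"),
                          DEFAULT_INIT_SEG_SIZE);
}
```

Falling back to a small default on anything unparsable keeps a typo in the
environment from turning into a huge or zero-sized first segment.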

-- 
Gregh
