This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: offline elfutils processing committed


On Tue, 2006-11-07 at 09:34 -0500, Frank Ch. Eigler wrote:
> Hi -
> 
> hunt wrote:
> 
> > [...]  What happens when systemtap's allocations succeed, but leave
> > the system in a low memory state such that other applications
> > trigger the oom killer when they try to allocate memory.  In this
> > case, we want staprun and the systemtap module to be first to be
> > killed. [...]
> 
> Since the systemtap module rather than staprun owns most of the
> memory, and because by its nature the module reacts relatively slowly
> to staprun's demise, biasing staprun for OOM targeting may not
> meaningfully assist the system in a time of need.

Right now I have staprun getting SIGKILL from __oom_kill_task and it
signals the end probe functions to run before unloading the module. If
we decide this is too slow a reaction, we can always just unload the
module immediately. We certainly don't want to depend on the code in the
module that periodically checks for staprun's existence.

> Also, preferring to kill staprun/etc. under such conditions is not
> obviously correct.  One might argue that once a systemtap script is
> running, it deserves to be kept alive no less than any other process:
> it may be running precisely because the sysadmin wanted to monitor the
> system.  Heck, it might be in the middle of debugging excessive memory
> consumption problems.

That's a good point, although it is hard to imagine that sysadmins
would often prefer to trust the oom-killer to randomly kill processes
rather than have it remove systemtap scripts first.  Perhaps we need a
command-line option to select that behavior.
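
As a rough illustration of what such an option could look like, here is
a minimal sketch in Python rather than staprun's own C. It assumes a
current Linux kernel, where a process can bias the oom-killer through
/proc/self/oom_score_adj (range -1000 to 1000); kernels from the time of
this thread exposed /proc/<pid>/oom_adj instead, with a range of -17 to
15. The option name --oom-bias, its three values, and the helper names
are hypothetical and are not existing staprun features.

```python
import argparse
from pathlib import Path

# Limits of the oom_score_adj interface on current Linux kernels.
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000

# Hypothetical policy names mapped to oom_score_adj values.
BIAS_TO_SCORE = {
    "prefer-kill": OOM_SCORE_ADJ_MAX,  # be the first candidate for the oom-killer
    "default": 0,                      # no bias either way
    "protect": OOM_SCORE_ADJ_MIN,      # exempt from the oom-killer (needs privilege)
}


def set_oom_score_adj(value, path="/proc/self/oom_score_adj"):
    """Write an OOM bias for the calling process.

    The path is a parameter only so the function can be exercised
    against an ordinary file.
    """
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj out of range: {value}")
    Path(path).write_text(f"{value}\n")


def parse_args(argv):
    """Parse the hypothetical --oom-bias option."""
    parser = argparse.ArgumentParser(description="OOM bias sketch")
    parser.add_argument(
        "--oom-bias",
        choices=sorted(BIAS_TO_SCORE),
        default="prefer-kill",
        help="how the oom-killer should treat this process",
    )
    return parser.parse_args(argv)


def apply_bias(argv, path="/proc/self/oom_score_adj"):
    """Parse argv and apply the selected bias to the given path."""
    args = parse_args(argv)
    set_oom_score_adj(BIAS_TO_SCORE[args.oom_bias], path=path)
    return args.oom_bias
```

Raising the value needs no privilege, while lowering it generally
requires CAP_SYS_RESOURCE. Note that this only biases the user-space
process; since the module owns most of the memory, the bias helps only
to the extent that killing the process leads promptly to the module
being unloaded.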

Martin

