This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: Making the transport layer more robust - Power Management - follow-up


On Thu, 2011-12-08 at 21:22 +0000, Turgis, Frederic wrote:
> - in the same thread, Mark remarked that STP_RELAY_TIMER_INTERVAL
> and STP_CTL_TIMER_INTERVAL (kernel polling intervals) are in fact
> tunables, so there is no need to modify the code:
>    * to match our work-arounds, we then used -D
> STP_RELAY_TIMER_INTERVAL=128 -D STP_CTL_TIMER_INTERVAL=256
>    * regular wake-ups are clearly occurring less often, with no
> tracing issue. But our trace bandwidth is generally a few hundred
> KB/s at most, so we don't really need much robustness

I noticed these aren't documented anywhere. I propose to document them
as follows:

STP_RELAY_TIMER_INTERVAL: How often the relay or ring buffers are
checked to see if readers need to be woken up to deliver new trace
data. The timer interval is given in jiffies. Defaults to
"((HZ + 99) / 100)", which is roughly every 10ms.

STP_CTL_TIMER_INTERVAL: How often control messages (system, warn, exit,
etc.) are checked to see if control channel readers need to be woken up
and notified. The timer interval is given in jiffies. Defaults to
"((HZ + 49) / 50)", which is roughly every 20ms.
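
For illustration (this is not part of the runtime, just a throwaway
program), here is how those default expressions round up to whole
jiffies for a few common CONFIG_HZ values:

  /* Standalone illustration only: how the default interval expressions
   * round up to whole jiffies for a few common CONFIG_HZ values. */
  #include <stdio.h>

  static int relay_interval(int hz) { return (hz + 99) / 100; } /* ~10ms */
  static int ctl_interval(int hz)   { return (hz + 49) / 50;  } /* ~20ms */

  int main(void)
  {
      int hz_values[] = { 100, 250, 300, 1000 };
      unsigned int i;

      for (i = 0; i < sizeof(hz_values) / sizeof(hz_values[0]); i++) {
          int hz = hz_values[i];
          printf("HZ=%4d: relay=%2d jiffies (%d ms), ctl=%2d jiffies (%d ms)\n",
                 hz, relay_interval(hz), relay_interval(hz) * 1000 / hz,
                 ctl_interval(hz), ctl_interval(hz) * 1000 / hz);
      }
      return 0;
  }

So the defaults end up polling roughly every 10ms and 20ms regardless of
the kernel tick rate, always rounding up to the next whole jiffy.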

Where should we add this documentation?

> Some more recent findings:
> - while testing fixes for an ARM backtrace issue with Mark, I got the
> message "ctl_write_msg type=2 len=61 ENOMEM" several times at the
> beginning of the test (not root-caused yet). That means a lack of
> buffers for msg type=2, which is OOB_DATA (error and warning
> messages). The test and trace data looked fine. The messages do not
> appear if I compile without -D STP_CTL_TIMER_INTERVAL=256.

Yes, that is somewhat expected. The control messages really want to be
delivered, and if you wait too long, new control messages will not have
room to be added to the buffers.

Would it help you if we also made the pool of reserved memory buffers
tunable? Currently STP_DEFAULT_BUFFERS is defined statically in either
runtime/transport/debugfs.c (40) or runtime/transport/procfs.c (256),
depending on which backend we use for the control channel.

Documentation would be something like:

STP_DEFAULT_BUFFERS: Defines the number of buffers allocated for control
messages that the module can store before they have to be read by
stapio. Defaults to 40 (8 pre-allocated one-time messages plus 32
dynamic error/warning/system messages).
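
If we did that, one possible shape (just a sketch, assuming the usual
#ifndef guard so a -DSTP_DEFAULT_BUFFERS=N on the stap command line
would take precedence) would be:

  /* Sketch only: allow -DSTP_DEFAULT_BUFFERS=N to override the default.
   * The 8 + 32 split below follows the description above. */
  #ifndef STP_DEFAULT_BUFFERS
  #define STP_DEFAULT_BUFFERS 40  /* 8 pre-allocated + 32 dynamic messages */
  #endif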

> - the last non-tunable wake-up is the timeout of the userspace data
> channel ppoll() call in reader_thread(). Without changes, we wake up
> every 200ms:
>    * we currently set it to 5s. No issue so far
>    * Mark (or someone else) suggested using bulkmode. Here are some
> findings:
>       + bulkmode sets the timeout to NULL (or 10s if NEED_PPOLL is
> set). It solves the wake-up issue. I am just wondering why we have
> NULL in bulkmode and 200ms otherwise

That is probably because not all trace data backends really support
poll/select. The ring_buffer one seems to, but the relay one doesn't. So
we would need some way to detect whether the backend really supports
select/poll before we can drop the timeout entirely. If there isn't a
bug report about this, there probably should be. Will's recent
periodic.stp example showed that stap and the stap runtime are
responsible for a noticeable number of wakeups.
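
For reference, here is an illustrative (and simplified, not the actual
stapio code) sketch of how the timeout argument to ppoll() drives these
wakeups; the 200ms value and the bulkmode NULL case follow the
discussion above:

  /* Illustrative only: a NULL timeout blocks in ppoll() until data or a
   * signal arrives, while a filled-in timespec forces a periodic wakeup
   * even when the channel is idle. */
  #define _GNU_SOURCE
  #include <poll.h>
  #include <signal.h>
  #include <stddef.h>
  #include <time.h>

  int wait_for_trace_data(int fd, int bulkmode)
  {
      struct pollfd pfd = { .fd = fd, .events = POLLIN };
      struct timespec timeout = { .tv_sec = 0, .tv_nsec = 200 * 1000 * 1000 };
      sigset_t sigs;

      sigemptyset(&sigs);  /* the real signal mask in stapio differs */

      /* bulkmode: sleep until woken; otherwise wake up every 200ms */
      return ppoll(&pfd, 1, bulkmode ? NULL : &timeout, &sigs);
  }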

>       + OMAP hotplugs cores, so core 1 is generally off at the
> beginning of a test. Therefore I don't get the trace of core 1 even
> if core 1 is used later. That makes bulkmode less usable than I
> thought (at least I still need to test with core 1 "on" at the
> beginning of a test to see the further behaviour)

Could you file a bug report about the systemtap runtime not noticing new
cores coming online for bulk mode?

> That makes the possibility of tuning the ppoll timeout value in
> non-bulkmode still interesting. I don't really know what the
> consequences of directly setting it to 1s or more would be, but a
> tunable would be a good trade-off that does not break the current
> status.
> 
> Well, I think I gave myself a few actions to perform!

Thanks for the feedback. Please let us know how tuning things
differently makes your life easier.

Cheers,

Mark

