begin and end probes are now interruptible. This creates a problem with IO because we use percpu buffers. A possible solution would be to have the translator add a call to smp_processor_id() to the probe prologue and modify the print functions to take a cpu input instead of each calling smp_processor)id().
Can you spell out why you believe there is a problem? begin/end probes are only run when no kprobes etc. are even registered. They do not run in parallel with each other. So I see no source of concurrency/reentrancy (other than the shared-buffer guest/host case I mentioned in the other bug).
Plus remember, we still block preemption during begin/end probes.
OK, this is not something we normally run into. It's easy to force the situation by calling a sleeping function in a begin probe. This is not something we normally do, but we might, so we should handle it correctly or prevent it. If the begin/end probe does sleep or get interrupted, it might not get the same cpu when it resumes, which means the output buffer will change without the previous buffer getting flushed and data will be lost. IIRC, one of the reasons we enabled preemption was to handle situations where large arrays are dumped during end probes. If interrupts are disabled, stapio cannot process the output while it is arriving and the module runs out of relayfs buffers.
(In reply to comment #3) > calling a sleeping function in a begin probe. That remains invalid, even with the slight relaxing we did earlier. > If the begin/end probe does sleep or get interrupted, it might > not get the same cpu when it resumes, Interrupts do not cause a reschedule or cpu reassingnment if preemption was blocked (as it is). > which means the output buffer will change without the > previous buffer getting flushed and data will be lost. If this is the limit of what can happen when a user schemes to call a sleeping function (which can only happen in "-g" mode anyway), then this problem is a minor one. > IIRC, one of the reasons we enabled preemption was to handle situations where > large arrays are dumped during end probes. That was only a side-effect. We wanted to produce "long" reports, for "long" meaning perhaps thousands of lines. It's not megabytes. With the status quo, SMP machines may be able to keep up with even that.
I'm not clear about whether there actually exists a bug described in comment #0, in that we do not enable sleeping or preemption during begin/end probes, so we can't switch CPUs.
(In reply to comment #5) > I'm not clear about whether there actually exists a bug described > in comment #0, in that we do not enable sleeping or preemption > during begin/end probes, so we can't switch CPUs. We don't? News to me. It's easy to test. >cat test.stp function in_interrupt:long () %{ /* pure */ THIS->__retvalue = in_interrupt(); %} function irqs_disabled:long () %{ /* pure */ THIS->__retvalue = irqs_disabled(); %} function preempt_count:long () %{ /* pure */ THIS->__retvalue = preempt_count(); %} function print_info () { printf("%s\n", pp()) printf("cpu: %d\n", cpu()) printf("in_interrupt:%d, irqs_disabled:%d, preempt_count:%d\n", in_interrupt(), irqs_disabled(), preempt_count()); print("------------------\n\n") } probe begin { print_info() } >stap test.stp begin cpu: 2 in_interrupt:0, irqs_disabled:0, preempt_count:0
(In reply to comment #6) > (In reply to comment #5) > > I'm not clear about whether there actually exists a bug described > > in comment #0, in that we do not enable sleeping or preemption > > during begin/end probes, so we can't switch CPUs. > > We don't? News to me. It's easy to test. > [...] > in_interrupt:0, irqs_disabled:0, preempt_count:0 Thanks for running the test. The preempt_count should not be 0. The generated code in enter_begin_probe clearly contains a preempt_disable(), so the question is why this does not appear to have effect.
Maybe your kernel built without CONFIG_PRREMPT. My f8 kernel only has CONFIG_PREEMPT_VOLUNTARY and CONFIG_PREEMPT_BKL.
(In reply to comment #6) > (In reply to comment #5) > > I'm not clear about whether there actually exists a bug described > > in comment #0, in that we do not enable sleeping or preemption > > during begin/end probes, so we can't switch CPUs. > > We don't? News to me. Unless you have a scenario that could explain cpu switching, or an actual test case that demonstrates it occurring, let's please close this bug.
> Unless you have a scenario that could explain cpu switching, or an > actual test case that demonstrates it occurring, let's please close > this bug. Something could only happen now if code included using "-g" slept during a begin or end probe. The only reason to ever do so that I can think of is because we decide to change IO to timeout and retry on error during end probes. And the reason to do that is because in extreme cases, on single-cpu systems, the total data collected in arrays and being dumped in the end probe exceeds the free space in the relayfs buffer. So we have to free the cpu and let stapio read from the relayfs buffer to continue. Such a situation would be rare and we could just say the proper fix is to simply use "-s" to set the buffer size larger. I'm fine with that, at least for now. We have bigger fish to fry.