This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: consistent printf race conditions


On 10/24/2012 01:40 PM, Nicholas Murphy wrote:
> I've been consistently seeing the following race condition: a printf from a close() system call pre-empts printing of a sys_wait4 hook.  Here's the sys_wait4 printf code:
> 
> printf("{\"execname\":\"%s\",\"op\":\"%s\",\"options\":%d,\"pid\":%d,\"ppid\":%d,\"timestamp\":%d,\"uid\":%d}\n",clean_string(execname()),clean_string("WAITPID_CALL"),$options,$pid,ppid(),gettimeofday_ms(),uid())
> 
> ...and here's an example of what I end up seeing printed:
> 
> {"execname":"bash","op":"WAITPID_CALL","options":10,"pid":-1,"ppid":13561,"timestamp":134{"execname":"bash","fd":3,"op":"CLOSE","pid":15121,"ppid":13562,"return":0,"timestamp":1347749678388,"uid":56441}
> 
> I don't see anything in the printf that would cause it to be pre-empted by another printf.  Any idea why I'm seeing this and what I could do to avoid it?  It seems to always be waitpid conflicting with close, so I could perhaps add an access to a common associative array to force synchronization between the two?

Can you provide a more complete script which demonstrates the error,
with at least the two close and wait probes?  And what sort of workload
do you run where it consistently fails?

I don't know why it would do this, but it will help if I can reproduce
the corruption locally.

It would also help if you could attach the output of stap-report.

Thanks,
Josh


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]