This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Discussion at Linux Foundation Japan Symposium


On Tue, Dec 23, 2008 at 01:13:06PM -0800, Roland McGrath wrote:

> I believe the actual reasons for slow progress are far more mundane than
> any dramatic intercommunity conflicts or failures to address the kernel
> development culture.  It's the variety of effects of having ~0.5 person
> ever work on the core enabling piece (utrace) that has to go in to
> enable other work.  (And 0.5 of a bit of a scatterbrain, at that.)  As
> I've seen it, there has been a great deal of hand-wringing and fret, and
> some amount of blame-seeking, but not a lot of straight-up collaborative
> hacking.  I consider this an organizational failure of the project's
> community and the organizations supporting it, of which I'm a member and
> a party to that failure.

How much more utrace work is needed?  Fedora is shipping it, right?
What still needs to be done before it can be merged?  The
characterization that I heard at the Plumbers Conference was that the
work was largely done, and it was just a matter of it getting merged.

/me refers to the utrace-devel mailing list...

Ah, I see, there are still some ptrace regression bugs.  All code has
bugs; heck, the ext4 code has had plenty of bugs, and all of my ext4
development happens on my own time, late at night or when I am on an
airplane.... 

May I gently suggest that if the SystemTap project had been more
aggressively making SystemTap usable for kernel developers, and it had
an active user community that included many more LKML denizens, that
perhaps these bugs might have been fixed more quickly?  I don't know
how to make a better case that an attitude of "kernel developers don't
matter; our primary customers are enterprise end users" is in the end
quite self-defeating.

The other suggestion I would make is to work through your organization
and get someone like Ingo Molnar to spend a week or so going through
the utrace code.  He might be able to spot race conditions and other
problems quite quickly, and otherwise make suggestions for getting the
patch ready in time for the next merge window.  After all, if this is
going to make the next RHEL cutoff, you probably need to get the code
in via this merge window or the next one to make Fedora 11.  And the
fact that the code hasn't started down the LKML merge window is
something that, if *I* were in Red Hat technical management, would make
me highly concerned indeed, and I would be trying to engage some
expert resources to assure success, and not a last minute scramble.

> (It so happens this refocusing of my time is intended to jump-start
> the DWARF size reduction work.  I won't deny this may well be "12-18
> months out" for stable release deployments, but it can't even be
> that if it doesn't start sometime, and this is an approach to the
> debuginfo issues that most people other than Ted seem to agree could
> solve many important problems for systemtap.)

I agree it would solve many problems for SystemTap, I just don't see
it reaching fruition until RHEL 7, and there is such a thing as
missing the window, where projects get defunded by companies, and
kernel developers start working on alternate schemes (like ftrace and
LTTng) because they've given up on SystemTap.  I would probably be
making similar recommendations at my company, except that I see the
potential in SystemTap, and realize there are many things that these
alternate solutions would not be able to do --- for example, Userspace
Tracing, filtering log decisions in kernel space, etc.  It's just been
frustrating seeing a project with so much potential not living up to
that potential.

						- Ted

