This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: Latest XSLTMark benchmark
- To: xsl-list at lists dot mulberrytech dot com
- Subject: Re: [xsl] Latest XSLTMark benchmark
- From: Uche Ogbuji <uche dot ogbuji at FourThought dot com>
- Date: Wed, 04 Apr 2001 16:51:58 -0600
- cc: David_N_Bertoni at lotus dot com, "Xsltmark at Datapower dot Com" <xsltmark at datapower dot com>
- Reply-To: xsl-list at lists dot mulberrytech dot com
> > You did not cover 4XSLT then, nor did any one of you contact the
> > 4Suite list to ask about realistic test driver methodology.
>
> The previous point was about Xalan, in response to David Bertoni's
> concerns. It is true that the 4XSLT driver is a new addition in
> this release, perhaps we should have put it in quarantine. Of
> course, we have also had some folks contact us since the XML.com
> article complaining that particular processors were "excluded" --
> so perhaps including 4XSLT should be viewed as a good thing.
How on earth could it be a good thing? The people who wrote in are likely
complaining because they expect that the XSLTMark is a properly controlled
test. It does them a great disservice to release a test with shoddy and obscure
procedures.
It's better not to release benchmarks at all if the necessary standards of
rigor are unattainable. There is a good reason why respectable benchmarks
tend to lag technologies by years.
> While I wasn't closely involved, I believe the driver is a
> modified example from the 4XSLT distribution.
The driver bears no resemblance to anything we've written in any example.
It's quite inscrutable, in fact, and seems rather convoluted. If you had a
clear statement of benchmark architecture I might be interested in writing a
clean match, but I hardly have the time to read the mind of whoever wrote the
current harness.
It's a rather rare event for me to have any difficulty reading someone else's
Python code, but that benchmark driver is quite an odd bit of code.
> More to the point, in the case of 4XSLT we *did* ask for help
> and, as a matter of fact, in the process one of our engineers
> located three bugs in 4XSLT for which you yourself thanked him!
>
> http://lists.fourthought.com/pipermail/4suite/2001-February/001444.html
I never said 4XSLT doesn't have bugs. I'm glad you found no more than 3. But
none of the messages you point to are particularly relevant to the resulting
benchmark.
In fact, your engineer was asking about omniORB. OmniORB has nothing
whatsoever to do with 4XSLT, which just strengthens my impression that the
benchmark driver was constructed in some ignorance.
Was there any reason not to simply post the driver and ask "is this a fair way
to harness 4XSLT for a benchmark"? It's all OSS, right? Full disclosure and
all that.
> I am not sure that XSLTMark itself was mentioned -- but presumably
> the advice is the same for normal operation and for benchmarking!
> If there is something we are doing wrong, please let us know. As
> Michael Kay pointed out a few weeks ago on this very list, it is
> not surprising that "one tends to do best in one's own benchmarks".
>
> (See http://lists.fourthought.com/pipermail/4suite/2001-February/author.html
> for the full exchange of messages between a DataPower engineer and
> 4XSLT folks).
Again, that exchange is simply irrelevant to the results of the benchmarks.
Perhaps if the queries were more direct, all this mess would have been
avoided. We can only answer the questions that are asked. We don't read
minds.
> > Your lack of documentation of method, constraints, restrictions,
> > architecture, environment, etc. is *highly* unprofessional.
>
> Actually, it is documented. And all the source code is open and
> available so anyone can:
>
> 1. review it for proper behavior
> 2. independently reproduce the results
> 3. use it for internal testing (nice of us, no?)
No. You have source code. That's only useful for one who wants to track down
every idiosyncrasy of every driver you have written. You have no general
statement of principles. You don't indicate the controls you try to maintain.
You don't point out places where those controls are unmanageable, and how
this might statistically affect your benchmark. Hell, you don't even have a
statistical confidence level for your results.
This is not the scientific method. It's pretty much voodoo benchmarking.
Saying "we provide all the code" is not very useful.
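To make the point about confidence levels concrete, here is a minimal Python sketch of how repeated timings for a single test case could be reported with an approximate 95% confidence interval. The function name and the sample timings are hypothetical and do not come from XSLTMark; the normal approximation is an assumption, and for a handful of runs a Student's t critical value would be the more careful choice.

```python
import math
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Return (mean, half_width) for an approximate 95% confidence interval.

    Assumes the normal approximation (z = 1.96). With very few runs a
    Student's t critical value would give a wider, more honest interval.
    """
    if len(samples) < 2:
        raise ValueError("need at least two timing samples")
    mean = statistics.mean(samples)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    stderr = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, z * stderr


# Hypothetical wall-clock timings, in seconds, for five runs of one test.
timings = [1.0, 1.2, 0.8, 1.1, 0.9]
mean, half_width = mean_confidence_interval(timings)
print(f"{mean:.3f}s +/- {half_width:.3f}s")
```

Publishing the interval next to each figure lets a reader judge whether the difference between two processors is larger than the run-to-run noise.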
> Now, documentation aside, is there actually something wrong with
> the driver? If you have specific problems with the 4XSLT driver,
> please let us know (at xsltmark@datapower.com), and we will set
> it right.
If you can clearly express what it is you are testing, what you need to
control, what environmental constraints you are using (everything from OS to
machine architecture to vendor and language version), then I can perhaps
construct a useful driver for you.
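As an illustration of the kind of environment statement meant here, the following is a small Python sketch that records the OS, machine architecture, and language implementation and version alongside a benchmark run. The function name and the choice of fields are assumptions for illustration only; a Java harness would need its own equivalent.

```python
import platform
import sys


def describe_environment():
    """Collect basic facts about the machine and interpreter running a test."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "language": platform.python_implementation(),
        "language_version": platform.python_version(),
        "executable": sys.executable,
    }


# Print one "key: value" line per recorded fact, to be kept with the results.
for key, value in describe_environment().items():
    print(f"{key}: {value}")
```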
I have none of that information, so without digging deeply into your Java test
harnesses, I can't provide a fair driver. This is because I take such
normalization seriously for benchmarking. You need to do so as well.
I must note that, as you yourself have admitted, you don't even have reasonable
normalization among drivers within a single environment (Java).
--
Uche Ogbuji Principal Consultant
uche.ogbuji@fourthought.com +1 303 583 9900 x 101
Fourthought, Inc. http://Fourthought.com
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list