[patch/rfc] Remove all setup_xfail's from testsuite/gdb.mi/

Andrew Cagney ac131313@redhat.com
Thu Jan 16 19:03:00 GMT 2003


> I don't think making it a requirement that go out and analyze all the
> existing XFAILs is reasonable, although it is patently something we
> need to do.  That's not the same as ripping them out and introducing
> failures in the test results without addressing those failures.



>> As a specific example, the i386 has an apparently low failure rate. 
>> That rate is badly misleading and the real number of failures is much 
>> higher :-(  It's just that those failures have been [intentionally] 
>> camouflaged using xfail.  It would be unfortunate if people, for the 
>> i386, tried to use that false result (almost zero fails) when initially 
>> setting the bar.
> 
> 
> Have you reviewed the list of XFAILs?  None of them are related to the
> i386.  One, in signals.exp, is either related to GDB's handling of
> signals or to a longstanding limitation in most operating system
> kernels, depending how you look at it.  The rest are pretty much
> platform independent.

I've been through the files and looked at the actual xfail markings. 
They are dominated by what look like CPU-specific cases (rs6000 and HP 
are especially bad offenders).

I've also noticed cases where simply yanking the xfail doesn't make 
sense - when the failure has already been analyzed (easy to spot, since 
those xfails are conditional on the debug info or compiler version).

>> This is also why I think the xfail's should simply be yanked.  It acts 
>> as a one time reset of gdb's test results, restoring them to their true 
>> values.   While this may cause the bar to start out lower than some 
>> would like,  I think that is far better and far more realistic than 
>> trying to start with a bar falsely set too high.
> 
> 
> This is a _regression_ testsuite.  I've been trying for months to get
> it down to zero failures without compromising its integrity, and I've
> just about done it for one target, by judicious use of KFAILs (and
> fixing bugs!).  The existing XFAILs all look to me like either
> legitimate XFAILs or things that should be KFAILed.  If you're going
> to rip up my test results, please sort them accordingly first.

No one is ripping up your individual and personal test results.

Several years ago some maintainers were intentionally xfailing many of 
the bugs that they had no intention of fixing.  That was wrong, and that 
needs to be fixed.

An unfortunate consequence of that action is that the zero you've been 
shooting for is really only a local minimum.  The real zero is further 
out; that zero was a mirage :-(

> It doesn't need to be done all at once.  We can put markers in .exp
> files saying "xfails audited".  But I think that we should audit
> individual files, not yank madly.

(which reminds me, the existing xfail references to bug reports need to 
be ripped out - they refer to Red Hat and HP bug databases :-().

> Am I the only one who considers well-categorized results important?

Of course not.  All the good developers on this list take the test 
results, and their analysis, very seriously.

>  If
> you introduce seventy failures, then that's another couple of weeks I
> can't just look at the results, see "oh, two failures in threads and
> that's it, I didn't break anything".

People doing proper test analysis should be comparing the summary 
(gdb.sum) files, not just the final numbers.  A summary comparison would 
show 70 XFAIL->FAIL changes, but no real regressions.

Anyway,

If the existing (bogus) xfail PR numbers are _all_ ripped out, and a 
requirement is added that all new xfails include a corresponding bug 
report, I think there is a way forward.
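Under that rule, each new xfail would carry a public GDB PR number as
setup_xfail's optional bug-report argument, roughly like this (a sketch
only: "gdb/1234" and the test are placeholders, not a real PR or test):

```tcl
# setup_xfail accepts a bug-report id after the target pattern, so
# the expected failure stays tied to a trackable entry in the public
# GDB database rather than a private Red Hat or HP one.
setup_xfail "*-*-*" "gdb/1234"
gdb_test "print bar" "\\\$1 = 7" "print bar"
```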

Andrew



