This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Improving libm-test.inc structure and maintenance
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: "Joseph S. Myers" <joseph at codesourcery dot com>
- Cc: Ondřej Bílka <neleai at seznam dot cz>, libc-alpha at sourceware dot org
- Date: Thu, 09 May 2013 09:05:46 -0400
- Subject: Re: Improving libm-test.inc structure and maintenance
- References: <Pine dot LNX dot 4 dot 64 dot 1305022244550 dot 12072 at digraph dot polyomino dot org dot uk> <20130505130414 dot GA18328 at domone dot kolej dot mff dot cuni dot cz> <Pine dot LNX dot 4 dot 64 dot 1305051340490 dot 16386 at digraph dot polyomino dot org dot uk> <20130505165832 dot GA30896 at domone> <Pine dot LNX dot 4 dot 64 dot 1305052006330 dot 16386 at digraph dot polyomino dot org dot uk> <20130509095721 dot GA28753 at domone dot kolej dot mff dot cuni dot cz> <Pine dot LNX dot 4 dot 64 dot 1305091143550 dot 27366 at digraph dot polyomino dot org dot uk>
On 05/09/2013 08:18 AM, Joseph S. Myers wrote:
> On Thu, 9 May 2013, Ondrej Bilka wrote:
>
>> On Sun, May 05, 2013 at 08:23:12PM +0000, Joseph S. Myers wrote:
>>> On Sun, 5 May 2013, Ondrej Bilka wrote:
>>>
>>>> You do not have to review if you do the following:
>>>
>>> Tools may be able to use various heuristics to reduce the number of cases
>>> presented for human review. That human review is still needed to ensure
>>> good, valid bug reports. (Note that Jakub found various bugs in MPFR in
>>> his random fma testing. You need to decide what component the bug is in
>>> before reporting it.)
>>
>> Depends on what is found. If it finds only 10 cases in a year then
>> filtering is not necessary. My main concern is that when testing finds a
>> new bug (which can be a needle in a haystack of existing bugs), everybody
>> has forgotten that it took place and did not read the logs. Some
>> notification system is necessary.
>
> Frankly, we have more need right now - much more need - for people working
> on fixing bugs than for systems detecting and filing new bugs that have
> not affected any human enough for them to file the bugs. I'd urge working
> on fixes for existing bugs in libm or any other part of glibc over new
> bug-finding systems, until the number of open bugs is much smaller than at
> present.
Fully agree.
> Few people have been interested in joining me in the patch-a-day goal,
> with a reasonable proportion of those patches being bug fixes, for
> improving glibc and dealing with the backlog of known issues. Recruit ten
> more people who actively and accurately triage new bugs on a day-by-day
> basis and work daily on fixing bugs, and your approach of more automatic
> reporting to glibc Bugzilla may become more feasible. Without those
> people, it's likely to be harmful rather than helpful to glibc development
> - even if the new bugs are in fact valid and not duplicates.
Agreed.
> Given the extremely limited resources presently spent on bug fixing and
> triage, it's important to ensure new bugs reported are of high quality so
> those resources are productively spent improving glibc rather than dealing
> with poor-quality, incorrect or duplicative bug reports.
Agreed.
>> Bugzilla is the best place for notification. The second alternative is
>> to send mail, which has a higher probability of being ignored.
>
> Any automatic tester should notify *the person running the tester*. That
> person should then take responsibility for understanding the notifications
> and producing human bug reports in glibc Bugzilla where there
> are genuinely new bugs. It's the responsibility of the person running the
> tester to deal with notifications or to find someone to do so, rather than
> dumping them directly into Bugzilla without human review. If you don't
> have the human resources to review the output of your system and produce
> good human bug reports from it, then at most put information on an
> external site and a link on the wiki to where people can find those
> external reports if they wish to look for new glibc bugs among them - but
> it will probably be largely ignored because there are too many *human* bug
> reports for the present level of work on bug fixing, even without new
> sources of potential bugs.
Agreed.
>>> I'm thinking more on the lines of John Regehr's testing of compilers with
>>> Csmith. Reporting one bug doesn't wait on other bugs being fixed if it
>>> looks to a human that they are different. Failures appearing in different
>>> functions may have the same underlying cause, while failures in the same
>>> function may have different causes - that's something a human can judge.
>>>
>> In libm, functions are mostly standalone; the same underlying cause can
>> occur only through a pattern that is repeated in the code. Then having a
>> list of the functions affected is handy.
>>
>> I do not quite follow how you use testing with Csmith. Do you generate
>> random expressions and look at how functions behave?
>
> See the bugs he's reported to GCC Bugzilla over the years - human bug
> reports, with reduced testcases - and his blog, and the papers he's
> published about finding bugs through random testing.
>
> Before working on finding glibc bugs through such random testing, it would
> be a very good idea to (a) study the existing literature in the area -
> such work should be considered as much a piece of potentially publishable
> research, as a direct contribution to glibc, and should be approached
> accordingly - and (b) pay close attention to what the people who are
> actually fixing such bugs as you might hope to find say they find is
> useful regarding reporting them, rather than starting from external
> assumptions about how you would like to handle reporting bugs, just as
> John Regehr has paid attention to reporting bugs in ways that are useful
> to the projects to which he reports them (rather than just dumping the
> original large, unreduced and unreviewed tests into Bugzilla, for
> example).
Agreed.
>>> I think automatic bug filing is always a bad idea - an automatic process
>>> may produce a list of *candidate* issues, tracked however is convenient,
>>> but the human should be in the loop before any such candidate issue
>>> becomes an actual bug report in glibc Bugzilla, not just after.
>>>
>> What about adding a separate state, for example GENERATED, that will not
>> show unless asked for?
>
> In the absence of more bug triagers and fixers, a completely separate
> tracking system should be used for automatically-generated candidate
> issues like this, not glibc Bugzilla until a human has reviewed them and
> decided they are genuine and new glibc bugs. Again, get more people
> working on bug fixing and triage, and the appropriate approaches may
> change, but get the extra people contributing *first* before dumping
> lower-quality bugs in Bugzilla.
Agreed, a distinct system for automatically generated candidate bugs would be required.
>>> Automatic closing of bugs is also a bad idea; a human needs to judge
>>> whether the whole issue is genuinely fixed or whether the commit only
>>> fixes particular cases and other parts of the same issue remain to fix.
>>>
>> A test that tests only particular cases is an inadequate test. You cannot
>> decide if an issue is fixed with tests that are green before and green
>> after. You also do not reliably know if a regression happened. Closing the
>> bug is a good way to fix it and make human add additional necessary data.
>
> Automatic systems are there as the servant of humans, not their master.
> "make human add" is fundamentally the wrong idea. If no-one is paying
> attention on a particular day when a computer detects that an issue might
> be fixed (given that the issue was reported / reviewed as valid by a human
> in the first place), the issue should remain open until someone is looking
> at it and can review the notification; it should not be quietly closed
> without that review. With extra bug reviewers, waiting for human review
> is not a burden here. Without extra bug reviewers to notice errors,
> closing a bug when it may not be properly fixed is actively destructive
> and harmful to glibc.
Agreed.
> It's impossible in advance to write a test that covers all cases, because
> until the issue has been analyzed and fixed you don't know how many
> instances of the issue appear in different places in the code, but it is
> possible to write one that covers at least one failing case, with the
> understanding that a human will need to check when it starts to pass and
> decide if the issue is really fully fixed.
Agreed.
>> I plan to write something like this but currently do not have that much
>> time. I added it to my TODO list and will probably look at it during the
>> freeze.
>>
>> Everybody would be welcome to join. What are the options for where to host it?
>
> I suggest Savannah for GNU-related free software projects. But as above,
> I advise (a) fixing existing bugs as higher priority than systems to find
> new ones; (b) understanding what people who have gone through and fixed
> hundreds of bugs in Bugzilla actually find useful and working based on
> that experience to optimize things for the people who fix bugs rather than
> optimizing for the person running an automatic system to find them; (c)
> understanding the existing literature and experiences with random testing,
> with a view to possibly making a publishable contribution to that
> literature.
>
> If you do not have that much time, any one bug fix is a valuable
> contribution to glibc and likely to be much more practical than starting a
> substantial research project on random testing. So is triage of existing
> bugs to identify if they are valid, non-duplicative and still applicable
> to current glibc.
If you don't have much time, these bugs need help getting checked in:
http://sourceware.org/bugzilla/show_bug.cgi?id=15403
http://sourceware.org/bugzilla/show_bug.cgi?id=15122
http://sourceware.org/bugzilla/show_bug.cgi?id=15431
http://sourceware.org/bugzilla/show_bug.cgi?id=15432
Cheers,
Carlos.