This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2] Improve debugging for manual safety annotations
- From: ricaljasan <ricaljasan at pacific dot net>
- To: Juan Manuel Torres Palma <j dot m dot torrespalma at gmail dot com>
- Cc: aoliva at redhat dot com, libc-alpha at sourceware dot org
- Date: Thu, 10 Mar 2016 01:48:49 -0800
- Subject: Re: [PATCH v2] Improve debugging for manual safety annotations
- Authentication-results: sourceware.org; auth=none
- References: <20160309231618 dot GA29413 at randy-betty dot homestation>
On 03/09/2016 03:16 PM, Juan Manuel Torres Palma wrote:
> hello glibc folks,
>
> I have submitted and pinged this patch for quite a while already but I get no
> feedback on it.
I'll try to help.
> diff --git a/manual/check-safety.sh b/manual/check-safety.sh
> index 2eba000..a3e08fe 100644
> --- a/manual/check-safety.sh
> +++ b/manual/check-safety.sh
> @@ -24,23 +24,48 @@
> # an explicit reason and when there's a reason for unsafety it's not
> # safe, and that there aren't duplicates remarks.
>
> -
> +# Set to ":" if no error was found, and to "false" if found.
> success=:
Really, initialized to ":", but maybe that's too nit-picky. I would say
something like, 'Initialize to ":", set to "false" on error.'
> +# Gets the name of the file and line where an error was found.
> +error_ln=
Maybe "holds" here as well? Would be consistent with below, and more
correct.
> +
> +# Holds the error message for an error.
> +error_msg=
> +
> # If no arguments are given, take all *.texi files in the current directory.
> test $# != 0 || set *.texi
> -# FIXME: check that each @deftypefu?n is followed by a @safety note,
> -# with nothing but @deftypefu?nx and comment lines in between. (There
> -# might be more stuff too).
> +
> +# Function to check errors and set $success.
> +check_and_set_error ()
> +{
> + if [ -n "$error_ln" ]
> + then
> + echo "$error_ln:Error $error_msg"
Should be "$error_ln: $error_msg". Needs a space after the colon and
"Error" wouldn't be capitalized in any case, but simply using the
uncapitalized message below is correct, per GNU Coding Standards [1].
> + success=false
> + fi
> +}
> +
> +
> +# Check that each @deftypefu?n is followed by a @safety note,
> +# with nothing but @deftypefu?nx and comment lines in between.
> +# Also indexes are allowed.
> +error_ln=$(awk -f chk-typefun.awk "$@")
> +error_msg="unexpected tag between @deftypefun and @safety."
No period at end of message [1].
> +check_and_set_error
>
>
> # Check that all safety remarks have entries for all of MT, AS and AC,
> # in this order, with an optional prelim note before them.
> -grep -n '^@safety' "$@" |
> +error_ln=$(grep -n '^@safety' "$@" |
In the awk script, you matched on "^@safety{", which seemed like a wise
choice.
> grep -v ':@safety{\(@prelim{}\)\?@mt\(un\)\?safe{.*}'\
> -'@as\(un\)\?safe{.*}@ac\(un\)\?safe{.*}}' &&
> -success=false
> +'@as\(un\)\?safe{.*}@ac\(un\)\?safe{.*}}' |
I wonder if there would be any benefit to being stricter here, and
anchoring the end. Nothing in manual/*.texi violates it at the moment,
though.
Also, there aren't any preliminary comments anywhere in the manual that
would currently require matching @prelim{.*}, but I've always wondered
about it. manual/macros.texi adds an ugly, useless colon (useless
because it ignores any comment, which means the colon is always
immediately followed by a pipe), but it is still technically possible to
provide a comment in the @prelim macro even though it won't be rendered,
so it is valid to have "@prelim{foo}" somewhere.
> +cut -d':' -f1,2)
> +
> +error_msg="safety marks are in incorrect order (MT, AS, AC)."
Or something was absent, etc. I would just say, "Invalid @safety
command", because the test above is pretty broad.
Regardless, no period [1].
> +check_and_set_error
> +
> # Check that @mt-started notes appear within @mtsafe or @mtunsafe,
> # that @as-started notes appear within @assafe or @asunsafe, and that
> @@ -49,76 +74,108 @@ success=false
> # unsafe), but let @mt have as, ac or asc before [su], and let @as
> # have a c (for cancel) before [su]. Also make sure blanks separate
> # each of the annotations.
> -grep -n '^@safety' "$@" |
> +error_ln=$(grep -n '^@safety' "$@" |
I like "^@safety{".
> grep -v ':@safety{\(@prelim{}\)\?'\
Same caveat for @prelim.
> '@mt\(un\)\?safe{\(@mt\(asc\?\|ac\)\?[su][^ ]*}\)\?'\
> '\( @mt\(asc\?\|ac\)\?[su][^ ]*}\)*}'\
> '@as\(un\)\?safe{\(@asc\?[su][^ ]*}\)\?'\
> '\( @asc\?[su][^ ]*}\)*}'\
> '@ac\(un\)\?safe{\(@ac[su][^ ]*}\)\?'\
> -'\( @ac[su][^ ]*}\)*}}' &&
> -success=false
> +'\( @ac[su][^ ]*}\)*}}' |
> +cut -d':' -f1,2)
For the record, I did analyse this, despite it being unchanged, and it
looks reasonable, and nothing currently violates it.
> +
> +error_msg="tags are uncorrectly set. Check that every "\
incorrectly
> +"remark is placed in the right tag."
[1] doesn't explicitly address multiple phrases/sentences in the
message, but I'd assume "Check" is correct here, in which case both
periods would be correct, because there is a conceptual sentence
(although [1] seems adamant about not ending with a period). Thoughts
otherwise?
> +check_and_set_error
>
> # Make sure safety lines marked as @mtsafe do not contain any
> # MT-Unsafe remark; that would be @mtu, but there could be as, ac or
> # asc between mt and u.
> -grep -n '^@safety.*@mtsafe' "$@" |
> -grep '@mt\(asc\?\|ac\)?u' "$@" &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@mtsafe' "$@" |
If we're rewriting this anyway, I'd go with "@safety{.*@mtsafe{".
Is there a potential for @(mt|a[sc])(un)?safe commands to ever change
their syntax? They're defined in manual/macros.texi, so I would assume
they're fixed (enough), but a comment in that file about the regex
dependency in this syntax/correctness-checking script might be a good
idea, if the safety rules aren't defined anywhere else.
> +grep '@mt\(asc\?\|ac\)\?u' |
Generally, I wouldn't think simply checking for this in the output from
above guarantees we found an MT-Unsafe remark inside an MT-Safe
remark---it could be anywhere. However, we should have caught the
phrase improperly placed in the checks above (e.g., in an
A[SC]-(Un)?[Ss]afe remark), so this seems sufficient, fwiw. Similar
rationale applies for subsequent tests.
> +cut -d':' -f1,2)
> +
> +error_msg="@mtsafe tag contains MT-Unsafe remark."
No period [1].
> +check_and_set_error
> # Make sure @mtunsafe lines contain at least one @mtu remark (with
> # optional as, ac or asc between mt and u).
> -grep -n '^@safety.*@mtunsafe' "$@" |
> -grep -v '@mtunsafe{.*@mt\(asc\?\|ac\)\?u' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@mtunsafe' "$@" |
^@safety{.*@mtunsafe{ (braces)
> +grep -v '@mtunsafe{.*@mt\(asc\?\|ac\)\?u' |
> +cut -d':' -f1,2)
> +
> +error_msg="@mtunsafe tag empty."
No period [1].
> +check_and_set_error
>
> # Make sure safety lines marked as @assafe do not contain any AS-Unsafe
> # remark, which could be @asu or @mtasu note (with an optional c
> # between as and u in both cases).
> -grep -n '^@safety.*@assafe' "$@" |
> -grep '@\(mt\)\?asc\?u' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@assafe' "$@" |
^@safety{.*@assafe{ (braces)
> +grep '@\(mt\)\?asc\?u' |
> +cut -d':' -f1,2)
> +
> +error_msg="@assafe tag contains AS-Unsafe remark."
No period [1]. (What do we say when this needs to happen in most, if
not all, cases below?)
> +check_and_set_error
>
> # Make sure @asunsafe lines contain at least one @asu remark (which
> # could be @ascu, or @mtasu or even @mtascu).
> -grep -n '^@safety.*@asunsafe' "$@" |
> -grep -v '@mtasc\?u.*@asunsafe\|@asunsafe{.*@asc\?u' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@asunsafe' "$@" |
> +grep -v '@mtasc\?u.*@asunsafe\|@asunsafe{.*@asc\?u' |
@mtasc\?u.*@asunsafe{\|@asunsafe{.*@asc\?u (brace)
> +cut -d':' -f1,2)
> +
> +error_msg="@asunsafe tag empty."
No period [1].
> +check_and_set_error
>
> # Make sure safety lines marked as @acsafe do not contain any
> # AC-Unsafe remark, which could be @acu, @ascu or even @mtacu or
> # @mtascu.
> -grep -n '^@safety.*@acsafe' "$@" |
> -grep '@\(mt\)\?as\?cu' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@acsafe' "$@" |
^@safety{.*@acsafe{ (braces)
> +grep '@\(mt\)\?as\?cu' |
> +cut -d':' -f1,2)
> +
> +error_msg="@acsafe tag contains AC-Unsafe remark."
No period [1].
> +check_and_set_error
>
> # Make sure @acunsafe lines contain at least one @acu remark (possibly
> # implied by @ascu, @mtacu or @mtascu).
> -grep -n '^@safety.*@acunsafe' "$@" |
> -grep -v '@\(mtas\?\|as\)cu.*@acunsafe\|@acunsafe{.*@acu' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@acunsafe' "$@" |
^@safety{.*@acunsafe{ (braces)
> +grep -v '@\(mtas\?\|as\)cu.*@acunsafe\|@acunsafe{.*@acu' |
> +cut -d':' -f1,2)
> +
> +error_msg="@acunsafe tag empty."
No period [1].
> +check_and_set_error
>
> # Make sure there aren't duplicate remarks in the same safety note.
> -grep -n '^@safety' "$@" |
> -grep '[^:]\(@\(mt\|a[sc]\)[^ {]*{[^ ]*}\).*[^:]\1' &&
> -success=false
> +error_ln=$(grep -n '^@safety' "$@" |
^@safety{
> +grep '[^:]\(@\(mt\|a[sc]\)[^ {]*{[^ ]*}\).*[^:]\1' |
> +cut -d':' -f1,2)
> +
> +error_msg="duplicated remark."
No period [1].
> +check_and_set_error
>
> # Check that comments containing safety remarks do not contain {}s,
> # that all @mt remarks appear before @as remarks, that in turn appear
> # before @ac remarks, all properly blank-separated, and that an
> # optional comment about exclusions is between []s at the end of the
> # line.
> -grep -n '^@c \+[^@ ]\+\( dup\)\?'\
> +error_ln=$(grep -n '^@c \+[^@ ]\+\( dup\)\?'\
> '\( @\(mt\|a[sc]\)[^ ]*\)*\( \[.*\]\)\?$' "$@" |
> grep -v ':@c *[^@{}]*\( @mt[^ {}]*\)*'\
> -'\( @as[^ {}]*\)*\( @ac[^ {}]*\)*\( \[.*\]\)\?$' &&
> -success=false
> +'\( @as[^ {}]*\)*\( @ac[^ {}]*\)*\( \[.*\]\)\?$' |
> +cut -d':' -f1,2)
I don't quite understand why the syntax in comments matters here, but
seeing as how this maintains the status quo, sure.
Comments could technically be "@c(omment)?", though [2].
Also, there are a few lines that look to be comments containing an
@safety remark and {}s, but aren't matched in the above test (grep
'^@c.*@safety{' manual/*.texi).
Other supposed conditions untested.
> +
> +error_msg="safety remark in wrong order (@mt, @as, @ac), no space "\
> +"between remarks or bad format for optional comment."
No period [1]. Also, not sure the test is accurate for either the
comment or message.
> +check_and_set_error
>
> # Check that comments containing safety remarks do not contain
> # duplicate remarks.
> -grep -n '^@c \+[^@ ]\+\( dup\)\?'\
> +error_ln=$(grep -n '^@c \+[^@ ]\+\( dup\)\?'\
> '\( @\(mt\|a[sc]\)[^ ]*\)*\( \[.*\]\)\?$' "$@" |
> -grep '[^:]\(@\(mt\|a[sc]\)[^ ]*\) \(.*[^:]\)\?\1\($\| \)' &&
> -success=false
> +grep '[^:]\(@\(mt\|a[sc]\)[^ ]*\) \(.*[^:]\)\?\1\($\| \)' |
> +cut -d':' -f1,2)
In lieu of the comment-syntax relevancy question, untested.
> +
> +error_msg="duplicated remark in a comment."
No period [1].
> +check_and_set_error
>
> $success
> diff --git a/manual/chk-typefun.awk b/manual/chk-typefun.awk
> new file mode 100644
> index 0000000..c895a69
> --- /dev/null
> +++ b/manual/chk-typefun.awk
> @@ -0,0 +1,46 @@
> +#! /usr/local/bin/gawk -f
Do we really assume gawk is in /usr/local? I see one other script does
(manual/xtract-typefun.awk), and 3 others use /usr/bin/awk, and only
those 4 of 36 *.awk scripts have the shebang.
> +
> +# Copyright 2016 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <http://www.gnu.org/licenses/>.
> +
> +
> +# Checks that every @deftypefun is follown by a @safety tag.
followed
I think "an @safety tag" is correct, because it's read, "followed by an
at-safety tag." (I believe they're called "at-commands".) Again, maybe
too nit-picky, since using "a" implies you've chosen to pronounce
"@safety" as "safety" and not "at-safety".
> +# Displays an error in case it is not found.
> +
> +
> +/^@deftypefun / {
> + # Found deftypefun
> +
> + # Read next line. If they are comments, keep going.
> + getline inp
> + while (match (inp, "^@c(omment)? ") ||
> + match (inp, "^@deftypefunx ") ||
> + match (inp, "^@[cp]index ")) {
> +
> + getline inp
> + }
> +
> + # Done reading comments, it's a @safety tag or
Possibly "an".
> + # we have to report error.
"an error" :)
> + if (!match (inp, "^@safety{")) {
> + printf "%s:%d\n", FILENAME, FNR
> + exit 1
> + }
> +
> + # If we get here is because tags were correctly
> + # placed.
> +}
I like it. If we're putting this in a variable (error_ln) that is
getting prefixed to a message, do we really want/need the trailing newline?
> diff --git a/manual/install.texi b/manual/install.texi
> index de9d270..1e8d323 100644
> --- a/manual/install.texi
> +++ b/manual/install.texi
> @@ -266,7 +266,8 @@ To format the @cite{GNU C Library Reference Manual} for printing, type
> @w{@code{make dvi}}. You need a working @TeX{} installation to do
> this. The distribution builds the on-line formatted version of the
> manual, as Info files, as part of the build process. You can build
> -them manually with @w{@code{make info}}.
> +them manually with @w{@code{make info}}. Moreover, it's possible
> +to get a copy in PDF format. To build it, type @w{@code{make pdf}}.
I appreciate this inclusion, and would like to point out it's possible
to `make html' as well.
Perhaps the `make pdf' mention should be moved closer to dvi, since it
has the same caveat (manual/Makefile):
TEXI2DVI = texi2dvi
TEXI2PDF = texi2dvi --pdf
rj
----
[1] https://www.gnu.org/prep/standards/html_node/Errors.html
[2]
https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Comments.html