This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [PATCH v2] Improve debugging for manual safety annotations

From: ricaljasan <ricaljasan at pacific dot net>
To: Juan Manuel Torres Palma <j dot m dot torrespalma at gmail dot com>
Cc: aoliva at redhat dot com, libc-alpha at sourceware dot org
Date: Thu, 10 Mar 2016 01:48:49 -0800
Subject: Re: [PATCH v2] Improve debugging for manual safety annotations
Authentication-results: sourceware.org; auth=none
References: <20160309231618 dot GA29413 at randy-betty dot homestation>
On 03/09/2016 03:16 PM, Juan Manuel Torres Palma wrote:
> hello glibc folks,
> 
> I have submitted and pinged this patch for quite a while already but I get no
> feedback on it.

I'll try to help.

> diff --git a/manual/check-safety.sh b/manual/check-safety.sh
> index 2eba000..a3e08fe 100644
> --- a/manual/check-safety.sh
> +++ b/manual/check-safety.sh
> @@ -24,23 +24,48 @@
>  # an explicit reason and when there's a reason for unsafety it's not
>  # safe, and that there aren't duplicates remarks.
>  
> -
> +# Set to ":" if no error was found, and to "false" if found.
>  success=:

Really, initialized to ":", but maybe that's too nit-picky.  I would say
something like, 'Initialize to ":", set to "false" on error.'

> +# Gets the name of the file and line where an error was found.
> +error_ln=

Maybe "holds" here as well?  Would be consistent with below, and more
correct.

> +
> +# Holds the error message for an error.
> +error_msg=
> +
>  # If no arguments are given, take all *.texi files in the current directory.
>  test $# != 0 || set *.texi

> -# FIXME: check that each @deftypefu?n is followed by a @safety note,
> -# with nothing but @deftypefu?nx and comment lines in between.  (There
> -# might be more stuff too).
> +
> +# Function to check errors and set $success.
> +check_and_set_error ()
> +{
> +  if [ -n "$error_ln" ]
> +  then
> +    echo "$error_ln:Error $error_msg"

Should be "$error_ln: $error_msg".  Needs a space after the colon and
"Error" wouldn't be capitalized in any case, but simply using the
uncapitalized message below is correct, per GNU Coding Standards [1].

> +    success=false
> +  fi
> +}
> +
> +
> +# Check that each @deftypefu?n is followed by a @safety note,
> +# with nothing but @deftypefu?nx and comment lines in between.
> +# Also indexes are allowed.
> +error_ln=$(awk -f chk-typefun.awk "$@")
> +error_msg="unexpected tag between @deftypefun and @safety."

No period at end of message [1].

> +check_and_set_error
>  
>  
>  # Check that all safety remarks have entries for all of MT, AS and AC,
>  # in this order, with an optional prelim note before them.
> -grep -n '^@safety' "$@" |
> +error_ln=$(grep -n '^@safety' "$@" |

In the awk script, you matched on "^@safety{", which seemed like a wise
choice.

>  grep -v ':@safety{\(@prelim{}\)\?@mt\(un\)\?safe{.*}'\
> -'@as\(un\)\?safe{.*}@ac\(un\)\?safe{.*}}' &&
> -success=false
> +'@as\(un\)\?safe{.*}@ac\(un\)\?safe{.*}}' |

I wonder if there would be any benefit to being stricter here, and
anchoring the end.  Nothing in manual/*.texi violates it at the moment,
though.

Also, there aren't any preliminary comments anywhere in the manual that
would currently require matching @prelim{.*}, but I've always wondered
about it.  manual/macros.texi adds an ugly, useless colon (useless
because it ignores any comment, which means the colon is always
immediately followed by a pipe), but it is still technically possible to
provide a comment in the @prelim macro even though it won't be rendered,
so it is valid to have "@prelim{foo}" somewhere.

> +cut -d':' -f1,2)
> +
> +error_msg="safety marks are in incorrect order (MT, AS, AC)."

Or something was absent, etc.  I would just say, "Invalid @safety
command", because the test above is pretty broad.

Regardless, no period [1].

> +check_and_set_error
> +

>  # Check that @mt-started notes appear within @mtsafe or @mtunsafe,
>  # that @as-started notes appear within @assafe or @asunsafe, and that
> @@ -49,76 +74,108 @@ success=false
>  # unsafe), but let @mt have as, ac or asc before [su], and let @as
>  # have a c (for cancel) before [su].  Also make sure blanks separate
>  # each of the annotations.
> -grep -n '^@safety' "$@" |
> +error_ln=$(grep -n '^@safety' "$@" |

I like "^@safety{".

>  grep -v ':@safety{\(@prelim{}\)\?'\

Same caveat for @prelim.

>  '@mt\(un\)\?safe{\(@mt\(asc\?\|ac\)\?[su][^ ]*}\)\?'\
>  '\( @mt\(asc\?\|ac\)\?[su][^ ]*}\)*}'\
>  '@as\(un\)\?safe{\(@asc\?[su][^ ]*}\)\?'\
>  '\( @asc\?[su][^ ]*}\)*}'\
>  '@ac\(un\)\?safe{\(@ac[su][^ ]*}\)\?'\
> -'\( @ac[su][^ ]*}\)*}}' &&
> -success=false
> +'\( @ac[su][^ ]*}\)*}}' |
> +cut -d':' -f1,2)

For the record, I did analyse this, despite it being unchanged, and it
looks reasonable, and nothing currently violates it.

> +
> +error_msg="tags are uncorrectly set. Check that every "\

incorrectly

> +"remark is placed in the right tag."

[1] doesn't explicitly address multiple phrases/sentences in the
message, but I'd assume "Check" is correct here, in which case both
periods would be correct, because there is a conceptual sentence
(although [1] seems adamant about not ending with a period).  Thoughts
otherwise?

> +check_and_set_error
>  
>  # Make sure safety lines marked as @mtsafe do not contain any
>  # MT-Unsafe remark; that would be @mtu, but there could be as, ac or
>  # asc between mt and u.
> -grep -n '^@safety.*@mtsafe' "$@" |
> -grep '@mt\(asc\?\|ac\)?u' "$@" &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@mtsafe' "$@" |

If we're rewriting this anyway, I'd go with "@safety{.*@mtsafe{".

Is there a potential for @(mt|a[sc])(un)?safe commands to ever change
their syntax? They're defined in manual/macros.texi, so I would assume
they're fixed (enough), but a comment in that file about the regex
dependency in this syntax/correctness-checking script might be a good
idea, if the safety rules aren't defined anywhere else.

> +grep '@mt\(asc\?\|ac\)\?u' |

Generally, I wouldn't think simply checking for this in the output from
above guarantees we found an MT-Unsafe remark inside an MT-Safe
remark---it could be anywhere.  However, we should have caught the
phrase improperly placed in the checks above (e.g., in an
A[SC]-(Un)?[Ss]afe remark), so this seems sufficient, fwiw.  Similar
rationale applies for subsequent tests.

> +cut -d':' -f1,2)
> +
> +error_msg="@mtsafe tag contains MT-Unsafe remark."

No period [1].

> +check_and_set_error

>  # Make sure @mtunsafe lines contain at least one @mtu remark (with
>  # optional as, ac or asc between mt and u).
> -grep -n '^@safety.*@mtunsafe' "$@" |
> -grep -v '@mtunsafe{.*@mt\(asc\?\|ac\)\?u' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@mtunsafe' "$@" |

^@safety{.*@mtunsafe{ (braces)

> +grep -v '@mtunsafe{.*@mt\(asc\?\|ac\)\?u' |
> +cut -d':' -f1,2)
> +
> +error_msg="@mtunsafe tag empty."

No period [1].

> +check_and_set_error
>  
>  # Make sure safety lines marked as @assafe do not contain any AS-Unsafe
>  # remark, which could be @asu or @mtasu note (with an optional c
>  # between as and u in both cases).
> -grep -n '^@safety.*@assafe' "$@" |
> -grep '@\(mt\)\?asc\?u' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@assafe' "$@" |

^@safety{.*@assafe{  (braces)

> +grep '@\(mt\)\?asc\?u' |
> +cut -d':' -f1,2)
> +
> +error_msg="@assafe tag contains AS-Unsafe remark."

No period [1].  (What do we say when this needs to happen in most, if
not all, cases below?)

> +check_and_set_error
>  
>  # Make sure @asunsafe lines contain at least one @asu remark (which
>  # could be @ascu, or @mtasu or even @mtascu).
> -grep -n '^@safety.*@asunsafe' "$@" |
> -grep -v '@mtasc\?u.*@asunsafe\|@asunsafe{.*@asc\?u' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@asunsafe' "$@" |
> +grep -v '@mtasc\?u.*@asunsafe\|@asunsafe{.*@asc\?u' |

@mtasc\?u.*@asunsafe{\|@asunsafe{.*@asc\?u  (brace)

> +cut -d':' -f1,2)
> +
> +error_msg="@asunsafe tag empty."

No period [1].

> +check_and_set_error
>  
>  # Make sure safety lines marked as @acsafe do not contain any
>  # AC-Unsafe remark, which could be @acu, @ascu or even @mtacu or
>  # @mtascu.
> -grep -n '^@safety.*@acsafe' "$@" |
> -grep '@\(mt\)\?as\?cu' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@acsafe' "$@" |

^@safety{.*@acsafe{  (braces)

> +grep '@\(mt\)\?as\?cu' |
> +cut -d':' -f1,2)
> +
> +error_msg="@acsafe tag contains AC-Unsafe remark."

No period [1].

> +check_and_set_error
>  
>  # Make sure @acunsafe lines contain at least one @acu remark (possibly
>  # implied by @ascu, @mtacu or @mtascu).
> -grep -n '^@safety.*@acunsafe' "$@" |
> -grep -v '@\(mtas\?\|as\)cu.*@acunsafe\|@acunsafe{.*@acu' &&
> -success=false
> +error_ln=$(grep -n '^@safety.*@acunsafe' "$@" |

^@safety{.*@acunsafe{  (braces)

> +grep -v '@\(mtas\?\|as\)cu.*@acunsafe\|@acunsafe{.*@acu' |
> +cut -d':' -f1,2)
> +
> +error_msg="@acunsafe tag empty."

No period [1].

> +check_and_set_error
>  
>  # Make sure there aren't duplicate remarks in the same safety note.
> -grep -n '^@safety' "$@" |
> -grep '[^:]\(@\(mt\|a[sc]\)[^ {]*{[^ ]*}\).*[^:]\1' &&
> -success=false
> +error_ln=$(grep -n '^@safety' "$@" |

^@safety{

> +grep '[^:]\(@\(mt\|a[sc]\)[^ {]*{[^ ]*}\).*[^:]\1' |
> +cut -d':' -f1,2)
> +
> +error_msg="duplicated remark."

No period [1].

> +check_and_set_error
>  
>  # Check that comments containing safety remarks do not contain {}s,
>  # that all @mt remarks appear before @as remarks, that in turn appear
>  # before @ac remarks, all properly blank-separated, and that an
>  # optional comment about exclusions is between []s at the end of the
>  # line.
> -grep -n '^@c \+[^@ ]\+\( dup\)\?'\
> +error_ln=$(grep -n '^@c \+[^@ ]\+\( dup\)\?'\
>  '\( @\(mt\|a[sc]\)[^ ]*\)*\( \[.*\]\)\?$' "$@" |
>  grep -v ':@c *[^@{}]*\( @mt[^ {}]*\)*'\
> -'\( @as[^ {}]*\)*\( @ac[^ {}]*\)*\( \[.*\]\)\?$' &&
> -success=false
> +'\( @as[^ {}]*\)*\( @ac[^ {}]*\)*\( \[.*\]\)\?$' |
> +cut -d':' -f1,2)

I don't quite understand why the syntax in comments matters here, but
seeing as how this maintains the status quo, sure.

Comments could technically be "@c(omment)?", though [2].

Also, there are a few lines that look to be comments containing an
@safety remark and {}s, but aren't matched in the above test (grep
'^@c.*@safety{' manual/*.texi).

Other supposed conditions untested.

> +
> +error_msg="safety remark in wrong order (@mt, @as, @ac), no space "\
> +"between remarks or bad format for optional comment."

No period [1].  Also, not sure the test is accurate for either the
comment or message.

> +check_and_set_error
>  
>  # Check that comments containing safety remarks do not contain
>  # duplicate remarks.
> -grep -n '^@c \+[^@ ]\+\( dup\)\?'\
> +error_ln=$(grep -n '^@c \+[^@ ]\+\( dup\)\?'\
>  '\( @\(mt\|a[sc]\)[^ ]*\)*\( \[.*\]\)\?$' "$@" |
> -grep '[^:]\(@\(mt\|a[sc]\)[^ ]*\) \(.*[^:]\)\?\1\($\| \)' &&
> -success=false
> +grep '[^:]\(@\(mt\|a[sc]\)[^ ]*\) \(.*[^:]\)\?\1\($\| \)' |
> +cut -d':' -f1,2)

In lieu of the comment-syntax relevancy question, untested.

> +
> +error_msg="duplicated remark in a comment."

No period [1].

> +check_and_set_error
>  
>  $success

> diff --git a/manual/chk-typefun.awk b/manual/chk-typefun.awk
> new file mode 100644
> index 0000000..c895a69
> --- /dev/null
> +++ b/manual/chk-typefun.awk
> @@ -0,0 +1,46 @@
> +#! /usr/local/bin/gawk -f

Do we really assume gawk is in /usr/local?  I see one other script does
(manual/xtract-typefun.awk), and 3 others use /usr/bin/awk, and only
those 4 of 36 *.awk scripts have the shebang.

> +
> +# Copyright 2016 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <http://www.gnu.org/licenses/>.
> +
> +
> +# Checks that every @deftypefun is follown by a @safety tag.

followed

I think "an @safety tag" is correct, because it's read, "followed by an
at-safety tag."  (I believe they're called "at-commands".)  Again, maybe
too nit-picky, since using "a" implies you've chosen to pronounce
"@safety" as "safety" and not "at-safety".

> +# Displays an error in case it is not found.
> +
> +
> +/^@deftypefun / {
> +  # Found deftypefun
> +
> +  # Read next line. If they are comments, keep going.
> +  getline inp
> +  while (match (inp, "^@c(omment)? ") ||
> +         match (inp, "^@deftypefunx ") ||
> +         match (inp, "^@[cp]index ")) {
> +
> +    getline inp
> +  }
> +
> +  # Done reading comments, it's a @safety tag or

Possibly "an".

> +  # we have to report error.

"an error"  :)

> +  if (!match (inp, "^@safety{")) {
> +    printf "%s:%d\n", FILENAME, FNR
> +    exit 1
> +  }
> +
> +  # If we get here is because tags were correctly
> +  # placed.
> +}

I like it.  If we're putting this in a variable (error_ln) that is
getting prefixed to a message, do we really want/need the trailing newline?

> diff --git a/manual/install.texi b/manual/install.texi
> index de9d270..1e8d323 100644
> --- a/manual/install.texi
> +++ b/manual/install.texi
> @@ -266,7 +266,8 @@ To format the @cite{GNU C Library Reference Manual} for printing, type
>  @w{@code{make dvi}}.  You need a working @TeX{} installation to do
>  this.  The distribution builds the on-line formatted version of the
>  manual, as Info files, as part of the build process.  You can build
> -them manually with @w{@code{make info}}.
> +them manually with @w{@code{make info}}. Moreover, it's possible
> +to get a copy in PDF format. To build it, type @w{@code{make pdf}}.

I appreciate this inclusion, and would like to point out it's possible
to `make html' as well.

Perhaps the `make pdf' mention should be moved closer to dvi, since it
has the same caveat (manual/Makefile):

  TEXI2DVI = texi2dvi
  TEXI2PDF = texi2dvi --pdf


rj

----
[1] https://www.gnu.org/prep/standards/html_node/Errors.html
[2]
https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Comments.html
Follow-Ups:
- Re: [PATCH v2] Improve debugging for manual safety annotations
  - From: Juan Manuel Torres Palma
References:
- [PATCH v2] Improve debugging for manual safety annotations
  - From: Juan Manuel Torres Palma
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]