This is the mail archive of the glibc-cvs@sourceware.org mailing list for the glibc project.



GNU C Library master sources branch master updated. glibc-2.24-168-gf4a3654


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  f4a36548d86453792e3db05898f6f2b732c32581 (commit)
      from  85f7554cd97e7f03d8dc66278653045ef63a2221 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
http://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=f4a36548d86453792e3db05898f6f2b732c32581

commit f4a36548d86453792e3db05898f6f2b732c32581
Author: Florian Weimer <fweimer@redhat.com>
Date:   Wed Sep 21 15:41:17 2016 +0200

    manual: Clarify the documentation of strverscmp [BZ #20524]

diff --git a/ChangeLog b/ChangeLog
index 825d4f4..b1a6692 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
 2016-09-21  Florian Weimer  <fweimer@redhat.com>
 
+	[BZ #20524]
+	* manual/string.texi (String/Array Comparison): Clarify the
+	strverscmp behavior.
+
+2016-09-21  Florian Weimer  <fweimer@redhat.com>
+
 	* test-skeleton.c (xasprintf): Add function.
 	* io/tst-open-tmpfile.c: New test.
 	* io/Makefile (tests): Add it.
diff --git a/manual/string.texi b/manual/string.texi
index bce81a7..1986357 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -1374,46 +1374,75 @@ The @code{strverscmp} function compares the string @var{s1} against
 @var{s2}, considering them as holding indices/version numbers.  The
 return value follows the same conventions as found in the
 @code{strcmp} function.  In fact, if @var{s1} and @var{s2} contain no
-digits, @code{strverscmp} behaves like @code{strcmp}.
+digits, @code{strverscmp} behaves like @code{strcmp}
+(in the sense that the sign of the result is the same).
 
-Basically, we compare strings normally (byte by byte), until
-we find a digit in each string - then we enter a special comparison
-mode, where each sequence of digits is taken as a whole.  If we reach the
-end of these two parts without noticing a difference, we return to the
-standard comparison mode.  There are two types of numeric parts:
-"integral" and "fractional" (those  begin with a '0').  The types
-of the numeric parts affect the way we sort them:
+The comparison algorithm which the @code{strverscmp} function implements
+differs slightly from other version-comparison algorithms.  The
+implementation is based on a finite-state machine, whose behavior is
+approximated below.
 
 @itemize @bullet
 @item
-integral/integral: we compare values as you would expect.
+The input strings are each split into sequences of non-digits and
+digits.  These sequences can be empty at the beginning and end of the
+string.  Digits are determined by the @code{isdigit} function and are
+thus subject to the current locale.
 
 @item
-fractional/integral: the fractional part is less than the integral one.
-Again, no surprise.
+Comparison starts with a (possibly empty) non-digit sequence.  The first
+non-equal sequences of non-digits or digits determines the outcome of
+the comparison.
 
 @item
-fractional/fractional: the things become a bit more complex.
-If the common prefix contains only leading zeroes, the longest part is less
-than the other one; else the comparison behaves normally.
+Corresponding non-digit sequences in both strings are compared
+lexicographically if their lengths are equal.  If the lengths differ,
+the shorter non-digit sequence is extended with the input string
+character immediately following it (which may be the null terminator),
+the other sequence is truncated to be of the same (extended) length, and
+these two sequences are compared lexicographically.  In the last case,
+the sequence comparison determines the result of the function because
+the extension character (or some character before it) is necessarily
+different from the character at the same offset in the other input
+string.
+
+@item
+For two sequences of digits, the number of leading zeros is counted (which
+can be zero).  If the count differs, the string with more leading zeros
+in the digit sequence is considered smaller than the other string.
+
+@item
+If the two sequences of digits have no leading zeros, they are compared
+as integers, that is, the string with the longer digit sequence is
+deemed larger, and if both sequences are of equal length, they are
+compared lexicographically.
+
+@item
+If both digit sequences start with a zero and have an equal number of
+leading zeros, they are compared lexicographically if their lengths are
+the same.  If the lengths differ, the shorter sequence is extended with
+the following character in its input string, and the other sequence is
+truncated to the same length, and both sequences are compared
+lexicographically (similar to the non-digit sequence case above).
 @end itemize
 
+The treatment of leading zeros and the tie-breaking extension characters
+(which in effect propagate across non-digit/digit sequence boundaries)
+differs from other version-comparison algorithms.
+
 @smallexample
 strverscmp ("no digit", "no digit")
     @result{} 0    /* @r{same behavior as strcmp.} */
 strverscmp ("item#99", "item#100")
     @result{} <0   /* @r{same prefix, but 99 < 100.} */
 strverscmp ("alpha1", "alpha001")
-    @result{} >0   /* @r{fractional part inferior to integral one.} */
+    @result{} >0   /* @r{different number of leading zeros (0 and 2).} */
 strverscmp ("part1_f012", "part1_f01")
-    @result{} >0   /* @r{two fractional parts.} */
+    @result{} >0   /* @r{lexicographical comparison with leading zeros.} */
 strverscmp ("foo.009", "foo.0")
-    @result{} <0   /* @r{idem, but with leading zeroes only.} */
+    @result{} <0   /* @r{different number of leading zeros (2 and 1).} */
 @end smallexample
 
-This function is especially useful when dealing with filename sorting,
-because filenames frequently hold indices/version numbers.
-
 @code{strverscmp} is a GNU extension.
 @end deftypefun
 

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog          |    6 ++++
 manual/string.texi |   69 +++++++++++++++++++++++++++++++++++++---------------
 2 files changed, 55 insertions(+), 20 deletions(-)
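
The examples in the revised manual text can be tried with a small C
sketch such as the one below. It is only an illustration and not part
of the commit. The helper names versign and compare_versions are
hypothetical; strverscmp itself is a GNU extension, so _GNU_SOURCE has
to be defined before <string.h> is included. Only the sign of the
result is meaningful, so the first helper reduces it to -1, 0, or 1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* strverscmp is a GNU extension.  */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reduce the strverscmp result to -1, 0, or 1,
   because the manual only specifies the sign of the result.  */
static int
versign (const char *s1, const char *s2)
{
  int r = strverscmp (s1, s2);
  return (r > 0) - (r < 0);
}

/* Hypothetical qsort comparator for an array of char * elements,
   ordering them the way strverscmp does.  */
static int
compare_versions (const void *a, const void *b)
{
  const char *const *s1 = a;
  const char *const *s2 = b;
  return strverscmp (*s1, *s2);
}
```

With these helpers, versign ("item#99", "item#100") is negative and
versign ("alpha1", "alpha001") is positive, matching the examples in
the patch. Passing compare_versions to qsort sorts an array of strings
in the same order.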


hooks/post-receive
-- 
GNU C Library master sources

