This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Implement strlcat [BZ#178]


On 12/04/2015 05:41 AM, Florian Weimer wrote:
Looks good.

Thanks, I installed the first patch. Revised versions of the other two patches are attached. Responding to your comments:

_FORTIFY_SOURCE mostly covers the
write-to-statically-sized-buffer case.  The manual should not make
promises the current GCC/glibc combinations cannot deliver.
_FORTIFY_SOURCE will always be brittle for the more complex cases
because of the dependency on GCC optimization behavior.

It's also highly application-specific whether a crash (induced by
_FORTIFY_SOURCE) or truncation (from strncpy or strlcpy) is better.


Thanks, good points. I addressed them in the attached patches by replacing that paragraph with the following text.

Although some buffer overruns can be prevented by manually replacing
calls to copying functions with calls to truncation functions,
nowadays there are easier and more-reliable automatic techniques that
cause buffer overruns to reliably terminate a program.  These include
GCC's @option{-fsanitize=address} option and, if the destination
buffer is statically sized, defining the @code{_FORTIFY_SOURCE} macro.
Because truncation functions can mask application bugs that would
otherwise be caught by the automatic techniques, these functions
should be used only when the application's underlying logic requires
truncation.
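
As an illustration of the last sentence (this is not part of the attached patches), here is a minimal sketch of how a truncating copy can hide a bug that an explicit length check would expose. The helper names copy_truncating and copy_if_fits are hypothetical and exist only for this example; it demonstrates neither -fsanitize=address nor _FORTIFY_SOURCE.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy SRC into DST (DSTSIZE bytes), silently
   truncating if SRC does not fit.  The result is always terminated
   when DSTSIZE is nonzero.  */
static void
copy_truncating (char *dst, size_t dstsize, const char *src)
{
  snprintf (dst, dstsize, "%s", src);
}

/* Hypothetical helper: copy SRC into DST only if SRC fits, including
   its terminating null byte.  Return false, leaving DST untouched,
   instead of truncating.  */
static bool
copy_if_fits (char *dst, size_t dstsize, const char *src)
{
  size_t len = strlen (src);
  if (len >= dstsize)
    return false;
  memcpy (dst, src, len + 1);
  return true;
}
```

With a five-byte buffer, copy_truncating turns "alice2" into "alic", a well-formed string that may name a different user, whereas copy_if_fits reports the problem to the caller.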


Or perhaps you'd rather not document _FORTIFY_SOURCE at all? I notice it's mentioned nowhere in the manual; is that intended? If so, I can further revise accordingly.

The attached 2nd patch also alters the proposed strlcpy+strlcat documentation as per my more-recent emails. One more thing: the attached 1st patch also removes the strncat example, as I discovered that it's quite misleading and the manual shouldn't be pushing strncat anyway.
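
To make the problem with the strncpy+strncat pattern concrete, here is a minimal sketch (not taken from the patches; the helper name copy_is_terminated is hypothetical). It shows that strncpy leaves the destination unterminated whenever the source has at least as many bytes as the destination, which is why a later strlen on that buffer would have undefined behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: copy SRC into DST (DSTSIZE bytes) with strncpy
   and report whether DST ended up null-terminated.  memchr is used
   rather than strlen so that the check never reads past DST.  */
static bool
copy_is_terminated (char *dst, size_t dstsize, const char *src)
{
  assert (dstsize > 0);
  strncpy (dst, src, dstsize);
  return memchr (dst, '\0', dstsize) != NULL;
}
```

For a 10-byte buffer this returns true for "hello" but false for "hello, world", and also for any source of exactly 10 bytes.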
>From 8399e174f6e87a75007e1e5e173eae7e75a3bf3a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 4 Dec 2015 09:25:00 -0800
Subject: [PATCH 1/2] Split large string section; add truncation advice

* manual/examples/strncat.c: Remove.
This example was misleading, as the code would have undefined
behavior if "hello" was longer than SIZE.  Anyway, the manual
shouldn't encourage strncpy+strncat for this sort of thing.
* manual/string.texi (Copying Strings and Arrays): Split into
three sections Copying Strings and Arrays, Concatenating Strings,
and Truncating Strings, as this section was way too long.  All
cross-references changed.  Add advice about string-truncation
functions.  Remove misleading strncat example.
---
 ChangeLog                 |  11 ++
 manual/examples/strncat.c |  32 ----
 manual/lang.texi          |   2 +-
 manual/locale.texi        |   4 +-
 manual/memory.texi        |   2 +-
 manual/stdio.texi         |   2 +-
 manual/string.texi        | 438 ++++++++++++++++++++++++++--------------------
 7 files changed, 263 insertions(+), 228 deletions(-)
 delete mode 100644 manual/examples/strncat.c

diff --git a/ChangeLog b/ChangeLog
index 266df03..083a08d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,16 @@
 2015-12-04  Paul Eggert  <eggert@cs.ucla.edu>
 
+	Split large string section; add truncation advice
+	* manual/examples/strncat.c: Remove.
+	This example was misleading, as the code would have undefined
+	behavior if "hello" was longer than SIZE.  Anyway, the manual
+	shouldn't encourage strncpy+strncat for this sort of thing.
+	* manual/string.texi (Copying Strings and Arrays): Split into
+	three sections Copying Strings and Arrays, Concatenating Strings,
+	and Truncating Strings, as this section was way too long.  All
+	cross-references changed.  Add advice about string-truncation
+	functions.  Remove misleading strncat example.
+
 	Consistency about byte vs character in string.texi
 	* manual/string.texi (String and Array Utilities):
 	Distinguish more carefully among bytes, multibyte characters,
diff --git a/manual/examples/strncat.c b/manual/examples/strncat.c
deleted file mode 100644
index 509be49..0000000
--- a/manual/examples/strncat.c
+++ /dev/null
@@ -1,32 +0,0 @@
-/* strncat example.
-   Copyright (C) 1991-2015 Free Software Foundation, Inc.
-
-   This program is free software; you can redistribute it and/or
-   modify it under the terms of the GNU General Public License
-   as published by the Free Software Foundation; either version 2
-   of the License, or (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, if not, see <http://www.gnu.org/licenses/>.
-*/
-
-#include <string.h>
-#include <stdio.h>
-
-#define SIZE 10
-
-static char buffer[SIZE];
-
-int
-main (void)
-{
-  strncpy (buffer, "hello", SIZE);
-  puts (buffer);
-  strncat (buffer, ", world", SIZE - strlen (buffer) - 1);
-  puts (buffer);
-}
diff --git a/manual/lang.texi b/manual/lang.texi
index 28b21cb..7f8a368 100644
--- a/manual/lang.texi
+++ b/manual/lang.texi
@@ -582,7 +582,7 @@ type that exists only for this purpose.
 This is an unsigned integer type used to represent the sizes of objects.
 The result of the @code{sizeof} operator is of this type, and functions
 such as @code{malloc} (@pxref{Unconstrained Allocation}) and
-@code{memcpy} (@pxref{Copying and Concatenation}) accept arguments of
+@code{memcpy} (@pxref{Copying Strings and Arrays}) accept arguments of
 this type to specify object sizes.  On systems using @theglibc{}, this
 will be @w{@code{unsigned int}} or @w{@code{unsigned long int}}.
 
diff --git a/manual/locale.texi b/manual/locale.texi
index ee1c3a1..1828500 100644
--- a/manual/locale.texi
+++ b/manual/locale.texi
@@ -374,8 +374,8 @@ a null pointer as the @var{locale} argument.  In this case,
 currently selected for category @var{category}.
 
 The string returned by @code{setlocale} can be overwritten by subsequent
-calls, so you should make a copy of the string (@pxref{Copying and
-Concatenation}) if you want to save it past any further calls to
+calls, so you should make a copy of the string (@pxref{Copying Strings
+and Arrays}) if you want to save it past any further calls to
 @code{setlocale}.  (The standard library is guaranteed never to call
 @code{setlocale} itself.)
 
diff --git a/manual/memory.texi b/manual/memory.texi
index cea2cd7..700555e 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -547,7 +547,7 @@ The contents of the block are undefined; you must initialize it yourself
 Normally you would cast the value as a pointer to the kind of object
 that you want to store in the block.  Here we show an example of doing
 so, and of initializing the space with zeros using the library function
-@code{memset} (@pxref{Copying and Concatenation}):
+@code{memset} (@pxref{Copying Strings and Arrays}):
 
 @smallexample
 struct foo *ptr;
diff --git a/manual/stdio.texi b/manual/stdio.texi
index c0753b1..0326f29 100644
--- a/manual/stdio.texi
+++ b/manual/stdio.texi
@@ -2428,7 +2428,7 @@ the array @var{s}, not including the terminating null character.
 The behavior of this function is undefined if copying takes place
 between objects that overlap---for example, if @var{s} is also given
 as an argument to be printed under control of the @samp{%s} conversion.
-@xref{Copying and Concatenation}.
+@xref{Copying Strings and Arrays}.
 
 @strong{Warning:} The @code{sprintf} function can be @strong{dangerous}
 because it can potentially output more characters than can fit in the
diff --git a/manual/string.texi b/manual/string.texi
index 4f276a9..5b11b3b 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -25,8 +25,9 @@ too.
 * String/Array Conventions::    Whether to use a string function or an
 				 arbitrary array function.
 * String Length::               Determining the length of a string.
-* Copying and Concatenation::   Functions to copy the contents of strings
-				 and arrays.
+* Copying Strings and Arrays::  Functions to copy strings and arrays.
+* Concatenating Strings::       Functions to concatenate strings while copying.
+* Truncating Strings::          Functions to truncate strings while copying.
 * String/Array Comparison::     Functions for byte-wise and character-wise
 				 comparison.
 * Collation Functions::         Functions for collating strings.
@@ -341,14 +342,13 @@ This function is a GNU extension and is declared in @file{string.h}.
 This function is a GNU extension and is declared in @file{wchar.h}.
 @end deftypefun
 
-@node Copying and Concatenation
-@section Copying and Concatenation
+@node Copying Strings and Arrays
+@section Copying Strings and Arrays
 
 You can use the functions described in this section to copy the contents
-of strings and arrays, or to append the contents of one string to
-another.  The @samp{str} and @samp{mem} functions are declared in the
-header file @file{string.h} while the @samp{wstr} and @samp{wmem}
-functions are declared in the file @file{wchar.h}.
+of strings, wide strings, and arrays.  The @samp{str} and @samp{mem}
+functions are declared in @file{string.h} while the @samp{w} functions
+are declared in @file{wchar.h}.
 @pindex string.h
 @pindex wchar.h
 @cindex copying strings and arrays
@@ -359,8 +359,10 @@ functions are declared in the file @file{wchar.h}.
 
 A helpful way to remember the ordering of the arguments to the functions
 in this section is that it corresponds to an assignment expression, with
-the destination array specified to the left of the source array.  All
-of these functions return the address of the destination array.
+the destination array specified to the left of the source array.  Most
+of these functions return the address of the destination array; a few
+return the address of the destination's terminating null, or of just
+past the destination.
 
 Most of these functions do not work properly if the source and
 destination arrays overlap.  For example, if the beginning of the
@@ -572,63 +574,6 @@ including the terminating null wide character) into the string
 the strings overlap.  The return value is the value of @var{wto}.
 @end deftypefun
 
-@comment string.h
-@comment ISO
-@deftypefun {char *} strncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
-@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This function is similar to @code{strcpy} but always copies exactly
-@var{size} bytes into @var{to}.
-
-If @var{from} does not contain a null byte in its first @var{size}
-bytes, @code{strncpy} copies just the first @var{size} bytes.  In this
-case no null terminator is written into @var{to}.
-
-Otherwise @var{from} must be a string with length less than
-@var{size}.  In this case @code{strncpy} copies all of @var{from},
-followed by enough null bytes to add up to @var{size} bytes in all.
-This behavior is rarely useful, but it
-is specified by the @w{ISO C} standard.
-
-The behavior of @code{strncpy} is undefined if the strings overlap.
-
-Using @code{strncpy} as opposed to @code{strcpy} is a way to avoid bugs
-relating to writing past the end of the allocated space for @var{to}.
-However, it can also make your program much slower in one common case:
-copying a string which is probably small into a potentially large buffer.
-In this case, @var{size} may be large, and when it is, @code{strncpy} will
-waste a considerable amount of time copying null bytes.
-@end deftypefun
-
-@comment wchar.h
-@comment ISO
-@deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
-@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This function is similar to @code{wcscpy} but always copies exactly
-@var{size} wide characters into @var{wto}.
-
-If @var{wfrom} does not contain a null wide character in its first
-@var{size} wide characters, then @code{wcsncpy} copies just the first
-@var{size} wide characters.  In this case no null terminator is
-written into @var{wto}.
-
-Otherwise @var{wfrom} must be a wide string with length less than
-@var{size}.  In this case @code{wcsncpy} copies all of @var{wfrom},
-followed by enough null wide
-characters to add up to @var{size} wide characters in all.  This
-behavior is rarely useful, but it is specified by the @w{ISO C}
-standard.
-
-The behavior of @code{wcsncpy} is undefined if the strings overlap.
-
-Using @code{wcsncpy} as opposed to @code{wcscpy} is a way to avoid bugs
-relating to writing past the end of the allocated space for @var{wto}.
-However, it can also make your program much slower in one common case:
-copying a string which is probably small into a potentially large buffer.
-In this case, @var{size} may be large, and when it is, @code{wcsncpy} will
-waste a considerable amount of time copying null wide characters.
-@end deftypefun
-
-@comment string.h
 @comment SVID
 @deftypefun {char *} strdup (const char *@var{s})
 @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
@@ -653,24 +598,6 @@ This function is a GNU extension.
 @end deftypefun
 
 @comment string.h
-@comment GNU
-@deftypefun {char *} strndup (const char *@var{s}, size_t @var{size})
-@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
-This function is similar to @code{strdup} but always copies at most
-@var{size} bytes into the newly allocated string.
-
-If the length of @var{s} is more than @var{size}, then @code{strndup}
-copies just the first @var{size} bytes and adds a closing null
-byte.  Otherwise all bytes are copied and the string is
-terminated.
-
-This function is different to @code{strncpy} in that it always
-terminates the destination string.
-
-@code{strndup} is a GNU extension.
-@end deftypefun
-
-@comment string.h
 @comment Unknown origin
 @deftypefun {char *} stpcpy (char *restrict @var{to}, const char *restrict @var{from})
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@@ -711,60 +638,6 @@ The behavior of @code{wcpcpy} is undefined if the strings overlap.
 
 @comment string.h
 @comment GNU
-@deftypefun {char *} stpncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
-@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This function is similar to @code{stpcpy} but copies always exactly
-@var{size} bytes into @var{to}.
-
-If the length of @var{from} is more than @var{size}, then @code{stpncpy}
-copies just the first @var{size} bytes and returns a pointer to the
-byte directly following the one which was copied last.  Note that in
-this case there is no null terminator written into @var{to}.
-
-If the length of @var{from} is less than @var{size}, then @code{stpncpy}
-copies all of @var{from}, followed by enough null bytes to add up
-to @var{size} bytes in all.  This behavior is rarely useful, but it
-is implemented to be useful in contexts where this behavior of the
-@code{strncpy} is used.  @code{stpncpy} returns a pointer to the
-@emph{first} written null byte.
-
-This function is not part of ISO or POSIX but was found useful while
-developing @theglibc{} itself.
-
-Its behavior is undefined if the strings overlap.  The function is
-declared in @file{string.h}.
-@end deftypefun
-
-@comment wchar.h
-@comment GNU
-@deftypefun {wchar_t *} wcpncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
-@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This function is similar to @code{wcpcpy} but copies always exactly
-@var{wsize} wide characters into @var{wto}.
-
-If the length of @var{wfrom} is more than @var{size}, then
-@code{wcpncpy} copies just the first @var{size} wide characters and
-returns a pointer to the wide character directly following the last
-non-null wide character which was copied last.  Note that in this case
-there is no null terminator written into @var{wto}.
-
-If the length of @var{wfrom} is less than @var{size}, then @code{wcpncpy}
-copies all of @var{wfrom}, followed by enough null wide characters to add up
-to @var{size} wide characters in all.  This behavior is rarely useful, but it
-is implemented to be useful in contexts where this behavior of the
-@code{wcsncpy} is used.  @code{wcpncpy} returns a pointer to the
-@emph{first} written null wide character.
-
-This function is not part of ISO or POSIX but was found useful while
-developing @theglibc{} itself.
-
-Its behavior is undefined if the strings overlap.
-
-@code{wcpncpy} is a GNU extension and is declared in @file{wchar.h}.
-@end deftypefun
-
-@comment string.h
-@comment GNU
 @deftypefn {Macro} {char *} strdupa (const char *@var{s})
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 This macro is similar to @code{strdup} but allocates the new string
@@ -791,20 +664,35 @@ This function is only available if GNU CC is used.
 @end deftypefn
 
 @comment string.h
-@comment GNU
-@deftypefn {Macro} {char *} strndupa (const char *@var{s}, size_t @var{size})
+@comment BSD
+@deftypefun void bcopy (const void *@var{from}, void *@var{to}, size_t @var{size})
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This function is similar to @code{strndup} but like @code{strdupa} it
-allocates the new string using @code{alloca}
-@pxref{Variable Size Automatic}.  The same advantages and limitations
-of @code{strdupa} are valid for @code{strndupa}, too.
+This is a partially obsolete alternative for @code{memmove}, derived from
+BSD.  Note that it is not quite equivalent to @code{memmove}, because the
+arguments are not in the same order and there is no return value.
+@end deftypefun
 
-This function is implemented only as a macro, just like @code{strdupa}.
-Just as @code{strdupa} this macro also must not be used inside the
-parameter list in a function call.
+@comment string.h
+@comment BSD
+@deftypefun void bzero (void *@var{block}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This is a partially obsolete alternative for @code{memset}, derived from
+BSD.  Note that it is not as general as @code{memset}, because the only
+value it can store is zero.
+@end deftypefun
 
-@code{strndupa} is only available if GNU CC is used.
-@end deftypefn
+@node Concatenating Strings
+@section Concatenating Strings
+@pindex string.h
+@pindex wchar.h
+@cindex concatenating strings
+@cindex string concatenation functions
+
+The functions described in this section concatenate the contents of a
+string or wide string to another.  They follow the string-copying
+functions in their conventions.  @xref{Copying Strings and Arrays}.
+@samp{strcat} is declared in the header file @file{string.h} while
+@samp{wcscat} is declared in @file{wchar.h}.
 
 @comment string.h
 @comment ISO
@@ -827,6 +715,8 @@ strcat (char *restrict to, const char *restrict from)
 @end smallexample
 
 This function has undefined results if the strings overlap.
+
+As noted below, this function has significant performance issues.
 @end deftypefun
 
 @comment wchar.h
@@ -850,10 +740,13 @@ wcscat (wchar_t *wto, const wchar_t *wfrom)
 @end smallexample
 
 This function has undefined results if the strings overlap.
+
+As noted below, this function has significant performance issues.
 @end deftypefun
 
 Programmers using the @code{strcat} or @code{wcscat} function (or the
-following @code{strncat} or @code{wcsncat} functions for that matter)
+@code{strncat} or @code{wcsncat} functions defined in
+a later section, for that matter)
 can easily be recognized as lazy and reckless.  In almost all situations
 the lengths of the participating strings are known (it better should be
 since how can one otherwise ensure the allocated size of the buffer is
@@ -978,6 +871,165 @@ should think twice and look through the program whether the code cannot
 be rewritten to take advantage of already calculated results.  Again: it
 is almost always unnecessary to use @code{strcat}.
 
+@node Truncating Strings
+@section Truncating Strings while Copying
+@cindex truncating strings
+@cindex string truncation
+
+The functions described in this section copy or concatenate the
+possibly-truncated contents of a string or array to another, and
+similarly for wide strings.  They follow the string-copying functions
+in their header conventions.  @xref{Copying Strings and Arrays}.  The
+@samp{str} functions are declared in the header file @file{string.h}
+and the @samp{wc} functions are declared in the file @file{wchar.h}.
+
+@comment string.h
+@deftypefun {char *} strncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strcpy} but always copies exactly
+@var{size} bytes into @var{to}.
+
+If @var{from} does not contain a null byte in its first @var{size}
+bytes, @code{strncpy} copies just the first @var{size} bytes.  In this
+case no null terminator is written into @var{to}.
+
+Otherwise @var{from} must be a string with length less than
+@var{size}.  In this case @code{strncpy} copies all of @var{from},
+followed by enough null bytes to add up to @var{size} bytes in all.
+
+The behavior of @code{strncpy} is undefined if the strings overlap.
+
+This function was designed for now-rarely-used arrays consisting of
+non-null bytes followed by zero or more null bytes.  It needs to set
+all @var{size} bytes of the destination, even when @var{size} is much
+greater than the length of @var{from}.  As noted below, this function
+is generally a poor choice for processing text.
+@end deftypefun
+
+@comment wchar.h
+@comment ISO
+@deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{wcscpy} but always copies exactly
+@var{size} wide characters into @var{wto}.
+
+If @var{wfrom} does not contain a null wide character in its first
+@var{size} wide characters, then @code{wcsncpy} copies just the first
+@var{size} wide characters.  In this case no null terminator is
+written into @var{wto}.
+
+Otherwise @var{wfrom} must be a wide string with length less than
+@var{size}.  In this case @code{wcsncpy} copies all of @var{wfrom},
+followed by enough null wide characters to add up to @var{size} wide
+characters in all.
+
+The behavior of @code{wcsncpy} is undefined if the strings overlap.
+
+This function is the wide-character counterpart of @code{strncpy} and
+suffers from most of the problems that @code{strncpy} does.  For
+example, as noted below, this function is generally a poor choice for
+processing text.
+@end deftypefun
+
+@comment string.h
+@comment GNU
+@deftypefun {char *} strndup (const char *@var{s}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
+This function is similar to @code{strdup} but always copies at most
+@var{size} bytes into the newly allocated string.
+
+If the length of @var{s} is more than @var{size}, then @code{strndup}
+copies just the first @var{size} bytes and adds a closing null byte.
+Otherwise all bytes are copied and the string is terminated.
+
+This function differs from @code{strncpy} in that it always terminates
+the destination string.
+
+As noted below, this function is generally a poor choice for
+processing text.
+
+@code{strndup} is a GNU extension.
+@end deftypefun
+
+@comment string.h
+@comment GNU
+@deftypefn {Macro} {char *} strndupa (const char *@var{s}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strndup} but like @code{strdupa} it
+allocates the new string using @code{alloca} (@pxref{Variable Size
+Automatic}).  The same advantages and limitations of @code{strdupa}
+are valid for @code{strndupa}, too.
+
+This function is implemented only as a macro, just like @code{strdupa}.
+Just as @code{strdupa} this macro also must not be used inside the
+parameter list in a function call.
+
+As noted below, this function is generally a poor choice for
+processing text.
+
+@code{strndupa} is only available if GNU CC is used.
+@end deftypefn
+
+@comment string.h
+@comment GNU
+@deftypefun {char *} stpncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{stpcpy} but always copies exactly
+@var{size} bytes into @var{to}.
+
+If the length of @var{from} is more than @var{size}, then @code{stpncpy}
+copies just the first @var{size} bytes and returns a pointer to the
+byte directly following the one which was copied last.  Note that in
+this case there is no null terminator written into @var{to}.
+
+If the length of @var{from} is less than @var{size}, then @code{stpncpy}
+copies all of @var{from}, followed by enough null bytes to add up
+to @var{size} bytes in all.  This behavior is rarely useful, but it
+is implemented to be useful in contexts where this behavior of the
+@code{strncpy} is used.  @code{stpncpy} returns a pointer to the
+@emph{first} written null byte.
+
+This function is not part of ISO or POSIX but was found useful while
+developing @theglibc{} itself.
+
+Its behavior is undefined if the strings overlap.  The function is
+declared in @file{string.h}.
+
+As noted below, this function is generally a poor choice for
+processing text.
+@end deftypefun
+
+@comment wchar.h
+@comment GNU
+@deftypefun {wchar_t *} wcpncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{wcpcpy} but always copies exactly
+@var{size} wide characters into @var{wto}.
+
+If the length of @var{wfrom} is more than @var{size}, then
+@code{wcpncpy} copies just the first @var{size} wide characters and
+returns a pointer to the wide character directly following the last
+wide character which was copied.  Note that in this case
+there is no null terminator written into @var{wto}.
+
+If the length of @var{wfrom} is less than @var{size}, then @code{wcpncpy}
+copies all of @var{wfrom}, followed by enough null wide characters to add up
+to @var{size} wide characters in all.  This behavior is rarely useful, but it
+is implemented to be useful in contexts where this behavior of the
+@code{wcsncpy} is used.  @code{wcpncpy} returns a pointer to the
+@emph{first} written null wide character.
+
+This function is not part of ISO or POSIX but was found useful while
+developing @theglibc{} itself.
+
+Its behavior is undefined if the strings overlap.
+
+As noted below, this function is generally a poor choice for
+processing text.
+
+@code{wcpncpy} is a GNU extension.
+@end deftypefun
+
 @comment string.h
 @comment ISO
 @deftypefun {char *} strncat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
@@ -1004,6 +1056,12 @@ strncat (char *to, const char *from, size_t size)
 @end smallexample
 
 The behavior of @code{strncat} is undefined if the strings overlap.
+
+As a companion to @code{strncpy}, @code{strncat} was designed for
+now-rarely-used arrays consisting of non-null bytes followed by zero
+or more null bytes.  As noted below, this function is generally a poor
+choice for processing text.  Also, this function has significant
+performance issues.  @xref{Concatenating Strings}.
 @end deftypefun
 
 @comment wchar.h
@@ -1025,7 +1083,8 @@ wchar_t *
 wcsncat (wchar_t *restrict wto, const wchar_t *restrict wfrom,
          size_t size)
 @{
-  memcpy (wto + wcslen (wto), wfrom, wcsnlen (wfrom, size) * sizeof (wchar_t));
+  memcpy (wto + wcslen (wto), wfrom,
+          wcsnlen (wfrom, size) * sizeof (wchar_t));
   wto[wcslen (to) + wcsnlen (wfrom, size)] = '\0';
   return wto;
 @}
@@ -1033,42 +1092,39 @@ wcsncat (wchar_t *restrict wto, const wchar_t *restrict wfrom,
 @end smallexample
 
 The behavior of @code{wcsncat} is undefined if the strings overlap.
-@end deftypefun
-
-Here is an example showing the use of @code{strncpy} and @code{strncat}
-(the wide character version is equivalent).  Notice how, in the call to
-@code{strncat}, the @var{size} parameter is computed to avoid
-overflowing the array @code{buffer}.
-
-@smallexample
-@include strncat.c.texi
-@end smallexample
-
-@noindent
-The output produced by this program looks like:
-
-@smallexample
-hello
-hello, wo
-@end smallexample
 
-@comment string.h
-@comment BSD
-@deftypefun void bcopy (const void *@var{from}, void *@var{to}, size_t @var{size})
-@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This is a partially obsolete alternative for @code{memmove}, derived from
-BSD.  Note that it is not quite equivalent to @code{memmove}, because the
-arguments are not in the same order and there is no return value.
-@end deftypefun
-
-@comment string.h
-@comment BSD
-@deftypefun void bzero (void *@var{block}, size_t @var{size})
-@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This is a partially obsolete alternative for @code{memset}, derived from
-BSD.  Note that it is not as general as @code{memset}, because the only
-value it can store is zero.
-@end deftypefun
+As noted below, this function is generally a poor choice for
+processing text.  Also, this function has significant performance
+issues.  @xref{Concatenating Strings}.
+@end deftypefun
+
+Because these functions can abruptly truncate strings or wide strings,
+they are generally poor choices for processing text.  When copying or
+concatenating multibyte strings, they can truncate within a multibyte
+character so that the result is not a valid multibyte character
+string.  When combining or concatenating multibyte or wide strings,
+they may truncate the output after a combining character, resulting in
+a corrupted grapheme.  They can cause bugs even when processing
+single-byte strings: for example, when calculating an ASCII-only user
+name, a truncated name can identify the wrong user.
+
+Although some buffer overruns can be prevented by manually replacing
+calls to copying functions with calls to truncation functions,
+nowadays there are easier and more-reliable automatic techniques that
+cause buffer overruns to reliably terminate a program.  These include
+GCC's @option{-fsanitize=address} option and, if the destination
+buffer is statically sized, defining the @code{_FORTIFY_SOURCE} macro.
+Because truncation functions can mask application bugs that would
+otherwise be caught by the automatic techniques, these functions
+should be used only when the application's underlying logic requires
+truncation.
+
+@strong{Note:} GNU programs should not truncate strings or wide
+strings to fit arbitrary size limits.  @xref{Semantics, , Writing
+Robust Programs, standards, The GNU Coding Standards}.  Instead of
+string-truncation functions, it is usually better to use dynamic
+memory allocation (@pxref{Unconstrained Allocation}) and functions
+such as @code{strdup} or @code{asprintf} to construct strings.
 
 @node String/Array Comparison
 @section String/Array Comparison
@@ -1473,7 +1529,7 @@ to @var{size} bytes (including a terminating null byte) are
 stored.
 
 The behavior is undefined if the strings @var{to} and @var{from}
-overlap; see @ref{Copying and Concatenation}.
+overlap; see @ref{Copying Strings and Arrays}.
 
 The return value is the length of the entire transformed string.  This
 value is not affected by the value of @var{size}, but if it is greater
@@ -1503,7 +1559,7 @@ selected for collation, and stores the transformed string in the array
 wide character) are stored.
 
 The behavior is undefined if the strings @var{wto} and @var{wfrom}
-overlap; see @ref{Copying and Concatenation}.
+overlap; see @ref{Copying Strings and Arrays}.
 
 The return value is the length of the entire transformed wide
 string.  This value is not affected by the value of @var{size}, but if
@@ -2113,8 +2169,8 @@ if the remainder of string consists only of delimiter wide characters,
 
 @strong{Warning:} Since @code{strtok} and @code{wcstok} alter the string
 they is parsing, you should always copy the string to a temporary buffer
-before parsing it with @code{strtok}/@code{wcstok} (@pxref{Copying and
-Concatenation}).  If you allow @code{strtok} or @code{wcstok} to modify
+before parsing it with @code{strtok}/@code{wcstok} (@pxref{Copying Strings
+and Arrays}).  If you allow @code{strtok} or @code{wcstok} to modify
 a string that came from another part of your program, you are asking for
 trouble; that string might be used for other purposes after
 @code{strtok} or @code{wcstok} has modified it, and it would not have
-- 
2.1.0

From 7d64e0be6baff7155cb753a653952c164894dfa6 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 4 Dec 2015 12:11:14 -0800
Subject: [PATCH 2/2] Add strlcpy, strlcat

[BZ #178]
This patch was derived from text by Florian Weimer in:
https://sourceware.org/ml/libc-alpha/2015-11/msg00558.html
* manual/string.texi (Truncating Strings): New functions from BSD.
---
 ChangeLog          |   6 ++++
 manual/string.texi | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 083a08d..c822b34 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
 2015-12-04  Paul Eggert  <eggert@cs.ucla.edu>
 
+	Add strlcpy, strlcat
+	[BZ #178]
+	This patch was derived from text by Florian Weimer in:
+	https://sourceware.org/ml/libc-alpha/2015-11/msg00558.html
+	* manual/string.texi (Truncating Strings): New functions from BSD.
+
 	Split large string section; add truncation advice
 	* manual/examples/strncat.c: Remove.
 	This example was misleading, as the code would have undefined
diff --git a/manual/string.texi b/manual/string.texi
index 5b11b3b..398b41c 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -745,7 +745,7 @@ As noted below, this function has significant performance issues.
 @end deftypefun
 
 Programmers using the @code{strcat} or @code{wcscat} function (or the
-@code{strncat} or @code{wcsncat} functions defined in
+@code{strlcat}, @code{strncat}, or @code{wcsncat} functions defined in
 a later section, for that matter)
 can easily be recognized as lazy and reckless.  In almost all situations
 the lengths of the participating strings are known (it better should be
@@ -906,6 +906,47 @@ greater than the length of @var{from}.  As noted below, this function
 is generally a poor choice for processing text.
 @end deftypefun
 
+@comment string.h
+@comment BSD
+@deftypefun size_t strlcpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strcpy}, but copies at most
+@var{size} bytes from the string @var{from} into the destination
+buffer @var{to}, including a terminating null byte.
+
+If @var{size} is greater than the length of @var{from}, this function
+copies all of the string @var{from} to the destination buffer
+@var{to}, including the terminating null byte.  Like other string
+functions such as @code{strcpy}, but unlike @code{strncpy}, any
+remaining bytes in the destination buffer remain unchanged.
+
+If @var{size} is nonzero and is not greater than the length of the
+string @var{from}, this function copies only the first
+@samp{@var{size} - 1} bytes to the destination buffer @var{to}, and
+writes a terminating null byte to the last byte in the buffer.
+
+This function returns the length of @var{from}.  This means that
+truncation occurs whenever the returned value is not less than
+@var{size}.
+
+The behavior of this function is undefined if @var{size} is zero, if
+the source and destination strings overlap, or if the source or
+destination is a null pointer.
+
+Unlike @code{strncpy}, this function always null-terminates the
+destination string, does not zero-fill the destination buffer,
+requires @var{size} to be nonzero, requires @var{from} to be a
+null-terminated string, and always computes @var{from}'s length even
+when greater than @var{size}.
+
+This function was designed as a stopgap for quickly retrofitting
+possibly-unsafe uses of @code{strcpy} on platforms lacking
+buffer-overrun protection.  As noted below, this function is generally
+a poor choice for processing text.
+
+This function is derived from BSD.
+@end deftypefun
+
 @comment wchar.h
 @comment ISO
 @deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
@@ -1064,6 +1105,65 @@ choice for processing text.  Also, this function has significant
 performance issues.  @xref{Concatenating Strings}.
 @end deftypefun
 
+@comment string.h
+@comment BSD
+@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strcat}, except that it truncates
+@var{to} to at most @var{size} bytes (including the terminating null
+byte) when appending the string @var{from} to the string @var{to}.
+
+This function appends a prefix of the string @var{from} to the string
+@var{to}.  The prefix contains all of @var{from} if that can be
+appended to @var{to} without requiring more than @var{size} bytes
+total, including the null terminator.  Otherwise, it contains only as
+many leading bytes of @var{from} as will fit.  The resulting string in
+@var{to} is always null-terminated, and any excess trailing bytes of
+@var{from} are not copied.
+
+This function returns the sum of the original length of @var{to} and
+the length of @var{from}.  This means that truncation occurs whenever
+the returned value is not less than @var{size}.
+
+The behavior is undefined if @var{to} does not contain a null byte in
+its first @var{size} bytes, if the source and resulting destination
+strings overlap, if the source or destination is a null pointer, or if
+the result would exceed @code{SIZE_MAX}.
+
+@strong{Portability Note:} With @theglibc{} this function's behavior
+is never undefined due to the result exceeding @code{SIZE_MAX}, as
+@var{from} and @var{to} cannot overlap and @theglibc{} assumes a flat
+address space.
+
+This function could be implemented like this:
+
+@smallexample
+@group
+size_t
+strlcat (char *to, const char *from, size_t size)
+@{
+  size_t len = strlen (to);
+  return len + strlcpy (to + len, from, size - len);
+@}
+@end group
+@end smallexample
+
+Whereas @code{strncat} limits the length of the appended string, this
+function limits the total destination size.  Also, unlike
+@code{strncat}, this function requires @var{from} to be
+null-terminated and always computes @var{from}'s length even when
+greater than the limit.
+
+As a companion to @code{strlcpy}, this function was designed as a
+stopgap for quickly retrofitting possibly-unsafe uses of @code{strcat}
+on platforms lacking buffer-overrun protection.  As noted below, this
+function is generally a poor choice for processing text.  Also, like
+@code{strcat}, this function has significant performance issues.
+@xref{Concatenating Strings}.
+
+This function is derived from BSD.
+@end deftypefun
+
 @comment wchar.h
 @comment ISO
 @deftypefun {wchar_t *} wcsncat (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
-- 
2.1.0
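
The truncation test described in the strlcpy and strlcat entries above (truncation occurs whenever the returned value is not less than size) can be sketched in C as follows. This is only an illustration, not part of the patch: sketch_strlcpy and sketch_strlcat are hypothetical stand-ins written from the documented semantics so that the example does not depend on the C library providing these functions, and join_path is an invented helper. Like the proposed documentation, the sketch treats a zero size as a caller error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy, written from the semantics in the
   proposed manual text.  SIZE must be nonzero.  */
static size_t
sketch_strlcpy (char *to, const char *from, size_t size)
{
  size_t len = strlen (from);
  assert (size != 0);
  /* Copy all of FROM if it fits, otherwise only SIZE - 1 bytes.  */
  size_t n = len < size ? len : size - 1;
  memcpy (to, from, n);
  to[n] = '\0';
  /* Always return the full length of FROM, truncated or not.  */
  return len;
}

/* Hypothetical stand-in for strlcat, following the sample
   implementation given in the proposed manual text.  */
static size_t
sketch_strlcat (char *to, const char *from, size_t size)
{
  size_t len = strlen (to);
  /* TO must already be null-terminated within SIZE bytes.  */
  assert (len < size);
  return len + sketch_strlcpy (to + len, from, size - len);
}

/* Invented helper: build "DIR/FILE" in BUF, which holds SIZE bytes.
   Return 0 on success, -1 if any step had to truncate.  */
static int
join_path (char *buf, size_t size, const char *dir, const char *file)
{
  if (sketch_strlcpy (buf, dir, size) >= size)
    return -1;
  if (sketch_strlcat (buf, "/", size) >= size)
    return -1;
  if (sketch_strlcat (buf, file, size) >= size)
    return -1;
  return 0;
}
```

A caller that gets -1 from join_path knows the buffer holds a truncated, still null-terminated prefix and can report an error instead of silently using the wrong name. Where truncation is not actually wanted, the dynamic-allocation approach recommended in the first patch (strdup, asprintf) avoids the size limit altogether.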

