@node String and Array Utilities, Character Set Handling, Character Handling, Top
@c %MENU% Utilities for copying and comparing strings and arrays
@chapter String and Array Utilities

Operations on strings (null-terminated byte sequences) are an important part of
many programs. @Theglibc{} provides an extensive set of string
utility functions, including functions for copying, concatenating,
comparing, and searching strings. Many of these functions can also
operate on arbitrary regions of storage; for example, the @code{memcpy}
function can be used to copy the contents of any kind of array.

It's fairly common for beginning C programmers to ``reinvent the wheel''
by duplicating this functionality in their own code, but it pays to
become familiar with the library functions and to make use of them,
since this offers benefits in maintenance, efficiency, and portability.

For instance, you could easily compare one string to another in two
lines of C code, but if you use the built-in @code{strcmp} function,
you're less likely to make a mistake. And, since these library
functions are typically highly optimized, your program may run faster
too.

@menu
* Representation of Strings::   Introduction to basic concepts.
* String/Array Conventions::    Whether to use a string function or an
                                 arbitrary array function.
* String Length::               Determining the length of a string.
* Copying Strings and Arrays::  Functions to copy strings and arrays.
* Concatenating Strings::       Functions to concatenate strings while copying.
* Truncating Strings::          Functions to truncate strings while copying.
* String/Array Comparison::     Functions for byte-wise and character-wise
                                 comparison.
* Collation Functions::         Functions for collating strings.
* Search Functions::            Searching for a specific element or substring.
* Finding Tokens in a String::  Splitting a string into tokens by looking
                                 for delimiters.
* Erasing Sensitive Data::      Clearing memory which contains sensitive
                                 data, after it's no longer needed.
* Shuffling Bytes::             Or how to flash-cook a string.
* Obfuscating Data::            Reversibly obscuring data from casual view.
* Encode Binary Data::          Encoding and Decoding of Binary Data.
* Argz and Envz Vectors::       Null-separated string vectors.
@end menu

@node Representation of Strings
@section Representation of Strings
@cindex string, representation of

This section is a quick summary of string concepts for beginning C
programmers. It describes how strings are represented in C
and some common pitfalls. If you are already familiar with this
material, you can skip this section.

@cindex string
A @dfn{string} is a null-terminated array of bytes of type @code{char},
including the terminating null byte. String-valued
variables are usually declared to be pointers of type @code{char *}.
Such variables do not include space for the contents of a string; that has
to be stored somewhere else---in an array variable, a string constant,
or dynamically allocated memory (@pxref{Memory Allocation}). It's up to
you to store the address of the chosen memory space into the pointer
variable. Alternatively you can store a @dfn{null pointer} in the
pointer variable. The null pointer does not point anywhere, so
attempting to reference the string it points to gets an error.

@cindex multibyte character
@cindex multibyte string
@cindex wide string
A @dfn{multibyte character} is a sequence of one or more bytes that
represents a single character using the locale's encoding scheme; a
null byte always represents the null character. A @dfn{multibyte
string} is a string that consists entirely of multibyte
characters. In contrast, a @dfn{wide string} is a null-terminated
sequence of @code{wchar_t} objects. A wide-string variable is usually
declared to be a pointer of type @code{wchar_t *}, by analogy with
string variables and @code{char *}. @xref{Extended Char Intro}.

@cindex null byte
@cindex null wide character
By convention, the @dfn{null byte}, @code{'\0'},
marks the end of a string and the @dfn{null wide character},
@code{L'\0'}, marks the end of a wide string. For example, in
testing to see whether the @code{char *} variable @var{p} points to a
null byte marking the end of a string, you can write
@code{!*@var{p}} or @code{*@var{p} == '\0'}.

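As a minimal sketch of the null-byte test just described, the following
hypothetical helper @code{count_bytes} (an illustrative name, not a
library function) walks a string until @code{*p} is zero. In real code,
@code{strlen} does the same job.

```c
#include <stddef.h>

/* Hypothetical helper: count the bytes that precede the terminating
   null byte of the string P.  */
size_t
count_bytes (const char *p)
{
  size_t n = 0;

  /* *p is nonzero until p reaches the null byte.  */
  while (*p)
    {
      p++;
      n++;
    }
  return n;
}
```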
A null byte is quite different conceptually from a null pointer,
although both are represented by the integer constant @code{0}.

@cindex string literal
A @dfn{string literal} appears in C program source as a multibyte
string between double-quote characters (@samp{"}). If the
initial double-quote character is immediately preceded by a capital
@samp{L} (ell) character (as in @code{L"foo"}), it is a wide string
literal. String literals can also contribute to @dfn{string
concatenation}: @code{"a" "b"} is the same as @code{"ab"}.
For wide strings one can use either
@code{L"a" L"b"} or @code{L"a" "b"}. Modification of string literals is
not allowed by the GNU C compiler, because literals are placed in
read-only storage.

Arrays that are declared @code{const} cannot be modified
either. It's generally good style to declare non-modifiable string
pointers to be of type @code{const char *}, since this often allows the
C compiler to detect accidental modifications as well as providing some
amount of documentation about what your program intends to do with the
string.

The amount of memory allocated for a byte array may extend past the null byte
that marks the end of the string that the array contains. In this
document, the term @dfn{allocated size} is always used to refer to the
total amount of memory allocated for an array, while the term
@dfn{length} refers to the number of bytes up to (but not including)
the terminating null byte. Wide strings are similar, except their
sizes and lengths count wide characters, not bytes.
@cindex length of string
@cindex allocation size of string
@cindex size of string
@cindex string length
@cindex string allocation

A notorious source of program bugs is trying to put more bytes into a
string than fit in its allocated size. When writing code that extends
strings or moves bytes into a pre-allocated array, you should be
very careful to keep track of the length of the string and make explicit
checks for overflowing the array. Many of the library functions
@emph{do not} do this for you! Remember also that you need to allocate
an extra byte to hold the null byte that marks the end of the
string.

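One way to make the explicit overflow check concrete is sketched below.
The helper name @code{append_checked} and its interface are assumptions
made for this illustration; it assumes that @var{dst} already holds a
valid string inside an array of @var{dst_size} bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append SRC to the string in DST, whose array has
   an allocated size of DST_SIZE bytes.  Returns false and leaves DST
   unchanged if the result, including its null byte, would not fit.  */
bool
append_checked (char *dst, size_t dst_size, const char *src)
{
  size_t dst_len = strlen (dst);
  size_t src_len = strlen (src);

  /* Room is needed for SRC_LEN bytes plus one terminating null byte.  */
  if (src_len >= dst_size - dst_len)
    return false;

  /* Copy SRC together with its null byte.  */
  memcpy (dst + dst_len, src, src_len + 1);
  return true;
}
```

The comparison is written as a subtraction on the right-hand side so
that the sum of the two lengths can never wrap around.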
@cindex single-byte string
@cindex multibyte string
Originally strings were sequences of bytes where each byte represented a
single character. This is still true today if the strings are encoded
using a single-byte character encoding. Things are different if the
strings are encoded using a multibyte encoding (for more information on
encodings see @ref{Extended Char Intro}). There is no difference in
the programming interface for these two kinds of strings; the programmer
has to be aware of this and interpret the byte sequences accordingly.

But since there is no separate interface taking care of these
differences, the byte-based string functions are sometimes hard to use.
Since the count parameters of these functions specify bytes, a call to
@code{memcpy} could cut a multibyte character in the middle and put an
incomplete (and therefore unusable) byte sequence in the target buffer.

@cindex wide string
To avoid these problems, later versions of the @w{ISO C} standard
introduce a second set of functions which operate on @dfn{wide
characters} (@pxref{Extended Char Intro}). These functions don't have
the problems the single-byte versions have since every wide character is
a legal, interpretable value. This does not mean that cutting wide
strings at arbitrary points is without problems. It normally
is for alphabet-based languages (except for non-normalized text) but
languages based on syllables still have the problem that more than one
wide character is necessary to complete a logical unit. This is a
higher level problem which the @w{C library} functions are not designed
to solve. But it is at least good that no invalid byte sequences can be
created. Also, the higher level functions can operate much more easily
on wide characters than on multibyte characters, so a common strategy
is to use wide characters internally whenever text is more than simply
copied.

The remainder of this chapter will discuss the functions for handling
wide strings in parallel with the discussion of
strings since there is almost always an exact equivalent
available.

@node String/Array Conventions
@section String and Array Conventions

This chapter describes both functions that work on arbitrary arrays or
blocks of memory, and functions that are specific to strings and wide
strings.

Functions that operate on arbitrary blocks of memory have names
beginning with @samp{mem} and @samp{wmem} (such as @code{memcpy} and
@code{wmemcpy}) and invariably take an argument which specifies the size
(in bytes and wide characters respectively) of the block of memory to
operate on. The array arguments and return values for these functions
have type @code{void *} or @code{wchar_t *}. As a matter of style, the
elements of the arrays used with the @samp{mem} functions are referred
to as ``bytes''. You can pass any kind of pointer to these functions,
and the @code{sizeof} operator is useful in computing the value for the
size argument. Parameters to the @samp{wmem} functions must be of type
@code{wchar_t *}. These functions are not really usable with anything
but arrays of this type.

In contrast, functions that operate specifically on strings and wide
strings have names beginning with @samp{str} and @samp{wcs}
respectively (such as @code{strcpy} and @code{wcscpy}) and look for a
terminating null byte or null wide character instead of requiring an explicit
size argument to be passed. (Some of these functions accept a specified
maximum length, but they also check for premature termination.)
The array arguments and return values for these
functions have type @code{char *} and @code{wchar_t *} respectively, and
the array elements are referred to as ``bytes'' and ``wide
characters''.

In many cases, there are both @samp{mem} and @samp{str}/@samp{wcs}
versions of a function. The one that is more appropriate to use depends
on the exact situation. When your program is manipulating arbitrary
arrays or blocks of storage, then you should always use the @samp{mem}
functions. On the other hand, when you are manipulating
strings it is usually more convenient to use the @samp{str}/@samp{wcs}
functions, unless you already know the length of the string in advance.
The @samp{wmem} functions should be used for wide character arrays with
known size.

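The following sketch illustrates the case where the length is already
known, so a @samp{mem} function is a natural fit. The helper
@code{join_known} is a hypothetical name chosen for this example, and it
assumes the caller passes a @var{len} no greater than the length of
@var{s}.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string made of the
   first LEN bytes of S followed by SUFFIX, or NULL if allocation
   fails.  The caller must ensure LEN <= strlen (S) and must free the
   result.  */
char *
join_known (const char *s, size_t len, const char *suffix)
{
  size_t suffix_len = strlen (suffix);
  char *result = malloc (len + suffix_len + 1);

  if (result == NULL)
    return NULL;

  /* The lengths are known, so no scan for a null byte is needed.  */
  memcpy (result, s, len);
  /* Copy SUFFIX including its terminating null byte.  */
  memcpy (result + len, suffix, suffix_len + 1);
  return result;
}
```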
@cindex wint_t
@cindex parameter promotion
Some of the memory and string functions take single characters as
arguments. Since a value of type @code{char} is automatically promoted
into a value of type @code{int} when used as a parameter, the functions
are declared with @code{int} as the type of the parameter in question.
In case of the wide character functions the situation is similar: the
parameter type for a single wide character is @code{wint_t} and not
@code{wchar_t}. This would for many implementations not be necessary
since @code{wchar_t} is large enough to not be automatically
promoted, but since the @w{ISO C} standard does not require such a
choice of types the @code{wint_t} type is used.

@node String Length
@section String Length

You can get the length of a string using the @code{strlen} function.
This function is declared in the header file @file{string.h}.
@pindex string.h

@deftypefun size_t strlen (const char *@var{s})
@standards{ISO, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{strlen} function returns the length of the
string @var{s} in bytes. (In other words, it returns the offset of the
terminating null byte within the array.)

For example,
@smallexample
strlen ("hello, world")
    @result{} 12
@end smallexample

When applied to an array, the @code{strlen} function returns
the length of the string stored there, not its allocated size. You can
get the allocated size of the array that holds a string using
the @code{sizeof} operator:

@smallexample
char string[32] = "hello, world";
sizeof (string)
    @result{} 32
strlen (string)
    @result{} 12
@end smallexample

But beware, this will not work unless @var{string} is the
array itself, not a pointer to it. For example:

@smallexample
char string[32] = "hello, world";
char *ptr = string;
sizeof (string)
    @result{} 32
sizeof (ptr)
    @result{} 4  /* @r{(on a machine with 4 byte pointers)} */
@end smallexample

This is an easy mistake to make when you are working with functions that
take string arguments; those arguments are always pointers, not arrays.

It must also be noted that for multibyte encoded strings the return
value does not have to correspond to the number of characters in the
string. To get this value the string can be converted to wide
characters and @code{wcslen} can be used, or something like the following
code can be used:

@smallexample
/* @r{The input is in @code{string}.}
   @r{The length is expected in @code{n}.} */
@{
  mbstate_t t;
  const char *scopy = string;
  /* In initial state. */
  memset (&t, '\0', sizeof (t));
  /* Determine number of characters. */
  n = mbsrtowcs (NULL, &scopy, strlen (scopy), &t);
@}
@end smallexample

This is cumbersome to do, so if the number of characters (as opposed to
bytes) is needed often it is better to work with wide characters.
@end deftypefun

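The character-counting fragment in the description of @code{strlen} can
be wrapped into a complete function. The following is only a sketch:
@code{count_chars} is a hypothetical name, and the result depends on the
locale in effect when it is called.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return the number of characters in the
   multibyte string S under the current locale, or (size_t) -1 if S
   contains an invalid multibyte sequence.  */
size_t
count_chars (const char *s)
{
  mbstate_t state;
  const char *p = s;

  /* Start in the initial conversion state.  */
  memset (&state, '\0', sizeof (state));

  /* With a null destination mbsrtowcs stores nothing and only counts;
     the length argument is ignored in that case.  */
  return mbsrtowcs (NULL, &p, 0, &state);
}
```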
The wide character equivalent is declared in @file{wchar.h}.

@deftypefun size_t wcslen (const wchar_t *@var{ws})
@standards{ISO, wchar.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{wcslen} function is the wide character equivalent to
@code{strlen}. The return value is the number of wide characters in the
wide string pointed to by @var{ws} (this is also the offset of
the terminating null wide character of @var{ws}).

Since there are no multi wide character sequences making up one wide
character the return value is not only the offset in the array, it is
also the number of wide characters.

This function was introduced in @w{Amendment 1} to @w{ISO C90}.
@end deftypefun

@deftypefun size_t strnlen (const char *@var{s}, size_t @var{maxlen})
@standards{POSIX.1, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This returns the offset of the first null byte in the array @var{s},
except that it returns @var{maxlen} if the first @var{maxlen} bytes
are all non-null.
Therefore this function is equivalent to
@code{(strlen (@var{s}) < @var{maxlen} ? strlen (@var{s}) : @var{maxlen})}
but it
is more efficient and works even if @var{s} is not null-terminated so
long as @var{maxlen} does not exceed the size of @var{s}'s array.

@smallexample
char string[32] = "hello, world";
strnlen (string, 32)
    @result{} 12
strnlen (string, 5)
    @result{} 5
@end smallexample

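Because @code{strnlen} never reads past @var{maxlen} bytes, it suits
fixed-size fields that may lack a null byte. The sketch below assumes
such a field; the helper name @code{field_to_string} is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a fixed-size field, which need not contain
   a null byte, into a newly allocated null-terminated string.  Returns
   NULL if allocation fails; the caller frees the result.  */
char *
field_to_string (const char *field, size_t field_size)
{
  /* Never examines more than FIELD_SIZE bytes.  */
  size_t len = strnlen (field, field_size);
  char *s = malloc (len + 1);

  if (s == NULL)
    return NULL;
  memcpy (s, field, len);
  s[len] = '\0';
  return s;
}
```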
This function is part of POSIX.1-2008 and later editions, but was
available in @theglibc{} and other systems as an extension long before
it was standardized. It is declared in @file{string.h}.
@end deftypefun

@deftypefun size_t wcsnlen (const wchar_t *@var{ws}, size_t @var{maxlen})
@standards{GNU, wchar.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@code{wcsnlen} is the wide character equivalent to @code{strnlen}. The
@var{maxlen} parameter specifies the maximum number of wide characters.

This function is part of POSIX.1-2008 and later editions, and is
declared in @file{wchar.h}.
@end deftypefun

@node Copying Strings and Arrays
@section Copying Strings and Arrays

You can use the functions described in this section to copy the contents
of strings, wide strings, and arrays. The @samp{str} and @samp{mem}
functions are declared in @file{string.h} while the @samp{w} functions
are declared in @file{wchar.h}.
@pindex string.h
@pindex wchar.h
@cindex copying strings and arrays
@cindex string copy functions
@cindex array copy functions
@cindex concatenating strings
@cindex string concatenation functions

A helpful way to remember the ordering of the arguments to the functions
in this section is that it corresponds to an assignment expression, with
the destination array specified to the left of the source array. Most
of these functions return the address of the destination array; a few
return the address of the destination's terminating null, or of just
past the destination.

Most of these functions do not work properly if the source and
destination arrays overlap. For example, if the beginning of the
destination array overlaps the end of the source array, the original
contents of that part of the source array may get overwritten before it
is copied. Even worse, in the case of the string functions, the null
byte marking the end of the string may be lost, and the copy
function might get stuck in a loop trashing all the memory allocated to
your program.

All functions that have problems copying between overlapping arrays are
explicitly identified in this manual. In addition to functions in this
section, there are a few others like @code{sprintf} (@pxref{Formatted
Output Functions}) and @code{scanf} (@pxref{Formatted Input
Functions}).

@deftypefun {void *} memcpy (void *restrict @var{to}, const void *restrict @var{from}, size_t @var{size})
@standards{ISO, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{memcpy} function copies @var{size} bytes from the object
beginning at @var{from} into the object beginning at @var{to}. The
behavior of this function is undefined if the two arrays @var{to} and
@var{from} overlap; use @code{memmove} instead if overlapping is possible.

The value returned by @code{memcpy} is the value of @var{to}.

Here is an example of how you might use @code{memcpy} to copy the
contents of an array:

@smallexample
struct foo *oldarray, *newarray;
int arraysize;
@dots{}
memcpy (newarray, oldarray, arraysize * sizeof (struct foo));
@end smallexample
@end deftypefun

@deftypefun {wchar_t *} wmemcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
@standards{ISO, wchar.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{wmemcpy} function copies @var{size} wide characters from the object
beginning at @var{wfrom} into the object beginning at @var{wto}. The
behavior of this function is undefined if the two arrays @var{wto} and
@var{wfrom} overlap; use @code{wmemmove} instead if overlapping is possible.

The following is a possible implementation of @code{wmemcpy} but there
are more optimizations possible.

@smallexample
wchar_t *
wmemcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom,
         size_t size)
@{
  return (wchar_t *) memcpy (wto, wfrom, size * sizeof (wchar_t));
@}
@end smallexample

The value returned by @code{wmemcpy} is the value of @var{wto}.

This function was introduced in @w{Amendment 1} to @w{ISO C90}.
@end deftypefun

@deftypefun {void *} mempcpy (void *restrict @var{to}, const void *restrict @var{from}, size_t @var{size})
@standards{GNU, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{mempcpy} function is nearly identical to the @code{memcpy}
function. It copies @var{size} bytes from the object beginning at
@var{from} into the object pointed to by @var{to}. But instead of
returning the value of @var{to} it returns a pointer to the byte
following the last written byte in the object beginning at @var{to}.
I.e., the value is @code{((void *) ((char *) @var{to} + @var{size}))}.

This function is useful in situations where a number of objects shall be
copied to consecutive memory positions.

@smallexample
void *
combine (void *o1, size_t s1, void *o2, size_t s2)
@{
  void *result = malloc (s1 + s2);
  if (result != NULL)
    mempcpy (mempcpy (result, o1, s1), o2, s2);
  return result;
@}
@end smallexample

This function is a GNU extension.
@end deftypefun

@deftypefun {wchar_t *} wmempcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
@standards{GNU, wchar.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{wmempcpy} function is nearly identical to the @code{wmemcpy}
function. It copies @var{size} wide characters from the object
beginning at @var{wfrom} into the object pointed to by @var{wto}. But
instead of returning the value of @var{wto} it returns a pointer to the
wide character following the last written wide character in the object
beginning at @var{wto}. I.e., the value is @code{@var{wto} + @var{size}}.

This function is useful in situations where a number of objects shall be
copied to consecutive memory positions.

The following is a possible implementation of @code{wmempcpy} but there
are more optimizations possible.

@smallexample
wchar_t *
wmempcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom,
          size_t size)
@{
  return (wchar_t *) mempcpy (wto, wfrom, size * sizeof (wchar_t));
@}
@end smallexample

This function is a GNU extension.
@end deftypefun

@deftypefun {void *} memmove (void *@var{to}, const void *@var{from}, size_t @var{size})
@standards{ISO, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@code{memmove} copies the @var{size} bytes at @var{from} into the
@var{size} bytes at @var{to}, even if those two blocks of space
overlap. In the case of overlap, @code{memmove} is careful to copy the
original values of the bytes in the block at @var{from}, including those
bytes which also belong to the block at @var{to}.

The value returned by @code{memmove} is the value of @var{to}.
@end deftypefun

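A typical overlapping copy is deleting an element from the middle of an
array, where the tail slides down over its own storage. The following
is a sketch; @code{remove_at} is a hypothetical helper, and it assumes
the caller passes an @var{index} smaller than @var{count}.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: delete the element at INDEX from the
   COUNT-element array A by sliding the tail down one slot.  Source and
   destination overlap, so memmove rather than memcpy is used.  Returns
   the new element count.  INDEX must be less than COUNT.  */
size_t
remove_at (int *a, size_t count, size_t index)
{
  memmove (&a[index], &a[index + 1],
           (count - index - 1) * sizeof (a[0]));
  return count - 1;
}
```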
@deftypefun {wchar_t *} wmemmove (wchar_t *@var{wto}, const wchar_t *@var{wfrom}, size_t @var{size})
@standards{ISO, wchar.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@code{wmemmove} copies the @var{size} wide characters at @var{wfrom}
into the @var{size} wide characters at @var{wto}, even if those two
blocks of space overlap. In the case of overlap, @code{wmemmove} is
careful to copy the original values of the wide characters in the block
at @var{wfrom}, including those wide characters which also belong to the
block at @var{wto}.

The following is a possible implementation of @code{wmemmove} but there
are more optimizations possible.

@smallexample
wchar_t *
wmemmove (wchar_t *wto, const wchar_t *wfrom, size_t size)
@{
  return (wchar_t *) memmove (wto, wfrom, size * sizeof (wchar_t));
@}
@end smallexample

The value returned by @code{wmemmove} is the value of @var{wto}.

This function was introduced in @w{Amendment 1} to @w{ISO C90}.
@end deftypefun

@deftypefun {void *} memccpy (void *restrict @var{to}, const void *restrict @var{from}, int @var{c}, size_t @var{size})
@standards{SVID, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function copies no more than @var{size} bytes from @var{from} to
@var{to}, stopping if a byte matching @var{c} is found. The return
value is a pointer into @var{to} one byte past where @var{c} was copied,
or a null pointer if no byte matching @var{c} appeared in the first
@var{size} bytes of @var{from}.
@end deftypefun

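As an illustration of the return value of @code{memccpy}, the sketch
below copies one line, through its newline, out of a buffer. The helper
@code{copy_line} is a hypothetical name; it assumes @var{to} has room
for @var{size} bytes.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy bytes from FROM to TO up to and including
   the first '\n', examining at most SIZE bytes.  Returns the number of
   bytes that make up the line, or 0 if no newline was found (in which
   case all SIZE bytes have been copied).  TO must have room for SIZE
   bytes.  */
size_t
copy_line (char *to, const char *from, size_t size)
{
  char *end = memccpy (to, from, '\n', size);

  if (end == NULL)
    return 0;
  /* END points one byte past the copied newline.  */
  return (size_t) (end - to);
}
```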
@deftypefun {void *} memset (void *@var{block}, int @var{c}, size_t @var{size})
@standards{ISO, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function copies the value of @var{c} (converted to an
@code{unsigned char}) into each of the first @var{size} bytes of the
object beginning at @var{block}. It returns the value of @var{block}.
@end deftypefun

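A common use of @code{memset} is resetting a structure or filling a byte
array with one value. In this sketch the structure @code{struct
counters} and the helper @code{counters_reset} are hypothetical names
invented for the example.

```c
#include <string.h>

/* Hypothetical structure used only for this illustration.  */
struct counters
{
  unsigned long hits;
  unsigned long misses;
  unsigned char flags[16];
};

/* Hypothetical helper: set every byte of *C to zero, then set every
   byte of the flags array to 0xff.  */
void
counters_reset (struct counters *c)
{
  memset (c, 0, sizeof (*c));
  memset (c->flags, 0xff, sizeof (c->flags));
}
```

Using @code{sizeof} on the object itself, rather than repeating its
type or a literal count, keeps the size argument correct if the
structure later changes.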
@deftypefun {wchar_t *} wmemset (wchar_t *@var{block}, wchar_t @var{wc}, size_t @var{size})
@standards{ISO, wchar.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function copies the value of @var{wc} into each of the first
@var{size} wide characters of the object beginning at @var{block}. It
returns the value of @var{block}.
@end deftypefun

@deftypefun {char *} strcpy (char *restrict @var{to}, const char *restrict @var{from})
@standards{ISO, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This copies bytes from the string @var{from} (up to and including
the terminating null byte) into the string @var{to}. Like
@code{memcpy}, this function has undefined results if the strings
overlap. The return value is the value of @var{to}.
@end deftypefun

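Since @code{strcpy} does not know the allocated size of @var{to}, the
caller has to check it first. The following sketch shows one way to do
so for a fixed-size member; @code{struct record} and
@code{record_set_name} are hypothetical names.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical structure used only for this illustration.  */
struct record
{
  char name[16];
};

/* Hypothetical helper: store NAME in R.  Returns false, leaving R
   unchanged, if NAME plus its null byte does not fit.  */
bool
record_set_name (struct record *r, const char *name)
{
  if (strlen (name) >= sizeof (r->name))
    return false;
  strcpy (r->name, name);
  return true;
}
```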
@deftypefun {wchar_t *} wcscpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom})
@standards{ISO, wchar.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This copies wide characters from the wide string @var{wfrom} (up to and
including the terminating null wide character) into the string
@var{wto}. Like @code{wmemcpy}, this function has undefined results if
the strings overlap. The return value is the value of @var{wto}.
@end deftypefun

@deftypefun {char *} strdup (const char *@var{s})
@standards{SVID, string.h}
@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
This function copies the string @var{s} into a newly
allocated string. The string is allocated using @code{malloc}; see
@ref{Unconstrained Allocation}. If @code{malloc} cannot allocate space
for the new string, @code{strdup} returns a null pointer. Otherwise it
returns a pointer to the new string.
@end deftypefun

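Because the result of @code{strdup} is ordinary @code{malloc} storage,
it can be modified freely and must eventually be passed to @code{free}.
A small sketch, using the hypothetical helper @code{upper_copy}:

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated upper-case copy of S,
   or NULL if allocation fails.  S itself is not modified; the caller
   frees the result.  */
char *
upper_copy (const char *s)
{
  char *copy = strdup (s);

  if (copy == NULL)
    return NULL;
  for (char *p = copy; *p != '\0'; p++)
    *p = (char) toupper ((unsigned char) *p);
  return copy;
}
```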
@deftypefun {wchar_t *} wcsdup (const wchar_t *@var{ws})
@standards{GNU, wchar.h}
@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
This function copies the wide string @var{ws}
into a newly allocated string. The string is allocated using
@code{malloc}; see @ref{Unconstrained Allocation}. If @code{malloc}
cannot allocate space for the new string, @code{wcsdup} returns a null
pointer. Otherwise it returns a pointer to the new wide string.

This function is a GNU extension.
@end deftypefun

@deftypefun {char *} stpcpy (char *restrict @var{to}, const char *restrict @var{from})
@standards{Unknown origin, string.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function is like @code{strcpy}, except that it returns a pointer to
the end of the string @var{to} (that is, the address of the terminating
null byte @code{to + strlen (from)}) rather than the beginning.

For example, this program uses @code{stpcpy} to concatenate @samp{foo}
and @samp{bar} to produce @samp{foobar}, which it then prints.

@smallexample
@include stpcpy.c.texi
@end smallexample

This function is part of POSIX.1-2008 and later editions, but was
available in @theglibc{} and other systems as an extension long before
it was standardized.

Its behavior is undefined if the strings overlap. The function is
declared in @file{string.h}.
@end deftypefun

8a2f1f5b 611@deftypefun {wchar_t *} wcpcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom})
d08a7e4c 612@standards{GNU, wchar.h}
11087373 613@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b
UD
614This function is like @code{wcscpy}, except that it returns a pointer to
615the end of the string @var{wto} (that is, the address of the terminating
2cc4b9cc 616null wide character @code{wto + wcslen (wfrom)}) rather than the beginning.
8a2f1f5b
UD
617
618This function is not part of ISO or POSIX but was found useful while
1f77f049 619developing @theglibc{} itself.
8a2f1f5b
UD
620
621The behavior of @code{wcpcpy} is undefined if the strings overlap.
622
623@code{wcpcpy} is a GNU extension and is declared in @file{wchar.h}.
28f540f4
RM
624@end deftypefun
625
26b4d766 626@deftypefn {Macro} {char *} strdupa (const char *@var{s})
d08a7e4c 627@standards{GNU, string.h}
11087373 628@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
976780fd 629This macro is similar to @code{strdup} but allocates the new string
dd7d45e8
UD
630using @code{alloca} instead of @code{malloc} (@pxref{Variable Size
631Automatic}). This means of course the returned string has the same
632limitations as any block of memory allocated using @code{alloca}.
706074a5 633
dd7d45e8 634For obvious reasons @code{strdupa} is implemented only as a macro;
40a55d20 635you cannot get the address of this function. Despite this limitation
706074a5
UD
636it is a useful function. The following code shows a situation where
637using @code{malloc} would be a lot more expensive.
638
639@smallexample
640@include strdupa.c.texi
641@end smallexample
642
Please note that calling @code{strtok} using @var{path} directly is
invalid. It is also not allowed to call @code{strdupa} in the argument
list of @code{strtok}, since @code{strdupa} uses @code{alloca}
(@pxref{Variable Size Automatic}), which can interfere with the
parameter passing.
706074a5
UD
648
649This function is only available if GNU CC is used.
26b4d766 650@end deftypefn
706074a5 651
0a13c9e9 652@deftypefun void bcopy (const void *@var{from}, void *@var{to}, size_t @var{size})
d08a7e4c 653@standards{BSD, string.h}
11087373 654@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
0a13c9e9
PE
655This is a partially obsolete alternative for @code{memmove}, derived from
656BSD. Note that it is not quite equivalent to @code{memmove}, because the
657arguments are not in the same order and there is no return value.
658@end deftypefun
706074a5 659
0a13c9e9 660@deftypefun void bzero (void *@var{block}, size_t @var{size})
d08a7e4c 661@standards{BSD, string.h}
0a13c9e9
PE
662@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
663This is a partially obsolete alternative for @code{memset}, derived from
664BSD. Note that it is not as general as @code{memset}, because the only
665value it can store is zero.
666@end deftypefun
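
The relationship between the BSD functions and their ISO C counterparts can be sketched as follows. The names @code{bcopy_equiv} and @code{bzero_equiv} are hypothetical stand-ins chosen for illustration; they are not how the library implements @code{bcopy} and @code{bzero}.

```c
#include <string.h>

/* Hypothetical stand-in for bcopy: the source comes first and
   nothing is returned.  Overlapping regions are handled, as with
   memmove.  */
void
bcopy_equiv (const void *from, void *to, size_t size)
{
  memmove (to, from, size);
}

/* Hypothetical stand-in for bzero: the fill value is always zero.  */
void
bzero_equiv (void *block, size_t size)
{
  memset (block, 0, size);
}
```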
706074a5 667
0a13c9e9
PE
668@node Concatenating Strings
669@section Concatenating Strings
670@pindex string.h
671@pindex wchar.h
672@cindex concatenating strings
673@cindex string concatenation functions
674
675The functions described in this section concatenate the contents of a
676string or wide string to another. They follow the string-copying
677functions in their conventions. @xref{Copying Strings and Arrays}.
678@samp{strcat} is declared in the header file @file{string.h} while
679@samp{wcscat} is declared in @file{wchar.h}.
706074a5 680
1fb22592
PE
681As noted below, these functions are problematic as their callers may
682have performance issues.
683
8a2f1f5b 684@deftypefun {char *} strcat (char *restrict @var{to}, const char *restrict @var{from})
d08a7e4c 685@standards{ISO, string.h}
11087373 686@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4 687The @code{strcat} function is similar to @code{strcpy}, except that the
2cc4b9cc
PE
688bytes from @var{from} are concatenated or appended to the end of
689@var{to}, instead of overwriting it. That is, the first byte from
690@var{from} overwrites the null byte marking the end of @var{to}.
28f540f4
RM
691
692An equivalent definition for @code{strcat} would be:
693
694@smallexample
695char *
8a2f1f5b 696strcat (char *restrict to, const char *restrict from)
28f540f4
RM
697@{
698 strcpy (to + strlen (to), from);
699 return to;
700@}
701@end smallexample
702
703This function has undefined results if the strings overlap.
0a13c9e9
PE
704
705As noted below, this function has significant performance issues.
28f540f4
RM
706@end deftypefun
707
8a2f1f5b 708@deftypefun {wchar_t *} wcscat (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom})
d08a7e4c 709@standards{ISO, wchar.h}
11087373 710@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b 711The @code{wcscat} function is similar to @code{wcscpy}, except that the
2cc4b9cc
PE
712wide characters from @var{wfrom} are concatenated or appended to the end of
713@var{wto}, instead of overwriting it. That is, the first wide character from
714@var{wfrom} overwrites the null wide character marking the end of @var{wto}.
8a2f1f5b
UD
715
716An equivalent definition for @code{wcscat} would be:
717
718@smallexample
719wchar_t *
720wcscat (wchar_t *wto, const wchar_t *wfrom)
721@{
722 wcscpy (wto + wcslen (wto), wfrom);
723 return wto;
724@}
725@end smallexample
726
727This function has undefined results if the strings overlap.
0a13c9e9
PE
728
729As noted below, this function has significant performance issues.
8a2f1f5b
UD
730@end deftypefun
731
d2fda60e
PE
Programmers using the @code{strcat} or @code{wcscat} functions (or the
@code{strlcat}, @code{strncat} and @code{wcsncat} functions defined in
a later section, for that matter)
can easily be recognized as lazy and reckless. In almost all situations
the lengths of the participating strings are known (they had better be,
since how else can one ensure that the allocated size of the buffer is
sufficient?). Or at least, one could know them if one kept track of the
results of the various function calls. But then it is very inefficient
to use @code{strcat}/@code{wcscat}. A lot of time is wasted finding the
end of the destination string so that the actual copying can start.
This is a common example:
ee2752ea 743
ee2752ea
UD
@cindex va_copy
@smallexample
/* @r{This function concatenates arbitrarily many strings. The last}
   @r{parameter must be @code{NULL}.} */
char *
concat (const char *str, @dots{})
@{
  va_list ap, ap2;
  size_t total = 1;

  va_start (ap, str);
  va_copy (ap2, ap);

  /* @r{Determine how much space we need.} */
  for (const char *s = str; s != NULL; s = va_arg (ap, const char *))
    total += strlen (s);

  va_end (ap);

  char *result = malloc (total);
  if (result != NULL)
    @{
      result[0] = '\0';

      /* @r{Copy the strings.} */
      for (const char *s = str; s != NULL; s = va_arg (ap2, const char *))
        strcat (result, s);
    @}

  va_end (ap2);

  return result;
@}
@end smallexample
778
This looks quite simple, especially the second loop where the strings
are actually copied. But these innocent lines hide a major performance
penalty. Just imagine that ten strings of 100 bytes each have to be
concatenated. For the second string we search the already stored 100
bytes for the end of the string so that we can append the next string.
For all strings in total the comparisons necessary to find the end of
the intermediate results sum up to 5500! If we combine the copying
with the allocation we can write this function more
efficiently:
ee2752ea
UD
788
@smallexample
char *
concat (const char *str, @dots{})
@{
  size_t allocated = 100;
  char *result = malloc (allocated);

  if (result != NULL)
    @{
      va_list ap;
      size_t resultlen = 0;
      char *newp;

      va_start (ap, str);

      for (const char *s = str; s != NULL; s = va_arg (ap, const char *))
        @{
          size_t len = strlen (s);

          /* @r{Resize the allocated memory if necessary.} */
          if (resultlen + len + 1 > allocated)
            @{
              allocated += len;
              newp = reallocarray (result, allocated, 2);
              allocated *= 2;
              if (newp == NULL)
                @{
                  va_end (ap);
                  free (result);
                  return NULL;
                @}
              result = newp;
            @}

          memcpy (result + resultlen, s, len);
          resultlen += len;
        @}

      /* @r{Terminate the result string.} */
      result[resultlen++] = '\0';

      /* @r{Resize memory to the optimal size.} */
      newp = realloc (result, resultlen);
      if (newp != NULL)
        result = newp;

      va_end (ap);
    @}

  return result;
@}
@end smallexample
840
With a bit more knowledge about the input strings one could fine-tune
the memory allocation. The difference we are pointing to here is that
we don't use @code{strcat} anymore. We always keep track of the length
of the current intermediate result so we can save ourselves the search for the
end of the string and use @code{memcpy}. Please note that we also
don't use @code{stpcpy} which might seem more natural since we are handling
strings. But this is not necessary since we already know the
length of the string and therefore can use the faster memory copying
function. The example would work for wide characters the same way.
ee2752ea
UD
850
851Whenever a programmer feels the need to use @code{strcat} she or he
f0f308c1 852should think twice and look through the program to see whether the code cannot
1fb22592 853be rewritten to take advantage of already calculated results.
d2fda60e
PE
854The related functions @code{strlcat}, @code{strncat},
855@code{wcscat} and @code{wcsncat}
1fb22592
PE
856are almost always unnecessary, too.
857Again: it is almost always unnecessary to use functions like @code{strcat}.
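
For the simplest case of joining two strings, reusing already calculated lengths might look like the following. This is a minimal sketch; the name @code{concat_pair} is hypothetical and not part of the library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string holding A
   followed by B, or NULL if allocation fails.  Each length is
   computed once and then reused for both sizing and copying.  */
char *
concat_pair (const char *a, const char *b)
{
  size_t alen = strlen (a);
  size_t blen = strlen (b);
  char *result = malloc (alen + blen + 1);

  if (result != NULL)
    {
      memcpy (result, a, alen);
      /* Copy B including its terminating null byte.  */
      memcpy (result + alen, b, blen + 1);
    }
  return result;
}
```

No byte of the destination is ever scanned to find its end, which is the work that @code{strcat} would repeat.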
ee2752ea 858
0a13c9e9
PE
859@node Truncating Strings
860@section Truncating Strings while Copying
861@cindex truncating strings
862@cindex string truncation
863
864The functions described in this section copy or concatenate the
865possibly-truncated contents of a string or array to another, and
866similarly for wide strings. They follow the string-copying functions
867in their header conventions. @xref{Copying Strings and Arrays}. The
868@samp{str} functions are declared in the header file @file{string.h}
869and the @samp{wc} functions are declared in the file @file{wchar.h}.
870
1fb22592
PE
871As noted below, these functions are problematic as their callers may
872have truncation-related bugs and performance issues.
873
0a13c9e9 874@deftypefun {char *} strncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
a448ee41 875@standards{C90, string.h}
0a13c9e9
PE
876@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
877This function is similar to @code{strcpy} but always copies exactly
878@var{size} bytes into @var{to}.
879
880If @var{from} does not contain a null byte in its first @var{size}
881bytes, @code{strncpy} copies just the first @var{size} bytes. In this
882case no null terminator is written into @var{to}.
883
884Otherwise @var{from} must be a string with length less than
885@var{size}. In this case @code{strncpy} copies all of @var{from},
886followed by enough null bytes to add up to @var{size} bytes in all.
887
888The behavior of @code{strncpy} is undefined if the strings overlap.
889
890This function was designed for now-rarely-used arrays consisting of
891non-null bytes followed by zero or more null bytes. It needs to set
892all @var{size} bytes of the destination, even when @var{size} is much
893greater than the length of @var{from}. As noted below, this function
1fb22592 894is generally a poor choice for processing strings.
0a13c9e9
PE
895@end deftypefun
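
The two cases described above can be seen with a fixed-size field. The following is a sketch only; @code{FIELD_SIZE} and @code{set_field} are hypothetical names introduced for illustration.

```c
#include <string.h>

#define FIELD_SIZE 8

/* Hypothetical helper: store NAME in a fixed-size, null-padded field.
   The field is NOT a string when NAME has FIELD_SIZE or more bytes,
   because strncpy then writes no terminating null byte.  */
void
set_field (char field[FIELD_SIZE], const char *name)
{
  strncpy (field, name, FIELD_SIZE);
}
```

Storing @code{"abc"} leaves three bytes followed by five null bytes, while storing @code{"abcdefghij"} leaves the eight bytes @code{abcdefgh} with no terminator, so the field must not be passed to functions that expect a string.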
896
0a13c9e9 897@deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
d08a7e4c 898@standards{ISO, wchar.h}
0a13c9e9
PE
899@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
900This function is similar to @code{wcscpy} but always copies exactly
901@var{size} wide characters into @var{wto}.
902
903If @var{wfrom} does not contain a null wide character in its first
904@var{size} wide characters, then @code{wcsncpy} copies just the first
905@var{size} wide characters. In this case no null terminator is
906written into @var{wto}.
907
908Otherwise @var{wfrom} must be a wide string with length less than
909@var{size}. In this case @code{wcsncpy} copies all of @var{wfrom},
910followed by enough null wide characters to add up to @var{size} wide
911characters in all.
912
913The behavior of @code{wcsncpy} is undefined if the strings overlap.
914
915This function is the wide-character counterpart of @code{strncpy} and
916suffers from most of the problems that @code{strncpy} does. For
917example, as noted below, this function is generally a poor choice for
1fb22592 918processing strings.
0a13c9e9
PE
919@end deftypefun
920
0a13c9e9 921@deftypefun {char *} strndup (const char *@var{s}, size_t @var{size})
d08a7e4c 922@standards{GNU, string.h}
0a13c9e9
PE
923@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
924This function is similar to @code{strdup} but always copies at most
925@var{size} bytes into the newly allocated string.
926
927If the length of @var{s} is more than @var{size}, then @code{strndup}
928copies just the first @var{size} bytes and adds a closing null byte.
929Otherwise all bytes are copied and the string is terminated.
930
931This function differs from @code{strncpy} in that it always terminates
932the destination string.
933
934As noted below, this function is generally a poor choice for
1fb22592 935processing strings.
0a13c9e9
PE
936
937@code{strndup} is a GNU extension.
938@end deftypefun
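
One situation where the application's logic genuinely calls for a bounded copy is extracting a leading part of a longer string. The following sketch assumes a @samp{KEY=VALUE} input format; the helper name @code{dup_key} is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Hypothetical helper: duplicate the part of LINE before the first
   '=' (or all of LINE if there is none).  The caller frees the
   result; NULL means allocation failed.  */
char *
dup_key (const char *line)
{
  size_t keylen = strcspn (line, "=");
  return strndup (line, keylen);
}
```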
939
0a13c9e9 940@deftypefn {Macro} {char *} strndupa (const char *@var{s}, size_t @var{size})
d08a7e4c 941@standards{GNU, string.h}
0a13c9e9
PE
942@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function is similar to @code{strndup} but like @code{strdupa} it
allocates the new string using @code{alloca} (@pxref{Variable Size
Automatic}). The same advantages and limitations of @code{strdupa}
apply to @code{strndupa}, too.
947
948This function is implemented only as a macro, just like @code{strdupa}.
949Just as @code{strdupa} this macro also must not be used inside the
950parameter list in a function call.
951
952As noted below, this function is generally a poor choice for
1fb22592 953processing strings.
0a13c9e9
PE
954
955@code{strndupa} is only available if GNU CC is used.
956@end deftypefn
957
0a13c9e9 958@deftypefun {char *} stpncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
d08a7e4c 959@standards{GNU, string.h}
0a13c9e9
PE
960@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function is similar to @code{stpcpy} but always copies exactly
@var{size} bytes into @var{to}.

If the length of @var{from} is at least @var{size}, then @code{stpncpy}
copies just the first @var{size} bytes and returns a pointer to the
byte directly following the one which was copied last. Note that in
this case there is no null terminator written into @var{to}.

If the length of @var{from} is less than @var{size}, then @code{stpncpy}
copies all of @var{from}, followed by enough null bytes to add up
to @var{size} bytes in all. This padding is rarely useful, but it
is provided for compatibility with contexts that rely on the same
behavior of @code{strncpy}. @code{stpncpy} returns a pointer to the
@emph{first} written null byte.
975
976This function is not part of ISO or POSIX but was found useful while
977developing @theglibc{} itself.
978
979Its behavior is undefined if the strings overlap. The function is
980declared in @file{string.h}.
981
982As noted below, this function is generally a poor choice for
1fb22592 983processing strings.
0a13c9e9
PE
984@end deftypefun
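
The returned pointer makes it easy to learn how many non-null bytes ended up in the destination. The following is a minimal sketch; the helper name @code{fill_field} is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Hypothetical helper: fill the SIZE-byte field TO from FROM and
   return how many non-null bytes were stored.  */
size_t
fill_field (char *to, const char *from, size_t size)
{
  char *end = stpncpy (to, from, size);
  return (size_t) (end - to);
}
```

A result equal to @var{size} means the field is full and carries no null terminator.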
985
0a13c9e9 986@deftypefun {wchar_t *} wcpncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
d08a7e4c 987@standards{GNU, wchar.h}
0a13c9e9
PE
988@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function is similar to @code{wcpcpy} but always copies exactly
@var{size} wide characters into @var{wto}.

If the length of @var{wfrom} is at least @var{size}, then
@code{wcpncpy} copies just the first @var{size} wide characters and
returns a pointer to the wide character directly following the last
wide character which was copied. Note that in this case
there is no null terminator written into @var{wto}.

If the length of @var{wfrom} is less than @var{size}, then @code{wcpncpy}
copies all of @var{wfrom}, followed by enough null wide characters to add up
to @var{size} wide characters in all. This padding is rarely useful, but it
is provided for compatibility with contexts that rely on the same
behavior of @code{wcsncpy}. @code{wcpncpy} returns a pointer to the
@emph{first} written null wide character.
1004
1005This function is not part of ISO or POSIX but was found useful while
1006developing @theglibc{} itself.
1007
1008Its behavior is undefined if the strings overlap.
1009
1010As noted below, this function is generally a poor choice for
1fb22592 1011processing strings.
0a13c9e9
PE
1012
1013@code{wcpncpy} is a GNU extension.
1014@end deftypefun
1015
8a2f1f5b 1016@deftypefun {char *} strncat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
d08a7e4c 1017@standards{ISO, string.h}
11087373 1018@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4 1019This function is like @code{strcat} except that not more than @var{size}
2cc4b9cc
PE
1020bytes from @var{from} are appended to the end of @var{to}, and
1021@var{from} need not be null-terminated. A single null byte is also
1022always appended to @var{to}, so the total
28f540f4
RM
1023allocated size of @var{to} must be at least @code{@var{size} + 1} bytes
1024longer than its initial length.
1025
1026The @code{strncat} function could be implemented like this:
1027
1028@smallexample
1029@group
1030char *
1031strncat (char *to, const char *from, size_t size)
1032@{
5d1d4918
PE
1033 size_t len = strlen (to);
1034 memcpy (to + len, from, strnlen (from, size));
1035 to[len + strnlen (from, size)] = '\0';
28f540f4
RM
1036 return to;
1037@}
1038@end group
1039@end smallexample
1040
1041The behavior of @code{strncat} is undefined if the strings overlap.
0a13c9e9
PE
1042
As a companion to @code{strncpy}, @code{strncat} was designed for
now-rarely-used arrays consisting of non-null bytes followed by zero
or more null bytes. However, as noted below, this function is generally a poor
choice for processing strings. Also, this function has significant
performance issues. @xref{Concatenating Strings}.
28f540f4
RM
1048@end deftypefun
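
The kind of array that @code{strncat} was designed for can be illustrated with a fixed-size record field. This is a sketch under assumed names: @code{struct record} and @code{append_code} are hypothetical.

```c
#include <string.h>

/* Hypothetical record layout: a 4-byte code that is null-padded but
   not necessarily null-terminated.  */
struct record
{
  char code[4];
};

/* Append the code of R to the string TO.  The caller must provide
   room for up to sizeof r->code more bytes plus the null byte.  */
void
append_code (char *to, const struct record *r)
{
  strncat (to, r->code, sizeof r->code);
}
```

The size argument bounds how much of the source is read, not how much room the destination has; checking the destination remains the caller's job.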
1049
8a2f1f5b 1050@deftypefun {wchar_t *} wcsncat (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
d08a7e4c 1051@standards{ISO, wchar.h}
11087373 1052@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function is like @code{wcscat} except that not more than @var{size}
wide characters from @var{wfrom} are appended to the end of @var{wto},
and @var{wfrom} need not be null-terminated. A single null wide
character is also always appended to @var{wto}, so the total allocated
size of @var{wto} must be at least @code{wcsnlen (@var{wfrom},
@var{size}) + 1} wide characters longer than its initial length.
8a2f1f5b
UD
1059
1060The @code{wcsncat} function could be implemented like this:
1061
1062@smallexample
1063@group
1064wchar_t *
1065wcsncat (wchar_t *restrict wto, const wchar_t *restrict wfrom,
1066 size_t size)
1067@{
5d1d4918
PE
1068 size_t len = wcslen (wto);
1069 memcpy (wto + len, wfrom, wcsnlen (wfrom, size) * sizeof (wchar_t));
1070 wto[len + wcsnlen (wfrom, size)] = L'\0';
8a2f1f5b
UD
1071 return wto;
1072@}
1073@end group
1074@end smallexample
1075
1076The behavior of @code{wcsncat} is undefined if the strings overlap.
28f540f4 1077
0a13c9e9 1078As noted below, this function is generally a poor choice for
1fb22592 1079processing strings. Also, this function has significant performance
0a13c9e9
PE
1080issues. @xref{Concatenating Strings}.
1081@end deftypefun
1082
d2fda60e
PE
1083@deftypefun size_t strlcpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
1084@standards{BSD, string.h}
1085@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1086This function copies the string @var{from} to the destination array
1087@var{to}, limiting the result's size (including the null terminator)
1088to @var{size}. The caller should ensure that @var{size} includes room
1089for the result's terminating null byte.
1090
1091If @var{size} is greater than the length of the string @var{from},
1092this function copies the non-null bytes of the string
1093@var{from} to the destination array @var{to},
1094and terminates the copy with a null byte. Like other
1095string functions such as @code{strcpy}, but unlike @code{strncpy}, any
1096remaining bytes in the destination array remain unchanged.
1097
If @var{size} is nonzero and less than or equal to the length of the string
@var{from}, this function copies only the first @samp{@var{size} - 1}
bytes to the destination array @var{to}, and writes a terminating null
byte to the last byte of the array.
1102
1103This function returns the length of the string @var{from}. This means
1104that truncation occurs if and only if the returned value is greater
1105than or equal to @var{size}.
1106
1107The behavior is undefined if @var{to} or @var{from} is a null pointer,
1108or if the destination array's size is less than @var{size}, or if the
1109string @var{from} overlaps the first @var{size} bytes of the
1110destination array.
1111
1112As noted below, this function is generally a poor choice for
1113processing strings. Also, this function has a performance issue,
1114as its time cost is proportional to the length of @var{from}
1115even when @var{size} is small.
1116
1117This function is derived from OpenBSD 2.4.
1118@end deftypefun
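
The return-value convention can be seen in a small reimplementation of the semantics described above. This is only an illustrative sketch under the hypothetical name @code{sketch_strlcpy}; it is not the library's implementation.

```c
#include <string.h>

/* Hypothetical reimplementation of the documented semantics.  */
size_t
sketch_strlcpy (char *to, const char *from, size_t size)
{
  size_t len = strlen (from);      /* Always scans all of FROM.  */

  if (size != 0)
    {
      /* Copy what fits, leaving room for the null byte.  */
      size_t n = len < size ? len : size - 1;
      memcpy (to, from, n);
      to[n] = '\0';
    }
  return len;
}
```

A caller detects truncation by comparing the result with @var{size}: copying @code{"hello"} into a 4-byte array returns 5, which is not less than 4, and stores @code{"hel"}.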
1119
1120@deftypefun size_t wcslcpy (wchar_t *restrict @var{to}, const wchar_t *restrict @var{from}, size_t @var{size})
@standards{BSD, wchar.h}
1122@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1123This function is a variant of @code{strlcpy} for wide strings.
1124The @var{size} argument counts the length of the destination buffer in
1125wide characters (and not bytes).
1126
1127This function is derived from BSD.
1128@end deftypefun
1129
1130@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
1131@standards{BSD, string.h}
1132@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1133This function appends the string @var{from} to the
1134string @var{to}, limiting the result's total size (including the null
1135terminator) to @var{size}. The caller should ensure that @var{size}
1136includes room for the result's terminating null byte.
1137
1138This function copies as much as possible of the string @var{from} into
1139the array at @var{to} of @var{size} bytes, starting at the terminating
1140null byte of the original string @var{to}. In effect, this appends
1141the string @var{from} to the string @var{to}. Although the resulting
1142string will contain a null terminator, it can be truncated (not all
1143bytes in @var{from} may be copied).
1144
1145This function returns the sum of the original length of @var{to} and
1146the length of @var{from}. This means that truncation occurs if and
1147only if the returned value is greater than or equal to @var{size}.
1148
1149The behavior is undefined if @var{to} or @var{from} is a null pointer,
1150or if the destination array's size is less than @var{size}, or if the
1151destination array does not contain a null byte in its first @var{size}
1152bytes, or if the string @var{from} overlaps the first @var{size} bytes
1153of the destination array.
1154
1155As noted below, this function is generally a poor choice for
1156processing strings. Also, this function has significant performance
1157issues. @xref{Concatenating Strings}.
1158
1159This function is derived from OpenBSD 2.4.
1160@end deftypefun
1161
1162@deftypefun size_t wcslcat (wchar_t *restrict @var{to}, const wchar_t *restrict @var{from}, size_t @var{size})
@standards{BSD, wchar.h}
1164@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1165This function is a variant of @code{strlcat} for wide strings.
1166The @var{size} argument counts the length of the destination buffer in
1167wide characters (and not bytes).
1168
1169This function is derived from BSD.
1170@end deftypefun
1171
Because these functions can abruptly truncate strings or wide strings,
they are generally poor choices for processing them. When copying or
concatenating multibyte strings, they can truncate within a multibyte
character so that the result is not a valid multibyte string. When
combining or concatenating multibyte or wide strings, they may
truncate the output after a combining character, resulting in a
corrupted grapheme. They can cause bugs even when processing
single-byte strings: for example, when calculating an ASCII-only user
name, a truncated name can identify the wrong user.
1181
1182Although some buffer overruns can be prevented by manually replacing
1183calls to copying functions with calls to truncation functions, there
54ae6d81
PE
1184are often easier and safer automatic techniques, such as fortification
1185(@pxref{Source Fortification}) and AddressSanitizer
1186(@pxref{Instrumentation Options,, Program Instrumentation Options, gcc, Using GCC}).
1187Because truncation functions can mask
0a13c9e9
PE
1188application bugs that would otherwise be caught by the automatic
1189techniques, these functions should be used only when the application's
1190underlying logic requires truncation.
1191
1192@strong{Note:} GNU programs should not truncate strings or wide
1193strings to fit arbitrary size limits. @xref{Semantics, , Writing
1194Robust Programs, standards, The GNU Coding Standards}. Instead of
1195string-truncation functions, it is usually better to use dynamic
1196memory allocation (@pxref{Unconstrained Allocation}) and functions
1197such as @code{strdup} or @code{asprintf} to construct strings.
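
Sizing the result to fit, rather than truncating it, might be sketched as follows. To stay within ISO C this sketch measures with @code{snprintf} and then allocates, which stands in for what a single @code{asprintf} call does; the helper name @code{make_greeting} is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build "Hello, NAME!" in freshly allocated
   memory sized to fit, so nothing is truncated.  Returns NULL on
   failure; the caller frees the result.  */
char *
make_greeting (const char *name)
{
  /* A first pass with no buffer only computes the needed length.  */
  int len = snprintf (NULL, 0, "Hello, %s!", name);
  if (len < 0)
    return NULL;

  char *result = malloc ((size_t) len + 1);
  if (result != NULL)
    snprintf (result, (size_t) len + 1, "Hello, %s!", name);
  return result;
}
```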
28f540f4 1198
b4012b75 1199@node String/Array Comparison
28f540f4
RM
1200@section String/Array Comparison
1201@cindex comparing strings and arrays
1202@cindex string comparison functions
1203@cindex array comparison functions
1204@cindex predicates on strings
1205@cindex predicates on arrays
1206
1207You can use the functions in this section to perform comparisons on the
1208contents of strings and arrays. As well as checking for equality, these
1209functions can also be used as the ordering functions for sorting
1210operations. @xref{Searching and Sorting}, for an example of this.
1211
1212Unlike most comparison operations in C, the string comparison functions
1213return a nonzero value if the strings are @emph{not} equivalent rather
1214than if they are. The sign of the value indicates the relative ordering
2cc4b9cc 1215of the first part of the strings that are not equivalent: a
28f540f4 1216negative value indicates that the first string is ``less'' than the
a5113b14 1217second, while a positive value indicates that the first string is
28f540f4
RM
1218``greater''.
1219
1220The most common use of these functions is to check only for equality.
1221This is canonically done with an expression like @w{@samp{! strcmp (s1, s2)}}.
1222
1223All of these functions are declared in the header file @file{string.h}.
1224@pindex string.h
1225
28f540f4 1226@deftypefun int memcmp (const void *@var{a1}, const void *@var{a2}, size_t @var{size})
d08a7e4c 1227@standards{ISO, string.h}
11087373 1228@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4
RM
1229The function @code{memcmp} compares the @var{size} bytes of memory
1230beginning at @var{a1} against the @var{size} bytes of memory beginning
1231at @var{a2}. The value returned has the same sign as the difference
1232between the first differing pair of bytes (interpreted as @code{unsigned
1233char} objects, then promoted to @code{int}).
1234
1235If the contents of the two blocks are equal, @code{memcmp} returns
1236@code{0}.
1237@end deftypefun
1238
8a2f1f5b 1239@deftypefun int wmemcmp (const wchar_t *@var{a1}, const wchar_t *@var{a2}, size_t @var{size})
d08a7e4c 1240@standards{ISO, wchar.h}
11087373 1241@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b
UD
The function @code{wmemcmp} compares the @var{size} wide characters
beginning at @var{a1} against the @var{size} wide characters beginning
at @var{a2}. The value returned is smaller than or larger than zero
depending on whether the first differing wide character in @var{a1} is
smaller or larger than the corresponding wide character in @var{a2}.
8a2f1f5b
UD
1247
1248If the contents of the two blocks are equal, @code{wmemcmp} returns
1249@code{0}.
1250@end deftypefun
1251
28f540f4
RM
1252On arbitrary arrays, the @code{memcmp} function is mostly useful for
1253testing equality. It usually isn't meaningful to do byte-wise ordering
1254comparisons on arrays of things other than bytes. For example, a
1255byte-wise comparison on the bytes that make up floating-point numbers
1256isn't likely to tell you anything about the relationship between the
1257values of the floating-point numbers.
1258
8a2f1f5b
UD
1259@code{wmemcmp} is really only useful to compare arrays of type
1260@code{wchar_t} since the function looks at @code{sizeof (wchar_t)} bytes
1261at a time and this number of bytes is system dependent.
1262
28f540f4
RM
1263You should also be careful about using @code{memcmp} to compare objects
1264that can contain ``holes'', such as the padding inserted into structure
1265objects to enforce alignment requirements, extra space at the end of
2cc4b9cc 1266unions, and extra bytes at the ends of strings whose length is less
28f540f4
RM
1267than their allocated size. The contents of these ``holes'' are
1268indeterminate and may cause strange behavior when performing byte-wise
1269comparisons. For more predictable results, perform an explicit
1270component-wise comparison.
1271
1272For example, given a structure type definition like:
1273
1274@smallexample
1275struct foo
1276 @{
1277 unsigned char tag;
1278 union
1279 @{
1280 double f;
1281 long i;
1282 char *p;
1283 @} value;
1284 @};
1285@end smallexample
1286
1287@noindent
1288you are better off writing a specialized comparison function to compare
1289@code{struct foo} objects instead of comparing them with @code{memcmp}.
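
Such a specialized comparison might look like the following sketch. The meaning of @code{tag} is not specified above, so the enumeration that maps tag values to union members is an assumption made purely for illustration, as is the name @code{foo_equal}.

```c
#include <stdbool.h>

struct foo
  {
    unsigned char tag;
    union
      {
        double f;
        long i;
        char *p;
      } value;
  };

/* Assumed for illustration: which union member TAG selects.  */
enum { FOO_DOUBLE, FOO_LONG, FOO_POINTER };

/* Return true if A and B have the same tag and the same active
   member; padding and unused union bytes are never examined.  */
bool
foo_equal (const struct foo *a, const struct foo *b)
{
  if (a->tag != b->tag)
    return false;
  switch (a->tag)
    {
    case FOO_DOUBLE:
      return a->value.f == b->value.f;
    case FOO_LONG:
      return a->value.i == b->value.i;
    case FOO_POINTER:
      return a->value.p == b->value.p;
    default:
      return false;
    }
}
```

Two objects whose padding bytes differ still compare equal here, which @code{memcmp} could not guarantee.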
1290
28f540f4 1291@deftypefun int strcmp (const char *@var{s1}, const char *@var{s2})
d08a7e4c 1292@standards{ISO, string.h}
11087373 1293@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4
RM
1294The @code{strcmp} function compares the string @var{s1} against
1295@var{s2}, returning a value that has the same sign as the difference
2cc4b9cc 1296between the first differing pair of bytes (interpreted as
28f540f4
RM
1297@code{unsigned char} objects, then promoted to @code{int}).
1298
1299If the two strings are equal, @code{strcmp} returns @code{0}.
1300
1301A consequence of the ordering used by @code{strcmp} is that if @var{s1}
1302is an initial substring of @var{s2}, then @var{s1} is considered to be
1303``less than'' @var{s2}.
8a2f1f5b
UD
1304
1305@code{strcmp} does not take sorting conventions of the language the
1306strings are written in into account. To get that one has to use
1307@code{strcoll}.
1308@end deftypefun
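
Using @code{strcmp} as an ordering function for sorting could be sketched like this. The names @code{compare_strings} and @code{sort_strings} are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements, so each argument is
   really a `const char *const *'.  */
static int
compare_strings (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcmp (*sa, *sb);
}

/* Hypothetical helper: sort COUNT strings by byte value.  */
void
sort_strings (const char **strings, size_t count)
{
  qsort (strings, count, sizeof *strings, compare_strings);
}
```

Sorting @code{"pear"}, @code{"apple"} and @code{"app"} this way puts @code{"app"} before @code{"apple"}, because an initial substring is considered ``less than'' the longer string.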
1309
8a2f1f5b 1310@deftypefun int wcscmp (const wchar_t *@var{ws1}, const wchar_t *@var{ws2})
d08a7e4c 1311@standards{ISO, wchar.h}
11087373 1312@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b 1313
The @code{wcscmp} function compares the wide string @var{ws1}
against @var{ws2}. The value returned is smaller than or larger than zero
depending on whether the first differing wide character in @var{ws1} is
smaller or larger than the corresponding wide character in @var{ws2}.
8a2f1f5b
UD
1318
1319If the two strings are equal, @code{wcscmp} returns @code{0}.
1320
1321A consequence of the ordering used by @code{wcscmp} is that if @var{ws1}
1322is an initial substring of @var{ws2}, then @var{ws1} is considered to be
1323``less than'' @var{ws2}.
1324
1325@code{wcscmp} does not take sorting conventions of the language the
1326strings are written in into account. To get that one has to use
1327@code{wcscoll}.
28f540f4
RM
1328@end deftypefun
1329
28f540f4 1330@deftypefun int strcasecmp (const char *@var{s1}, const char *@var{s2})
d08a7e4c 1331@standards{BSD, string.h}
11087373
AO
1332@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
1333@c Although this calls tolower multiple times, it's a macro, and
1334@c strcasecmp is optimized so that the locale pointer is read only once.
1335@c There are some asm implementations too, for which the single-read
1336@c from locale TLS pointers also applies.
4547c1a4 1337This function is like @code{strcmp}, except that differences in case are
2cc4b9cc
PE
1338ignored, and its arguments must be multibyte strings.
1339How uppercase and lowercase characters are related is
4547c1a4
UD
1340determined by the currently selected locale. In the standard @code{"C"}
1341locale the characters @"A and @"a do not match but in a locale which
dd7d45e8 1342regards these characters as parts of the alphabet they do match.
28f540f4 1343
85c165be 1344@noindent
28f540f4
RM
1345@code{strcasecmp} is derived from BSD.
1346@end deftypefun
1347
8ded91fb 1348@deftypefun int wcscasecmp (const wchar_t *@var{ws1}, const wchar_t *@var{ws2})
d08a7e4c 1349@standards{GNU, wchar.h}
11087373
AO
1350@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
1351@c Since towlower is not a macro, the locale object may be read multiple
1352@c times.
8a2f1f5b
UD
1353This function is like @code{wcscmp}, except that differences in case are
1354ignored. How uppercase and lowercase characters are related is
1355determined by the currently selected locale. In the standard @code{"C"}
1356locale the characters @"A and @"a do not match but in a locale which
1357regards these characters as parts of the alphabet they do match.
1358
1359@noindent
1360@code{wcscasecmp} is a GNU extension.
1361@end deftypefun
1362
8a2f1f5b 1363@deftypefun int strncmp (const char *@var{s1}, const char *@var{s2}, size_t @var{size})
d08a7e4c 1364@standards{ISO, string.h}
11087373 1365@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function is similar to @code{strcmp}, except that no more than
2cc4b9cc
PE
1367@var{size} bytes are compared. In other words, if the two
1368strings are the same in their first @var{size} bytes, the
8a2f1f5b
UD
1369return value is zero.
1370@end deftypefun
1371
8a2f1f5b 1372@deftypefun int wcsncmp (const wchar_t *@var{ws1}, const wchar_t *@var{ws2}, size_t @var{size})
d08a7e4c 1373@standards{ISO, wchar.h}
11087373 1374@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
f0f308c1 1375This function is similar to @code{wcscmp}, except that no more than
8a2f1f5b
UD
1376@var{size} wide characters are compared. In other words, if the two
1377strings are the same in their first @var{size} wide characters, the
1378return value is zero.
1379@end deftypefun
1380
28f540f4 1381@deftypefun int strncasecmp (const char *@var{s1}, const char *@var{s2}, size_t @var{n})
d08a7e4c 1382@standards{BSD, string.h}
11087373 1383@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
28f540f4 1384This function is like @code{strncmp}, except that differences in case
2cc4b9cc
PE
1385are ignored, and the compared parts of the arguments should consist of
1386valid multibyte characters.
1387Like @code{strcasecmp}, it is locale dependent how
dd7d45e8 1388uppercase and lowercase characters are related.
28f540f4 1389
85c165be 1390@noindent
28f540f4
RM
@code{strncasecmp} is derived from BSD.
1392@end deftypefun
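
A common use of @code{strncasecmp} is testing whether a string starts
with a given keyword regardless of case. The following is a minimal
sketch, not part of the library; the name
@code{has_prefix_ignoring_case} is hypothetical, and the header
@file{strings.h} is included because that is where POSIX declares the
function.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>   /* POSIX declares strncasecmp here. */

/* Hypothetical helper: return true if S begins with PREFIX,
   ignoring case as defined by the current locale.  */
bool
has_prefix_ignoring_case (const char *s, const char *prefix)
{
  /* Compare only as many bytes as PREFIX contains.  If S is
     shorter, its terminating null byte differs from the
     corresponding byte of PREFIX and the result is nonzero.  */
  return strncasecmp (s, prefix, strlen (prefix)) == 0;
}
```

With this sketch, @code{has_prefix_ignoring_case ("Content-Type: text/plain", "content-type:")}
is true, while a string shorter than the prefix never matches.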
1393
@deftypefun int wcsncasecmp (const wchar_t *@var{ws1}, const wchar_t *@var{ws2}, size_t @var{n})
d08a7e4c 1395@standards{GNU, wchar.h}
11087373 1396@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
8a2f1f5b
UD
1397This function is like @code{wcsncmp}, except that differences in case
1398are ignored. Like @code{wcscasecmp}, it is locale dependent how
1399uppercase and lowercase characters are related.
1400
1401@noindent
1402@code{wcsncasecmp} is a GNU extension.
28f540f4
RM
1403@end deftypefun
1404
8a2f1f5b
UD
1405Here are some examples showing the use of @code{strcmp} and
1406@code{strncmp} (equivalent examples can be constructed for the wide
1407character functions). These examples assume the use of the ASCII
1408character set. (If some other character set---say, EBCDIC---is used
1409instead, then the glyphs are associated with different numeric codes,
1410and the return values and ordering may differ.)
28f540f4
RM
1411
1412@smallexample
1413strcmp ("hello", "hello")
1414 @result{} 0 /* @r{These two strings are the same.} */
1415strcmp ("hello", "Hello")
1416 @result{} 32 /* @r{Comparisons are case-sensitive.} */
1417strcmp ("hello", "world")
2cc4b9cc 1418 @result{} -15 /* @r{The byte @code{'h'} comes before @code{'w'}.} */
28f540f4 1419strcmp ("hello", "hello, world")
2cc4b9cc 1420 @result{} -44 /* @r{Comparing a null byte against a comma.} */
6952e59e 1421strncmp ("hello", "hello, world", 5)
2cc4b9cc 1422 @result{} 0 /* @r{The initial 5 bytes are the same.} */
28f540f4 1423strncmp ("hello, world", "hello, stupid world!!!", 5)
2cc4b9cc 1424 @result{} 0 /* @r{The initial 5 bytes are the same.} */
28f540f4
RM
1425@end smallexample
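
The magnitudes shown above (32, @minus{}15, @minus{}44) depend on the
character set, and ISO C guarantees only the sign of the result.
Portable code should therefore test the result against zero rather than
against a particular value. As a rough sketch, a hypothetical helper
named @code{strcmp_sign} could reduce the result to @minus{}1, 0, or 1:

```c
#include <string.h>

/* Hypothetical helper: reduce the result of strcmp to -1, 0, or 1
   so that callers never depend on its magnitude.  */
int
strcmp_sign (const char *s1, const char *s2)
{
  int r = strcmp (s1, s2);

  return (r > 0) - (r < 0);
}
```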
1426
1f205a47 1427@deftypefun int strverscmp (const char *@var{s1}, const char *@var{s2})
d08a7e4c 1428@standards{GNU, string.h}
11087373
AO
1429@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
1430@c Calls isdigit multiple times, locale may change in between.
1f205a47 1431The @code{strverscmp} function compares the string @var{s1} against
f2282d42
RM
1432@var{s2}, considering them as holding indices/version numbers. The
1433return value follows the same conventions as found in the
1434@code{strcmp} function. In fact, if @var{s1} and @var{s2} contain no
f4a36548
FW
1435digits, @code{strverscmp} behaves like @code{strcmp}
1436(in the sense that the sign of the result is the same).
1f205a47 1437
f4a36548
FW
1438The comparison algorithm which the @code{strverscmp} function implements
1439differs slightly from other version-comparison algorithms. The
1440implementation is based on a finite-state machine, whose behavior is
1441approximated below.
1f205a47
UD
1442
1443@itemize @bullet
1444@item
f4a36548
FW
1445The input strings are each split into sequences of non-digits and
1446digits. These sequences can be empty at the beginning and end of the
1447string. Digits are determined by the @code{isdigit} function and are
1448thus subject to the current locale.
1f205a47
UD
1449
1450@item
f4a36548
FW
Comparison starts with a (possibly empty) non-digit sequence. The first
pair of corresponding sequences (non-digits or digits) that differ
determines the outcome of the comparison.
1f205a47
UD
1454
1455@item
f4a36548
FW
1456Corresponding non-digit sequences in both strings are compared
1457lexicographically if their lengths are equal. If the lengths differ,
1458the shorter non-digit sequence is extended with the input string
1459character immediately following it (which may be the null terminator),
1460the other sequence is truncated to be of the same (extended) length, and
1461these two sequences are compared lexicographically. In the last case,
1462the sequence comparison determines the result of the function because
1463the extension character (or some character before it) is necessarily
1464different from the character at the same offset in the other input
1465string.
1466
1467@item
1468For two sequences of digits, the number of leading zeros is counted (which
1469can be zero). If the count differs, the string with more leading zeros
1470in the digit sequence is considered smaller than the other string.
1471
1472@item
1473If the two sequences of digits have no leading zeros, they are compared
1474as integers, that is, the string with the longer digit sequence is
1475deemed larger, and if both sequences are of equal length, they are
1476compared lexicographically.
1477
1478@item
1479If both digit sequences start with a zero and have an equal number of
1480leading zeros, they are compared lexicographically if their lengths are
1481the same. If the lengths differ, the shorter sequence is extended with
1482the following character in its input string, and the other sequence is
1483truncated to the same length, and both sequences are compared
1484lexicographically (similar to the non-digit sequence case above).
1f205a47
UD
1485@end itemize
1486
f4a36548
FW
1487The treatment of leading zeros and the tie-breaking extension characters
1488(which in effect propagate across non-digit/digit sequence boundaries)
1489differs from other version-comparison algorithms.
1490
1f205a47
UD
1491@smallexample
1492strverscmp ("no digit", "no digit")
0bc93a2f 1493 @result{} 0 /* @r{same behavior as strcmp.} */
1f205a47
UD
1494strverscmp ("item#99", "item#100")
1495 @result{} <0 /* @r{same prefix, but 99 < 100.} */
1496strverscmp ("alpha1", "alpha001")
f4a36548 1497 @result{} >0 /* @r{different number of leading zeros (0 and 2).} */
1f205a47 1498strverscmp ("part1_f012", "part1_f01")
f4a36548 1499 @result{} >0 /* @r{lexicographical comparison with leading zeros.} */
1f205a47 1500strverscmp ("foo.009", "foo.0")
f4a36548 1501 @result{} <0 /* @r{different number of leading zeros (2 and 1).} */
1f205a47
UD
1502@end smallexample
1503
1f205a47
UD
1504@code{strverscmp} is a GNU extension.
1505@end deftypefun
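
Because @code{strverscmp} follows the same return conventions as
@code{strcmp}, it can serve as the basis of a @code{qsort} comparison
function. The following is only a sketch; the names
@code{compare_versions} and @code{sort_by_version} are hypothetical,
and @code{_GNU_SOURCE} is assumed to be defined before any header is
included so that the declaration is visible.

```c
#define _GNU_SOURCE   /* strverscmp is a GNU extension. */
#include <stdlib.h>
#include <string.h>

/* Comparison function for qsort.  Each argument points to an
   element of the array, that is, to a char *.  */
static int
compare_versions (const void *v1, const void *v2)
{
  char * const *p1 = v1;
  char * const *p2 = v2;

  return strverscmp (*p1, *p2);
}

/* Hypothetical entry point: sort NSTRINGS strings in ARRAY so that
   embedded numbers are ordered by the strverscmp rules.  */
void
sort_by_version (char **array, size_t nstrings)
{
  qsort (array, nstrings, sizeof (char *), compare_versions);
}
```

Sorting the strings @code{"item#100"}, @code{"item#99"}, and
@code{"item#9"} this way places @code{"item#9"} first and
@code{"item#100"} last, whereas @code{strcmp} would place
@code{"item#100"} first.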
1506
28f540f4 1507@deftypefun int bcmp (const void *@var{a1}, const void *@var{a2}, size_t @var{size})
d08a7e4c 1508@standards{BSD, string.h}
11087373 1509@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4
RM
1510This is an obsolete alias for @code{memcmp}, derived from BSD.
1511@end deftypefun
1512
b4012b75 1513@node Collation Functions
28f540f4
RM
1514@section Collation Functions
1515
1516@cindex collating strings
1517@cindex string collation functions
1518
1519In some locales, the conventions for lexicographic ordering differ from
1520the strict numeric ordering of character codes. For example, in Spanish
1521most glyphs with diacritical marks such as accents are not considered
a5177499
BS
1522distinct letters for the purposes of collation. On the other hand, in
1523Czech the two-character sequence @samp{ch} is treated as a single letter
1524that is collated between @samp{h} and @samp{i}.
28f540f4
RM
1525
1526You can use the functions @code{strcoll} and @code{strxfrm} (declared in
8a2f1f5b
UD
the header file @file{string.h}) and @code{wcscoll} and @code{wcsxfrm}
(declared in the header file @file{wchar.h}) to compare strings using a
1529collation ordering appropriate for the current locale. The locale used
1530by these functions in particular can be specified by setting the locale
1531for the @code{LC_COLLATE} category; see @ref{Locales}.
28f540f4 1532@pindex string.h
8a2f1f5b 1533@pindex wchar.h
28f540f4
RM
1534
1535In the standard C locale, the collation sequence for @code{strcoll} is
8a2f1f5b
UD
1536the same as that for @code{strcmp}. Similarly, @code{wcscoll} and
1537@code{wcscmp} are the same in this situation.
28f540f4
RM
1538
1539Effectively, the way these functions work is by applying a mapping to
2cc4b9cc
PE
1540transform the characters in a multibyte string to a byte
1541sequence that represents
28f540f4
RM
1542the string's position in the collating sequence of the current locale.
1543Comparing two such byte sequences in a simple fashion is equivalent to
1544comparing the strings with the locale's collating sequence.
1545
8a2f1f5b
UD
1546The functions @code{strcoll} and @code{wcscoll} perform this translation
1547implicitly, in order to do one comparison. By contrast, @code{strxfrm}
1548and @code{wcsxfrm} perform the mapping explicitly. If you are making
1549multiple comparisons using the same string or set of strings, it is
1550likely to be more efficient to use @code{strxfrm} or @code{wcsxfrm} to
1551transform all the strings just once, and subsequently compare the
1552transformed strings with @code{strcmp} or @code{wcscmp}.
28f540f4 1553
28f540f4 1554@deftypefun int strcoll (const char *@var{s1}, const char *@var{s2})
d08a7e4c 1555@standards{ISO, string.h}
11087373
AO
1556@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
1557@c Calls strcoll_l with the current locale, which dereferences only the
1558@c LC_COLLATE data pointer.
28f540f4
RM
1559The @code{strcoll} function is similar to @code{strcmp} but uses the
1560collating sequence of the current locale for collation (the
2cc4b9cc 1561@code{LC_COLLATE} locale). The arguments are multibyte strings.
28f540f4
RM
1562@end deftypefun
1563
8a2f1f5b 1564@deftypefun int wcscoll (const wchar_t *@var{ws1}, const wchar_t *@var{ws2})
d08a7e4c 1565@standards{ISO, wchar.h}
11087373
AO
1566@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
1567@c Same as strcoll, but calling wcscoll_l.
8a2f1f5b
UD
1568The @code{wcscoll} function is similar to @code{wcscmp} but uses the
1569collating sequence of the current locale for collation (the
1570@code{LC_COLLATE} locale).
1571@end deftypefun
1572
28f540f4
RM
1573Here is an example of sorting an array of strings, using @code{strcoll}
1574to compare them. The actual sort algorithm is not written here; it
1575comes from @code{qsort} (@pxref{Array Sort Function}). The job of the
1576code shown here is to say how to compare the strings while sorting them.
1577(Later on in this section, we will show a way to do this more
1578efficiently using @code{strxfrm}.)
1579
1580@smallexample
1581/* @r{This is the comparison function used with @code{qsort}.} */
1582
1583int
e39745ff 1584compare_elements (const void *v1, const void *v2)
28f540f4 1585@{
e39745ff 1586 char * const *p1 = v1;
a9f5ce09 1587 char * const *p2 = v2;
e39745ff 1588
28f540f4
RM
1589 return strcoll (*p1, *p2);
1590@}
1591
1592/* @r{This is the entry point---the function to sort}
1593 @r{strings using the locale's collating sequence.} */
1594
1595void
1596sort_strings (char **array, int nstrings)
1597@{
  /* @r{Sort @code{array} by comparing the strings.} */
9fc19e48
UD
1599 qsort (array, nstrings,
1600 sizeof (char *), compare_elements);
28f540f4
RM
1601@}
1602@end smallexample
1603
1604@cindex converting string to collation order
8a2f1f5b 1605@deftypefun size_t strxfrm (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
d08a7e4c 1606@standards{ISO, string.h}
11087373 1607@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2cc4b9cc
PE
1608The function @code{strxfrm} transforms the multibyte string
1609@var{from} using the
8a2f1f5b 1610collation transformation determined by the locale currently selected for
28f540f4 1611collation, and stores the transformed string in the array @var{to}. Up
2cc4b9cc 1612to @var{size} bytes (including a terminating null byte) are
28f540f4
RM
1613stored.
1614
1615The behavior is undefined if the strings @var{to} and @var{from}
0a13c9e9 1616overlap; see @ref{Copying Strings and Arrays}.
28f540f4
RM
1617
1618The return value is the length of the entire transformed string. This
value is not affected by the value of @var{size}, but if it is greater
than or equal to @var{size}, it means that the transformed string did not
1621entirely fit in the array @var{to}. In this case, only as much of the
1622string as actually fits was stored. To get the whole transformed
1623string, call @code{strxfrm} again with a bigger output array.
28f540f4
RM
1624
1625The transformed string may be longer than the original string, and it
1626may also be shorter.
1627
2cc4b9cc
PE
1628If @var{size} is zero, no bytes are stored in @var{to}. In this
1629case, @code{strxfrm} simply returns the number of bytes that would
28f540f4 1630be the length of the transformed string. This is useful for determining
8a2f1f5b
UD
1631what size the allocated array should be. It does not matter what
1632@var{to} is if @var{size} is zero; @var{to} may even be a null pointer.
1633@end deftypefun
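
The zero-size query described above can be used to allocate a buffer of
exactly the right size before transforming. The following is a minimal
sketch under that assumption; the name @code{transformed_copy} is
hypothetical, and plain @code{malloc} is used so that the caller must
check for a null result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated buffer holding the
   transformed form of S, or a null pointer if allocation fails.
   The caller is responsible for freeing the result.  */
char *
transformed_copy (const char *s)
{
  /* With a size of zero nothing is stored, so TO may be null.  */
  size_t length = strxfrm (NULL, s, 0);

  /* One extra byte for the terminating null byte.  */
  char *to = malloc (length + 1);

  if (to != NULL)
    strxfrm (to, s, length + 1);
  return to;
}
```

Two buffers produced this way can be compared with @code{strcmp}, and
the result has the same sign as @code{strcoll} applied to the original
strings in the same locale.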
1634
@deftypefun size_t wcsxfrm (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
d08a7e4c 1636@standards{ISO, wchar.h}
11087373 1637@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2cc4b9cc 1638The function @code{wcsxfrm} transforms wide string @var{wfrom}
8a2f1f5b
UD
1639using the collation transformation determined by the locale currently
1640selected for collation, and stores the transformed string in the array
1641@var{wto}. Up to @var{size} wide characters (including a terminating null
2cc4b9cc 1642wide character) are stored.
8a2f1f5b
UD
1643
1644The behavior is undefined if the strings @var{wto} and @var{wfrom}
0a13c9e9 1645overlap; see @ref{Copying Strings and Arrays}.
8a2f1f5b 1646
2cc4b9cc 1647The return value is the length of the entire transformed wide
8a2f1f5b
UD
string. This value is not affected by the value of @var{size}, but if
it is greater than or equal to @var{size}, it means that the transformed
2cc4b9cc
PE
1650wide string did not entirely fit in the array @var{wto}. In
1651this case, only as much of the wide string as actually fits
1652was stored. To get the whole transformed wide string, call
8a2f1f5b
UD
1653@code{wcsxfrm} again with a bigger output array.
1654
2cc4b9cc
PE
1655The transformed wide string may be longer than the original
1656wide string, and it may also be shorter.
8a2f1f5b 1657
If @var{size} is zero, no wide characters are stored in @var{wto}. In this
case, @code{wcsxfrm} simply returns the number of wide characters that
would be the length of the transformed wide string. This is
useful for determining what size the allocated array should be (remember
to multiply by @code{sizeof (wchar_t)}). It does not matter what
@var{wto} is if @var{size} is zero; @var{wto} may even be a null pointer.
28f540f4
RM
1664@end deftypefun
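
For wide strings the same zero-size query applies, but the count
returned is in wide characters, so the allocation must be scaled by
@code{sizeof (wchar_t)}. A minimal sketch follows; the name
@code{transformed_wide_copy} is hypothetical.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: return a newly allocated buffer holding the
   transformed form of WS, or a null pointer if allocation fails.
   The caller is responsible for freeing the result.  */
wchar_t *
transformed_wide_copy (const wchar_t *ws)
{
  /* The result counts wide characters, not bytes.  */
  size_t length = wcsxfrm (NULL, ws, 0);

  /* One extra element for the terminating null wide character;
     multiply by the element size to get a byte count.  */
  wchar_t *wto = malloc ((length + 1) * sizeof (wchar_t));

  if (wto != NULL)
    wcsxfrm (wto, ws, length + 1);
  return wto;
}
```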
1665
Here is an example of how you can use @code{strxfrm} when
you plan to do many comparisons. It does the same thing as the
@code{strcoll}-based @code{sort_strings} example shown earlier in this
section, but much faster, because it has to transform each string only
once, no matter how many times it is compared with other strings. When
there are many strings, even the time needed to allocate and free
storage is much less than the time saved.
1672
1673@smallexample
1674struct sorter @{ char *input; char *transformed; @};
1675
1676/* @r{This is the comparison function used with @code{qsort}}
1677 @r{to sort an array of @code{struct sorter}.} */
1678
1679int
e39745ff 1680compare_elements (const void *v1, const void *v2)
28f540f4 1681@{
e39745ff
AJ
1682 const struct sorter *p1 = v1;
1683 const struct sorter *p2 = v2;
1684
28f540f4
RM
1685 return strcmp (p1->transformed, p2->transformed);
1686@}
1687
1688/* @r{This is the entry point---the function to sort}
1689 @r{strings using the locale's collating sequence.} */
1690
1691void
1692sort_strings_fast (char **array, int nstrings)
1693@{
1694 struct sorter temp_array[nstrings];
1695 int i;
1696
1697 /* @r{Set up @code{temp_array}. Each element contains}
1698 @r{one input string and its transformed string.} */
1699 for (i = 0; i < nstrings; i++)
1700 @{
1701 size_t length = strlen (array[i]) * 2;
a5113b14 1702 char *transformed;
f2ea0f5b 1703 size_t transformed_length;
28f540f4
RM
1704
1705 temp_array[i].input = array[i];
1706
a5113b14
UD
1707 /* @r{First try a buffer perhaps big enough.} */
1708 transformed = (char *) xmalloc (length);
1709
1710 /* @r{Transform @code{array[i]}.} */
1711 transformed_length = strxfrm (transformed, array[i], length);
1712
1713 /* @r{If the buffer was not large enough, resize it}
1714 @r{and try again.} */
1715 if (transformed_length >= length)
28f540f4 1716 @{
a5113b14 1717 /* @r{Allocate the needed space. +1 for terminating}
2cc4b9cc 1718 @r{@code{'\0'} byte.} */
bdc674d9
PE
1719 transformed = xrealloc (transformed,
1720 transformed_length + 1);
a5113b14
UD
1721
1722 /* @r{The return value is not interesting because we know}
1723 @r{how long the transformed string is.} */
dd7d45e8
UD
1724 (void) strxfrm (transformed, array[i],
1725 transformed_length + 1);
28f540f4 1726 @}
a5113b14
UD
1727
1728 temp_array[i].transformed = transformed;
28f540f4
RM
1729 @}
1730
1731 /* @r{Sort @code{temp_array} by comparing transformed strings.} */
89e691f2
AM
1732 qsort (temp_array, nstrings,
1733 sizeof (struct sorter), compare_elements);
28f540f4
RM
1734
1735 /* @r{Put the elements back in the permanent array}
1736 @r{in their sorted order.} */
1737 for (i = 0; i < nstrings; i++)
1738 array[i] = temp_array[i].input;
1739
1740 /* @r{Free the strings we allocated.} */
1741 for (i = 0; i < nstrings; i++)
1742 free (temp_array[i].transformed);
1743@}
1744@end smallexample
1745
8a2f1f5b
UD
1746The interesting part of this code for the wide character version would
1747look like this:
1748
1749@smallexample
1750void
1751sort_strings_fast (wchar_t **array, int nstrings)
1752@{
1753 @dots{}
1754 /* @r{Transform @code{array[i]}.} */
1755 transformed_length = wcsxfrm (transformed, array[i], length);
1756
1757 /* @r{If the buffer was not large enough, resize it}
1758 @r{and try again.} */
1759 if (transformed_length >= length)
1760 @{
1761 /* @r{Allocate the needed space. +1 for terminating}
2cc4b9cc 1762 @r{@code{L'\0'} wide character.} */
bdc674d9
PE
1763 transformed = xreallocarray (transformed,
1764 transformed_length + 1,
1765 sizeof *transformed);
8a2f1f5b
UD
1766
1767 /* @r{The return value is not interesting because we know}
1768 @r{how long the transformed string is.} */
1769 (void) wcsxfrm (transformed, array[i],
1770 transformed_length + 1);
1771 @}
1772 @dots{}
1773@end smallexample
1774
1775@noindent
Note that @code{xreallocarray} takes the element size, @code{sizeof
*transformed}, as a separate argument, so the buffer is sized in
@code{wchar_t} units rather than in bytes.
1778
1779@strong{Compatibility Note:} The string collation functions are a new
976780fd 1780feature of @w{ISO C90}. Older C dialects have no equivalent feature.
8a2f1f5b
UD
1781The wide character versions were introduced in @w{Amendment 1} to @w{ISO
1782C90}.
28f540f4 1783
b4012b75 1784@node Search Functions
28f540f4
RM
1785@section Search Functions
1786
1787This section describes library functions which perform various kinds
1788of searching operations on strings and arrays. These functions are
1789declared in the header file @file{string.h}.
1790@pindex string.h
1791@cindex search functions (for strings)
1792@cindex string search functions
1793
28f540f4 1794@deftypefun {void *} memchr (const void *@var{block}, int @var{c}, size_t @var{size})
d08a7e4c 1795@standards{ISO, string.h}
11087373 1796@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4
RM
1797This function finds the first occurrence of the byte @var{c} (converted
1798to an @code{unsigned char}) in the initial @var{size} bytes of the
1799object beginning at @var{block}. The return value is a pointer to the
1800located byte, or a null pointer if no match was found.
1801@end deftypefun
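
Unlike the string functions, @code{memchr} does not stop at a null
byte, which makes it suitable for buffers with embedded zeros. The
sketch below converts the returned pointer into an offset; the name
@code{byte_offset} and the use of @minus{}1 as the ``not found'' value
are choices made for this illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first byte equal
   to C within the SIZE bytes at BLOCK, or -1 if there is none.  */
ptrdiff_t
byte_offset (const void *block, int c, size_t size)
{
  const unsigned char *start = block;
  const unsigned char *found = memchr (block, c, size);

  return found != NULL ? found - start : -1;
}
```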
1802
8a2f1f5b 1803@deftypefun {wchar_t *} wmemchr (const wchar_t *@var{block}, wchar_t @var{wc}, size_t @var{size})
d08a7e4c 1804@standards{ISO, wchar.h}
11087373 1805@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b
UD
1806This function finds the first occurrence of the wide character @var{wc}
1807in the initial @var{size} wide characters of the object beginning at
1808@var{block}. The return value is a pointer to the located wide
1809character, or a null pointer if no match was found.
1810@end deftypefun
1811
87b56f36 1812@deftypefun {void *} rawmemchr (const void *@var{block}, int @var{c})
d08a7e4c 1813@standards{GNU, string.h}
11087373 1814@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
87b56f36
UD
1815Often the @code{memchr} function is used with the knowledge that the
1816byte @var{c} is available in the memory block specified by the
1817parameters. But this means that the @var{size} parameter is not really
1818needed and that the tests performed with it at runtime (to check whether
1819the end of the block is reached) are not needed.
1820
1821The @code{rawmemchr} function exists for just this situation which is
1822surprisingly frequent. The interface is similar to @code{memchr} except
1823that the @var{size} parameter is missing. The function will look beyond
1824the end of the block pointed to by @var{block} in case the programmer
6be569a4 1825made an error in assuming that the byte @var{c} is present in the block.
87b56f36
UD
1826In this case the result is unspecified. Otherwise the return value is a
1827pointer to the located byte.
1828
32c7acd4 1829When looking for the end of a string, use @code{strchr}.
87b56f36
UD
1830
1831This function is a GNU extension.
1832@end deftypefun
1833
ca747856 1834@deftypefun {void *} memrchr (const void *@var{block}, int @var{c}, size_t @var{size})
d08a7e4c 1835@standards{GNU, string.h}
11087373 1836@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
ca747856
RM
1837The function @code{memrchr} is like @code{memchr}, except that it searches
1838backwards from the end of the block defined by @var{block} and @var{size}
1839(instead of forwards from the front).
4efcb713
UD
1840
1841This function is a GNU extension.
a2d63612 1842@end deftypefun
ca747856 1843
28f540f4 1844@deftypefun {char *} strchr (const char *@var{string}, int @var{c})
d08a7e4c 1845@standards{ISO, string.h}
11087373 1846@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2cc4b9cc
PE
1847The @code{strchr} function finds the first occurrence of the byte
1848@var{c} (converted to a @code{char}) in the string
28f540f4 1849beginning at @var{string}. The return value is a pointer to the located
2cc4b9cc 1850byte, or a null pointer if no match was found.
28f540f4
RM
1851
1852For example,
1853@smallexample
1854strchr ("hello, world", 'l')
1855 @result{} "llo, world"
1856strchr ("hello, world", '?')
1857 @result{} NULL
a5113b14 1858@end smallexample
28f540f4 1859
The terminating null byte is considered to be part of the string,
so you can use this function to get a pointer to the end of a string by
specifying zero as the value of the @var{c} argument.
0520adde
FB
1863
1864When @code{strchr} returns a null pointer, it does not let you know
2cc4b9cc 1865the position of the terminating null byte it has found. If you
0520adde
FB
1866need that information, it is better (but less portable) to use
1867@code{strchrnul} than to search for it a second time.
8a2f1f5b
UD
1868@end deftypefun
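
A typical use of @code{strchr} is splitting a string at the first
occurrence of a delimiter. The following sketch assumes entries of the
form @samp{NAME=VALUE}; the name @code{value_part} is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the text following the
   first '=' in ENTRY, or a null pointer if ENTRY contains no '='.  */
const char *
value_part (const char *entry)
{
  const char *eq = strchr (entry, '=');

  return eq != NULL ? eq + 1 : NULL;
}
```

Only the first @samp{=} is significant, so for the entry
@code{"a=b=c"} the value part is @code{"b=c"}.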
1869
f801cf7b 1870@deftypefun {wchar_t *} wcschr (const wchar_t *@var{wstring}, wchar_t @var{wc})
d08a7e4c 1871@standards{ISO, wchar.h}
11087373 1872@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b 1873The @code{wcschr} function finds the first occurrence of the wide
2cc4b9cc 1874character @var{wc} in the wide string
8a2f1f5b
UD
1875beginning at @var{wstring}. The return value is a pointer to the
1876located wide character, or a null pointer if no match was found.
1877
2cc4b9cc
PE
1878The terminating null wide character is considered to be part of the wide
string, so you can use this function to get a pointer to the end
1880of a wide string by specifying a null wide character as the
8a2f1f5b
UD
1881value of the @var{wc} argument. It would be better (but less portable)
1882to use @code{wcschrnul} in this case, though.
28f540f4
RM
1883@end deftypefun
1884
0e4ee106 1885@deftypefun {char *} strchrnul (const char *@var{string}, int @var{c})
d08a7e4c 1886@standards{GNU, string.h}
11087373 1887@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
0e4ee106 1888@code{strchrnul} is the same as @code{strchr} except that if it does
2cc4b9cc
PE
not find the byte, it returns a pointer to the string's terminating
1890null byte rather than a null pointer.
8a2f1f5b
UD
1891
1892This function is a GNU extension.
1893@end deftypefun
1894
8a2f1f5b 1895@deftypefun {wchar_t *} wcschrnul (const wchar_t *@var{wstring}, wchar_t @var{wc})
d08a7e4c 1896@standards{GNU, wchar.h}
11087373 1897@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b 1898@code{wcschrnul} is the same as @code{wcschr} except that if it does not
2cc4b9cc 1899find the wide character, it returns a pointer to the wide string's
8a2f1f5b
UD
1900terminating null wide character rather than a null pointer.
1901
1902This function is a GNU extension.
28f540f4
RM
1903@end deftypefun
1904
ec28fc7c 1905One useful, but unusual, use of the @code{strchr}
2cc4b9cc 1906function is when one wants to have a pointer pointing to the null byte
ee2752ea
UD
1907terminating a string. This is often written in this way:
1908
1909@smallexample
1910 s += strlen (s);
1911@end smallexample
1912
1913@noindent
This is almost optimal, but the addition operation duplicates a bit of
the work already done in the @code{strlen} function. A better solution
is this:
1917
1918@smallexample
1919 s = strchr (s, '\0');
1920@end smallexample
1921
There is no restriction on the second parameter of @code{strchr}, so it
may well be zero. Readers thinking hard about this might point out that
the @code{strchr} function is more expensive than the @code{strlen}
function, since it has two termination criteria. That is true in
general, but in @theglibc{} the implementation of @code{strchr} is
optimized in a special way so that @code{strchr} actually is faster.
ee2752ea 1929
28f540f4 1930@deftypefun {char *} strrchr (const char *@var{string}, int @var{c})
d08a7e4c 1931@standards{ISO, string.h}
11087373 1932@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4
RM
1933The function @code{strrchr} is like @code{strchr}, except that it searches
1934backwards from the end of the string @var{string} (instead of forwards
1935from the front).
1936
1937For example,
1938@smallexample
1939strrchr ("hello, world", 'l')
1940 @result{} "ld"
1941@end smallexample
1942@end deftypefun
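
One common application of @code{strrchr} is extracting the last
component of a slash-separated path. The sketch below is a simplified
illustration, not a replacement for @code{basename}; the name
@code{last_component} is hypothetical, and trailing slashes are not
treated specially.

```c
#include <string.h>

/* Hypothetical helper: return a pointer to the part of PATH that
   follows the last '/', or PATH itself if it contains no '/'.  */
const char *
last_component (const char *path)
{
  const char *slash = strrchr (path, '/');

  return slash != NULL ? slash + 1 : path;
}
```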
1943
4315f45c 1944@deftypefun {wchar_t *} wcsrchr (const wchar_t *@var{wstring}, wchar_t @var{wc})
d08a7e4c 1945@standards{ISO, wchar.h}
11087373 1946@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b
UD
1947The function @code{wcsrchr} is like @code{wcschr}, except that it searches
1948backwards from the end of the string @var{wstring} (instead of forwards
1949from the front).
1950@end deftypefun
1951
28f540f4 1952@deftypefun {char *} strstr (const char *@var{haystack}, const char *@var{needle})
d08a7e4c 1953@standards{ISO, string.h}
11087373 1954@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4 1955This is like @code{strchr}, except that it searches @var{haystack} for a
2cc4b9cc 1956substring @var{needle} rather than just a single byte. It
28f540f4 1957returns a pointer into the string @var{haystack} that is the first
2cc4b9cc 1958byte of the substring, or a null pointer if no match was found. If
28f540f4
RM
1959@var{needle} is an empty string, the function returns @var{haystack}.
1960
1961For example,
1962@smallexample
1963strstr ("hello, world", "l")
1964 @result{} "llo, world"
1965strstr ("hello, world", "wo")
1966 @result{} "world"
1967@end smallexample
1968@end deftypefun
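
Because @code{strstr} returns a pointer into @var{haystack}, the search
can be resumed just past each match. The following sketch counts
non-overlapping occurrences this way; the name
@code{count_occurrences} is hypothetical, and returning zero for an
empty needle is a choice made for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the non-overlapping occurrences of
   NEEDLE in HAYSTACK.  */
size_t
count_occurrences (const char *haystack, const char *needle)
{
  size_t needle_len = strlen (needle);
  size_t count = 0;

  /* An empty needle matches at every position; report zero
     rather than loop forever.  */
  if (needle_len == 0)
    return 0;

  /* Resume each search immediately after the previous match.  */
  for (const char *p = strstr (haystack, needle); p != NULL;
       p = strstr (p + needle_len, needle))
    count++;

  return count;
}
```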
1969
8a2f1f5b 1970@deftypefun {wchar_t *} wcsstr (const wchar_t *@var{haystack}, const wchar_t *@var{needle})
d08a7e4c 1971@standards{ISO, wchar.h}
11087373 1972@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
8a2f1f5b
UD
1973This is like @code{wcschr}, except that it searches @var{haystack} for a
1974substring @var{needle} rather than just a single wide character. It
1975returns a pointer into the string @var{haystack} that is the first wide
1976character of the substring, or a null pointer if no match was found. If
1977@var{needle} is an empty string, the function returns @var{haystack}.
1978@end deftypefun
1979
8a2f1f5b 1980@deftypefun {wchar_t *} wcswcs (const wchar_t *@var{haystack}, const wchar_t *@var{needle})
d08a7e4c 1981@standards{XPG, wchar.h}
11087373 1982@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
9dcc8f11 1983@code{wcswcs} is a deprecated alias for @code{wcsstr}. This is the
8a2f1f5b
UD
1984name originally used in the X/Open Portability Guide before the
1985@w{Amendment 1} to @w{ISO C90} was published.
1986@end deftypefun
1987
28f540f4 1988
0e4ee106 1989@deftypefun {char *} strcasestr (const char *@var{haystack}, const char *@var{needle})
d08a7e4c 1990@standards{GNU, string.h}
11087373
AO
1991@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
1992@c There may be multiple calls of strncasecmp, each accessing the locale
1993@c object independently.
0e4ee106
UD
1994This is like @code{strstr}, except that it ignores case in searching for
1995the substring. Like @code{strcasecmp}, it is locale dependent how
1996uppercase and lowercase characters are related, and arguments are
1997multibyte strings.
1998
1999
2000For example,
2001@smallexample
d6868416 2002strcasestr ("hello, world", "L")
0e4ee106 2003 @result{} "llo, world"
d6868416 2004strcasestr ("hello, World", "wo")
2005 @result{} "World"
2006@end smallexample
2007@end deftypefun
2008
2009
63551311 2010@deftypefun {void *} memmem (const void *@var{haystack}, size_t @var{haystack-len},@*const void *@var{needle}, size_t @var{needle-len})
d08a7e4c 2011@standards{GNU, string.h}
11087373 2012@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4 2013This is like @code{strstr}, but @var{needle} and @var{haystack} are byte
2cc4b9cc 2014arrays rather than strings. @var{needle-len} is the
28f540f4 2015length of @var{needle} and @var{haystack-len} is the length of
0005e54f 2016@var{haystack}.
2017
2018This function is a GNU extension.
2019@end deftypefun
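
Since @code{memmem} takes explicit lengths, it can search data that contains null bytes, which @code{strstr} cannot. The sketch below assumes @theglibc{} with @code{_GNU_SOURCE} defined; @code{find_bytes} is a hypothetical wrapper that converts the returned pointer into an offset.

```c
#define _GNU_SOURCE             /* memmem is a GNU extension */
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of the PAT_LEN bytes at PAT
   within the BUF_LEN bytes at BUF, or (size_t) -1 if there is none.
   find_bytes is a hypothetical helper, not part of the library.  */
size_t
find_bytes (const void *buf, size_t buf_len,
            const void *pat, size_t pat_len)
{
  const unsigned char *hit = memmem (buf, buf_len, pat, pat_len);

  if (hit == NULL)
    return (size_t) -1;
  return (size_t) (hit - (const unsigned char *) buf);
}
```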
2020
28f540f4 2021@deftypefun size_t strspn (const char *@var{string}, const char *@var{skipset})
d08a7e4c 2022@standards{ISO, string.h}
11087373 2023@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4 2024The @code{strspn} (``string span'') function returns the length of the
2cc4b9cc 2025initial substring of @var{string} that consists entirely of bytes that
28f540f4 2026are members of the set specified by the string @var{skipset}. The order
2cc4b9cc 2027of the bytes in @var{skipset} is not important.
2028
2029For example,
2030@smallexample
2031strspn ("hello, world", "abcdefghijklmnopqrstuvwxyz")
2032 @result{} 5
2033@end smallexample
8a2f1f5b 2034
2035In a multibyte string, characters consisting of
2036more than one byte are not treated as single entities. Each byte is treated
2037separately. The function is not locale-dependent.
2038@end deftypefun
2039
8a2f1f5b 2040@deftypefun size_t wcsspn (const wchar_t *@var{wstring}, const wchar_t *@var{skipset})
d08a7e4c 2041@standards{ISO, wchar.h}
11087373 2042@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2043The @code{wcsspn} (``wide character string span'') function returns the
2044length of the initial substring of @var{wstring} that consists entirely
2045of wide characters that are members of the set specified by the string
2046@var{skipset}. The order of the wide characters in @var{skipset} is not
2047important.
2048@end deftypefun
2049
28f540f4 2050@deftypefun size_t strcspn (const char *@var{string}, const char *@var{stopset})
d08a7e4c 2051@standards{ISO, string.h}
11087373 2052@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4 2053The @code{strcspn} (``string complement span'') function returns the length
2cc4b9cc 2054of the initial substring of @var{string} that consists entirely of bytes
28f540f4 2055that are @emph{not} members of the set specified by the string @var{stopset}.
2cc4b9cc 2056(In other words, it returns the offset of the first byte in @var{string}
2057that is a member of the set @var{stopset}.)
2058
2059For example,
2060@smallexample
2061strcspn ("hello, world", " \t\n,.;!?")
2062 @result{} 5
2063@end smallexample
8a2f1f5b 2064
In a multibyte string, characters consisting of
more than one byte are not treated as single entities. Each byte is treated
separately. The function is not locale-dependent.
2068@end deftypefun
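
The two span functions are often used together: @code{strspn} skips a run of unwanted bytes and @code{strcspn} measures the run that follows. As an illustration only, the hypothetical helper @code{first_word} below locates the first blank-delimited word without modifying the string; the choice of space, tab, and newline as blanks is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Locate the first blank-delimited word in LINE.  Store its length in
   *LEN and return a pointer to its first byte.  The length is 0 if
   LINE holds only blanks.  first_word is a hypothetical helper.  */
const char *
first_word (const char *line, size_t *len)
{
  static const char blanks[] = " \t\n";

  line += strspn (line, blanks);    /* Skip leading blanks.  */
  *len = strcspn (line, blanks);    /* Measure up to the next blank.  */
  return line;
}
```

The word is reported as a pointer and a length because the input is left untouched, so no terminating null byte follows the word.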
2069
8a2f1f5b 2070@deftypefun size_t wcscspn (const wchar_t *@var{wstring}, const wchar_t *@var{stopset})
d08a7e4c 2071@standards{ISO, wchar.h}
11087373 2072@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{wcscspn} (``wide character string complement span'') function
returns the length of the initial substring of @var{wstring} that
consists entirely of wide characters that are @emph{not} members of the
set specified by the string @var{stopset}. (In other words, it returns
the offset of the first wide character in @var{wstring} that is a member of
the set @var{stopset}.)
2079@end deftypefun
2080
28f540f4 2081@deftypefun {char *} strpbrk (const char *@var{string}, const char *@var{stopset})
d08a7e4c 2082@standards{ISO, string.h}
11087373 2083@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4 2084The @code{strpbrk} (``string pointer break'') function is related to
2cc4b9cc 2085@code{strcspn}, except that it returns a pointer to the first byte
2086in @var{string} that is a member of the set @var{stopset} instead of the
2087length of the initial substring. It returns a null pointer if no such
2cc4b9cc 2088byte from @var{stopset} is found.
2089
2090@c @group Invalid outside the example.
2091For example,
2092
2093@smallexample
2094strpbrk ("hello, world", " \t\n,.;!?")
2095 @result{} ", world"
2096@end smallexample
2097@c @end group
8a2f1f5b 2098
2099In a multibyte string, characters consisting of
2100more than one byte are not treated as single entities. Each byte is treated
2101separately. The function is not locale-dependent.
2102@end deftypefun
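
Repeated calls to @code{strpbrk}, each starting one byte past the previous result, visit every member of @var{stopset} in turn. The hypothetical helper @code{count_members} below is a small sketch of that loop.

```c
#include <stddef.h>
#include <string.h>

/* Count the bytes of STRING that are members of STOPSET.
   count_members is a hypothetical helper, not part of the library.  */
size_t
count_members (const char *string, const char *stopset)
{
  size_t count = 0;

  /* A match is never the terminating null byte, so P + 1 is always
     a valid place to resume.  */
  for (const char *p = strpbrk (string, stopset);
       p != NULL;
       p = strpbrk (p + 1, stopset))
    count++;

  return count;
}
```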
2103
8a2f1f5b 2104@deftypefun {wchar_t *} wcspbrk (const wchar_t *@var{wstring}, const wchar_t *@var{stopset})
d08a7e4c 2105@standards{ISO, wchar.h}
11087373 2106@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2107The @code{wcspbrk} (``wide character string pointer break'') function is
2108related to @code{wcscspn}, except that it returns a pointer to the first
2109wide character in @var{wstring} that is a member of the set
2110@var{stopset} instead of the length of the initial substring. It
2cc4b9cc 2111returns a null pointer if no such wide character from @var{stopset} is found.
2112@end deftypefun
2113
2114
2115@subsection Compatibility String Search Functions
2116
0e4ee106 2117@deftypefun {char *} index (const char *@var{string}, int @var{c})
d08a7e4c 2118@standards{BSD, string.h}
11087373 2119@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2120@code{index} is another name for @code{strchr}; they are exactly the same.
2121New code should always use @code{strchr} since this name is defined in
2122@w{ISO C} while @code{index} is a BSD invention which never was available
2123on @w{System V} derived systems.
2124@end deftypefun
2125
0e4ee106 2126@deftypefun {char *} rindex (const char *@var{string}, int @var{c})
d08a7e4c 2127@standards{BSD, string.h}
11087373 2128@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2129@code{rindex} is another name for @code{strrchr}; they are exactly the same.
2130New code should always use @code{strrchr} since this name is defined in
2131@w{ISO C} while @code{rindex} is a BSD invention which never was available
2132on @w{System V} derived systems.
2133@end deftypefun
2134
b4012b75 2135@node Finding Tokens in a String
2136@section Finding Tokens in a String
2137
2138@cindex tokenizing strings
2139@cindex breaking a string into tokens
2140@cindex parsing tokens from a string
Programs often need to do simple kinds of lexical analysis and parsing,
such as splitting a command string up into tokens. You can do this with
the @code{strtok} function, declared in the header file @file{string.h}.
2145@pindex string.h
2146
8a2f1f5b 2147@deftypefun {char *} strtok (char *restrict @var{newstring}, const char *restrict @var{delimiters})
d08a7e4c 2148@standards{ISO, string.h}
11087373 2149@safety{@prelim{}@mtunsafe{@mtasurace{:strtok}}@asunsafe{}@acsafe{}}
2150A string can be split into tokens by making a series of calls to the
2151function @code{strtok}.
2152
2153The string to be split up is passed as the @var{newstring} argument on
2154the first call only. The @code{strtok} function uses this to set up
2155some internal state information. Subsequent calls to get additional
2156tokens from the same string are indicated by passing a null pointer as
2157the @var{newstring} argument. Calling @code{strtok} with another
2158non-null @var{newstring} argument reinitializes the state information.
2159It is guaranteed that no other library function ever calls @code{strtok}
2160behind your back (which would mess up this internal state information).
2161
2162The @var{delimiters} argument is a string that specifies a set of delimiters
2163that may surround the token being extracted. All the initial bytes
2164that are members of this set are discarded. The first byte that is
2165@emph{not} a member of this set of delimiters marks the beginning of the
2166next token. The end of the token is found by looking for the next
2167byte that is a member of the delimiter set. This byte in the
2168original string @var{newstring} is overwritten by a null byte, and the
2169pointer to the beginning of the token in @var{newstring} is returned.
2170
On the next call to @code{strtok}, the searching begins at the next
byte beyond the one that marked the end of the previous token.
Note that the set of delimiters @var{delimiters} does not have to be the
same on every call in a series of calls to @code{strtok}.

If the end of the string @var{newstring} is reached, or if the remainder of
the string consists only of delimiter bytes, @code{strtok} returns
a null pointer.
8a2f1f5b 2179
2180In a multibyte string, characters consisting of
2181more than one byte are not treated as single entities. Each byte is treated
2182separately. The function is not locale-dependent.
2183@end deftypefun
2184
1acd4371 2185@deftypefun {wchar_t *} wcstok (wchar_t *@var{newstring}, const wchar_t *@var{delimiters}, wchar_t **@var{save_ptr})
d08a7e4c 2186@standards{ISO, wchar.h}
11087373 2187@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2188A string can be split into tokens by making a series of calls to the
2189function @code{wcstok}.
2190
2191The string to be split up is passed as the @var{newstring} argument on
2192the first call only. The @code{wcstok} function uses this to set up
2193some internal state information. Subsequent calls to get additional
2cc4b9cc 2194tokens from the same wide string are indicated by passing a
2195null pointer as the @var{newstring} argument, which causes the pointer
2196previously stored in @var{save_ptr} to be used instead.
8a2f1f5b 2197
2cc4b9cc 2198The @var{delimiters} argument is a wide string that specifies
2199a set of delimiters that may surround the token being extracted. All
2200the initial wide characters that are members of this set are discarded.
2201The first wide character that is @emph{not} a member of this set of
2202delimiters marks the beginning of the next token. The end of the token
2203is found by looking for the next wide character that is a member of the
2cc4b9cc 2204delimiter set. This wide character in the original wide
2205string @var{newstring} is overwritten by a null wide character, the
2206pointer past the overwritten wide character is saved in @var{save_ptr},
2207and the pointer to the beginning of the token in @var{newstring} is
2208returned.
2209
On the next call to @code{wcstok}, the searching begins at the next
wide character beyond the one that marked the end of the previous token.
Note that the set of delimiters @var{delimiters} does not have to be the
same on every call in a series of calls to @code{wcstok}.

If the end of the wide string @var{newstring} is reached, or
if the remainder of the string consists only of delimiter wide characters,
@code{wcstok} returns a null pointer.
2218@end deftypefun
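
A typical @code{wcstok} loop passes the wide string on the first call and a null pointer afterwards, keeping all state in @var{save_ptr}. The following is a minimal sketch using the three-argument form described above; @code{count_wide_tokens} is a hypothetical helper, and the caller is assumed to own a writable buffer.

```c
#include <stddef.h>
#include <wchar.h>

/* Count the tokens in TEXT, which is modified in place.
   count_wide_tokens is a hypothetical helper, not part of the library.  */
size_t
count_wide_tokens (wchar_t *text, const wchar_t *delimiters)
{
  wchar_t *save_ptr;
  size_t count = 0;

  for (wchar_t *token = wcstok (text, delimiters, &save_ptr);
       token != NULL;
       token = wcstok (NULL, delimiters, &save_ptr))
    count++;

  return count;
}
```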
2219
@strong{Warning:} Since @code{strtok} and @code{wcstok} alter the string
they are parsing, you should always copy the string to a temporary buffer
before parsing it with @code{strtok}/@code{wcstok} (@pxref{Copying Strings
and Arrays}). If you allow @code{strtok} or @code{wcstok} to modify
a string that came from another part of your program, you are asking for
trouble; that string might be used for other purposes after
@code{strtok} or @code{wcstok} has modified it, and it would not have
the expected value.
2228
2229The string that you are operating on might even be a constant. Then
2230when @code{strtok} or @code{wcstok} tries to modify it, your program
2231will get a fatal signal for writing in read-only memory. @xref{Program
2232Error Signals}. Even if the operation of @code{strtok} or @code{wcstok}
2233would not require a modification of the string (e.g., if there is
1f77f049 2234exactly one token) the string can (and in the @glibcadj{} case will) be
8a2f1f5b 2235modified.
2236
2237This is a special case of a general principle: if a part of a program
2238does not have as its purpose the modification of a certain data
2239structure, then it is error-prone to modify the data structure
2240temporarily.
2241
1acd4371 2242The function @code{strtok} is not reentrant, whereas @code{wcstok} is.
2243@xref{Nonreentrancy}, for a discussion of where and why reentrancy is
2244important.
2245
2246Here is a simple example showing the use of @code{strtok}.
2247
2248@comment Yes, this example has been tested.
2249@smallexample
2250#include <string.h>
2251#include <stddef.h>
2252
2253@dots{}
2254
5649a1d6 2255const char string[] = "words separated by spaces -- and, punctuation!";
28f540f4 2256const char delimiters[] = " .,;:!-";
5649a1d6 2257char *token, *cp;
2258
2259@dots{}
2260
2261cp = strdupa (string); /* Make writable copy. */
2262token = strtok (cp, delimiters); /* token => "words" */
28f540f4
RM
2263token = strtok (NULL, delimiters); /* token => "separated" */
2264token = strtok (NULL, delimiters); /* token => "by" */
2265token = strtok (NULL, delimiters); /* token => "spaces" */
2266token = strtok (NULL, delimiters); /* token => "and" */
2267token = strtok (NULL, delimiters); /* token => "punctuation" */
2268token = strtok (NULL, delimiters); /* token => NULL */
2269@end smallexample
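
In practice the calls are usually written as a loop that runs until @code{strtok} returns a null pointer. The following is a minimal sketch; @code{split_tokens} is a hypothetical helper, and the caller is assumed to pass a writable buffer, since the string is modified in place.

```c
#include <stddef.h>
#include <string.h>

/* Split BUFFER in place and store up to MAX token pointers in TOKENS.
   Return the number of pointers stored.  split_tokens is a
   hypothetical helper, not part of the library.  */
size_t
split_tokens (char *buffer, const char *delimiters,
              char **tokens, size_t max)
{
  size_t n = 0;

  for (char *token = strtok (buffer, delimiters);
       token != NULL && n < max;
       token = strtok (NULL, delimiters))
    tokens[n++] = token;

  return n;
}
```

The stored pointers refer to bytes inside @var{buffer}, so they remain valid only as long as the buffer does.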
a5113b14 2270
@Theglibc{} contains two more functions for tokenizing a string
which overcome the limitation of non-reentrancy. They are not
available for wide strings.
a5113b14 2274
a5113b14 2275@deftypefun {char *} strtok_r (char *@var{newstring}, const char *@var{delimiters}, char **@var{save_ptr})
d08a7e4c 2276@standards{POSIX, string.h}
11087373 2277@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2278Just like @code{strtok}, this function splits the string into several
2279tokens which can be accessed by successive calls to @code{strtok_r}.
2280The difference is that, as in @code{wcstok}, the information about the
2281next token is stored in the space pointed to by the third argument,
2282@var{save_ptr}, which is a pointer to a string pointer. Calling
2283@code{strtok_r} with a null pointer for @var{newstring} and leaving
2284@var{save_ptr} between the calls unchanged does the job without
2285hindering reentrancy.
a5113b14 2286
976780fd 2287This function is defined in POSIX.1 and can be found on many systems
2288which support multi-threading.
2289@end deftypefun
2290
a5113b14 2291@deftypefun {char *} strsep (char **@var{string_ptr}, const char *@var{delimiter})
d08a7e4c 2292@standards{BSD, string.h}
11087373 2293@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function has functionality similar to that of @code{strtok_r}, with the
@var{newstring} argument replaced by the @var{save_ptr} argument. The
initialization of the moving pointer has to be done by the user.
Successive calls to @code{strsep} move the pointer along the tokens
separated by @var{delimiter}, returning the address of the next token
and updating @var{string_ptr} to point to the beginning of the next
token.

One difference between @code{strsep} and @code{strtok_r} is that if the
input string contains more than one byte from @var{delimiter} in a
row, @code{strsep} returns an empty string for each pair of adjacent bytes
from @var{delimiter}. This means that a program normally should test
for @code{strsep} returning an empty string before processing it.
9afc8a59 2307
2308This function was introduced in 4.3BSD and therefore is widely available.
2309@end deftypefun
2310
Here is how the above example looks when @code{strsep} is used.
2312
2313@comment Yes, this example has been tested.
2314@smallexample
2315#include <string.h>
2316#include <stddef.h>
2317
2318@dots{}
2319
5649a1d6 2320const char string[] = "words separated by spaces -- and, punctuation!";
2321const char delimiters[] = " .,;:!-";
2322char *running;
2323char *token;
2324
2325@dots{}
2326
5649a1d6 2327running = strdupa (string);
2328token = strsep (&running, delimiters); /* token => "words" */
2329token = strsep (&running, delimiters); /* token => "separated" */
2330token = strsep (&running, delimiters); /* token => "by" */
2331token = strsep (&running, delimiters); /* token => "spaces" */
2332token = strsep (&running, delimiters); /* token => "" */
2333token = strsep (&running, delimiters); /* token => "" */
2334token = strsep (&running, delimiters); /* token => "" */
a5113b14 2335token = strsep (&running, delimiters); /* token => "and" */
9afc8a59 2336token = strsep (&running, delimiters); /* token => "" */
a5113b14 2337token = strsep (&running, delimiters); /* token => "punctuation" */
9afc8a59 2338token = strsep (&running, delimiters); /* token => "" */
2339token = strsep (&running, delimiters); /* token => NULL */
2340@end smallexample
b4012b75 2341
ec28fc7c 2342@deftypefun {char *} basename (const char *@var{filename})
d08a7e4c 2343@standards{GNU, string.h}
11087373 2344@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The GNU version of the @code{basename} function returns the last
component of the path in @var{filename}. This version is preferred,
since it does not modify its argument, @var{filename}, and
respects trailing slashes. The prototype for @code{basename} can be
found in @file{string.h}. Note that this function is overridden by the XPG
version if @file{libgen.h} is included.
2351
2352Example of using GNU @code{basename}:
2353
@smallexample
#define _GNU_SOURCE  /* Selects the GNU version of basename.  */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main (int argc, char *argv[])
@{
  char *prog = basename (argv[0]);

  if (argc < 2)
    @{
      fprintf (stderr, "Usage: %s <arg>\n", prog);
      exit (1);
    @}

  @dots{}
@}
@end smallexample
2371
2372@strong{Portability Note:} This function may produce different results
2373on different systems.
2374
2375@end deftypefun
2376
af85ebcd 2377@deftypefun {char *} basename (char *@var{path})
d08a7e4c 2378@standards{XPG, libgen.h}
11087373 2379@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
cf822e3c 2380This is the standard XPG defined @code{basename}. It is similar in
ec28fc7c 2381spirit to the GNU version, but may modify the @var{path} by removing
2382trailing '/' bytes. If the @var{path} is made up entirely of '/'
2383bytes, then "/" will be returned. Also, if @var{path} is
ec28fc7c 2384@code{NULL} or an empty string, then "." is returned. The prototype for
e4a5f77d 2385the XPG version can be found in @file{libgen.h}.
2386
2387Example of using XPG @code{basename}:
2388
@smallexample
#define _GNU_SOURCE  /* For strdupa.  */
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main (int argc, char *argv[])
@{
  char *prog;
  char *path = strdupa (argv[0]);

  prog = basename (path);

  if (argc < 2)
    @{
      fprintf (stderr, "Usage: %s <arg>\n", prog);
      exit (1);
    @}

  @dots{}

@}
@end smallexample
2410@end deftypefun
2411
ec28fc7c 2412@deftypefun {char *} dirname (char *@var{path})
d08a7e4c 2413@standards{XPG, libgen.h}
11087373 2414@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{dirname} function is the complement to the XPG version of
@code{basename}. It returns the parent directory of the file specified
by @var{path}. If @var{path} is @code{NULL}, an empty string, or
contains no '/' bytes, then "." is returned. The prototype for this
function can be found in @file{libgen.h}.
2420@end deftypefun
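
Since the XPG @code{basename} and @code{dirname} may modify their argument and may return a pointer to static storage such as ".", a cautious caller works on private copies. The sketch below shows one way to do that; @code{split_path} is a hypothetical helper, and the fixed-size output buffers are an assumption of this illustration.

```c
#include <libgen.h>
#include <stddef.h>
#include <string.h>

/* Store the directory part of PATH in DIR and its last component in
   BASE, each a buffer of SIZE bytes.  Return 0 on success, or -1 if
   the buffers are too small.  split_path is a hypothetical helper.  */
int
split_path (const char *path, char *dir, char *base, size_t size)
{
  /* The result "." needs two bytes even when PATH is empty.  */
  if (size < 2 || strlen (path) >= size)
    return -1;

  /* Both functions may modify their argument, so each gets a copy.  */
  strcpy (dir, path);
  strcpy (base, path);

  /* A result may point into the copy or at static storage; move it
     to the start of the caller's buffer either way.  */
  char *d = dirname (dir);
  memmove (dir, d, strlen (d) + 1);
  char *b = basename (base);
  memmove (base, b, strlen (b) + 1);
  return 0;
}
```

@code{memmove} is used rather than @code{strcpy} because the returned pointer may overlap the destination buffer.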
0e4ee106 2421
2422@node Erasing Sensitive Data
2423@section Erasing Sensitive Data
2424
2425Sensitive data, such as cryptographic keys, should be erased from
2426memory after use, to reduce the risk that a bug will expose it to the
2427outside world. However, compiler optimizations may determine that an
2428erasure operation is ``unnecessary,'' and remove it from the generated
2429code, because no @emph{correct} program could access the variable or
2430heap object containing the sensitive data after it's deallocated.
2431Since erasure is a precaution against bugs, this optimization is
2432inappropriate.
2433
2434The function @code{explicit_bzero} erases a block of memory, and
2435guarantees that the compiler will not remove the erasure as
2436``unnecessary.''
2437
2438@smallexample
2439@group
2440#include <string.h>
2441
2442extern void encrypt (const char *key, const char *in,
2443 char *out, size_t n);
2444extern void genkey (const char *phrase, char *key);
2445
2446void encrypt_with_phrase (const char *phrase, const char *in,
2447 char *out, size_t n)
2448@{
2449 char key[16];
2450 genkey (phrase, key);
2451 encrypt (key, in, out, n);
2452 explicit_bzero (key, 16);
2453@}
2454@end group
2455@end smallexample
2456
2457@noindent
2458In this example, if @code{memset}, @code{bzero}, or a hand-written
2459loop had been used, the compiler might remove them as ``unnecessary.''
2460
2461@strong{Warning:} @code{explicit_bzero} does not guarantee that
2462sensitive data is @emph{completely} erased from the computer's memory.
2463There may be copies in temporary storage areas, such as registers and
2464``scratch'' stack space; since these are invisible to the source code,
2465a library function cannot erase them.
2466
2467Also, @code{explicit_bzero} only operates on RAM. If a sensitive data
2468object never needs to have its address taken other than to call
2469@code{explicit_bzero}, it might be stored entirely in CPU registers
2470@emph{until} the call to @code{explicit_bzero}. Then it will be
2471copied into RAM, the copy will be erased, and the original will remain
2472intact. Data in RAM is more likely to be exposed by a bug than data
2473in registers, so this creates a brief window where the data is at
2474greater risk of exposure than it would have been if the program didn't
2475try to erase it at all.
2476
2477Declaring sensitive variables as @code{volatile} will make both the
2478above problems @emph{worse}; a @code{volatile} variable will be stored
2479in memory for its entire lifetime, and the compiler will make
2480@emph{more} copies of it than it would otherwise have. Attempting to
2481erase a normal variable ``by hand'' through a
2482@code{volatile}-qualified pointer doesn't work at all---because the
2483variable itself is not @code{volatile}, some compilers will ignore the
2484qualification on the pointer and remove the erasure anyway.
2485
2486Having said all that, in most situations, using @code{explicit_bzero}
2487is better than not using it. At present, the only way to do a more
2488thorough job is to write the entire sensitive operation in assembly
2489language. We anticipate that future compilers will recognize calls to
2490@code{explicit_bzero} and take appropriate steps to erase all the
8394b8c4 2491copies of the affected data, wherever they may be.
ea1bd74d 2492
ea1bd74d 2493@deftypefun void explicit_bzero (void *@var{block}, size_t @var{len})
d08a7e4c 2494@standards{BSD, string.h}
2495@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2496
2497@code{explicit_bzero} writes zero into @var{len} bytes of memory
2498beginning at @var{block}, just as @code{bzero} would. The zeroes are
2499always written, even if the compiler could determine that this is
2500``unnecessary'' because no correct program could read them back.
2501
2502@strong{Note:} The @emph{only} optimization that @code{explicit_bzero}
2503disables is removal of ``unnecessary'' writes to memory. The compiler
2504can perform all the other optimizations that it could for a call to
2505@code{memset}. For instance, it may replace the function call with
2506inline memory writes, and it may assume that @var{block} cannot be a
2507null pointer.
2508
2509@strong{Portability Note:} This function first appeared in OpenBSD 5.5
2510and has not been standardized. Other systems may provide the same
2511functionality under a different name, such as @code{explicit_memset},
2512@code{memset_s}, or @code{SecureZeroMemory}.
2513
2514@Theglibc{} declares this function in @file{string.h}, but on other
2515systems it may be in @file{strings.h} instead.
2516@end deftypefun
2517
2518
2519@node Shuffling Bytes
2520@section Shuffling Bytes
2521
2522The function below addresses the perennial programming quandary: ``How do
2523I take good data in string form and painlessly turn it into garbage?''
2524This is not a difficult thing to code for oneself, but the authors of
2525@theglibc{} wish to make it as convenient as possible.
0e4ee106 2526
2527To @emph{erase} data, use @code{explicit_bzero} (@pxref{Erasing
2528Sensitive Data}); to obfuscate it reversibly, use @code{memfrob}
2529(@pxref{Obfuscating Data}).
0e4ee106 2530
ec28fc7c 2531@deftypefun {char *} strfry (char *@var{string})
d08a7e4c 2532@standards{GNU, string.h}
2533@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2534@c Calls initstate_r, time, getpid, strlen, and random_r.
0e4ee106 2535
2536@code{strfry} performs an in-place shuffle on @var{string}. Each
2537character is swapped to a position selected at random, within the
2538portion of the string starting with the character's original position.
2539(This is the Fisher-Yates algorithm for unbiased shuffling.)
2540
2541Calling @code{strfry} will not disturb any of the random number
2542generators that have global state (@pxref{Pseudo-Random Numbers}).
2543
2544The return value of @code{strfry} is always @var{string}.
2545
1f77f049 2546@strong{Portability Note:} This function is unique to @theglibc{}.
b10a0acc 2547It is declared in @file{string.h}.
2548@end deftypefun
2549
2550
2551@node Obfuscating Data
2552@section Obfuscating Data
2553@cindex Rot13
2554
2555The @code{memfrob} function reversibly obfuscates an array of binary
2556data. This is not true encryption; the obfuscated data still bears a
2557clear relationship to the original, and no secret key is required to
2558undo the obfuscation. It is analogous to the ``Rot13'' cipher used on
2559Usenet for obscuring offensive jokes, spoilers for works of fiction,
2560and so on, but it can be applied to arbitrary binary data.
0e4ee106 2561
2562Programs that need true encryption---a transformation that completely
2563obscures the original and cannot be reversed without knowledge of a
2564secret key---should use a dedicated cryptography library, such as
2565@uref{https://www.gnu.org/software/libgcrypt/,,libgcrypt}.
2566
2567Programs that need to @emph{destroy} data should use
2568@code{explicit_bzero} (@pxref{Erasing Sensitive Data}), or possibly
2569@code{strfry} (@pxref{Shuffling Bytes}).
0e4ee106 2570
0e4ee106 2571@deftypefun {void *} memfrob (void *@var{mem}, size_t @var{length})
d08a7e4c 2572@standards{GNU, string.h}
11087373 2573@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
0e4ee106 2574
2575The function @code{memfrob} obfuscates @var{length} bytes of data
2576beginning at @var{mem}, in place. Each byte is bitwise xor-ed with
2577the binary pattern 00101010 (hexadecimal 0x2A). The return value is
2578always @var{mem}.
0e4ee106 2579
Calling @code{memfrob} a second time on the same data returns it to
its original state.
0e4ee106 2582
1f77f049 2583@strong{Portability Note:} This function is unique to @theglibc{}.
b10a0acc 2584It is declared in @file{string.h}.
2585@end deftypefun
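
The transformation described for @code{memfrob} is simple enough to write by hand, which may help on systems that lack the function. The sketch below is an equivalent loop under that description; @code{xor_42} is a hypothetical name chosen for this illustration.

```c
#include <stddef.h>

/* XOR each of the LENGTH bytes at MEM with 0x2A and return MEM, as
   described for memfrob.  xor_42 is a hypothetical helper.  */
void *
xor_42 (void *mem, size_t length)
{
  unsigned char *p = mem;

  for (size_t i = 0; i < length; i++)
    p[i] ^= 0x2A;

  return mem;
}
```

Note that a byte equal to 0x2A (the character @samp{*} in ASCII) becomes a null byte, so obfuscated data should be handled with an explicit length rather than as a string.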
2586
2587@node Encode Binary Data
2588@section Encode Binary Data
2589
2590To store or transfer binary data in environments which only support text
2591one has to encode the binary data by mapping the input bytes to
2cc4b9cc 2592bytes in the range allowed for storing or transferring. SVID
2593systems (and nowadays XPG compliant systems) provide minimal support for
2594this task.
b4012b75 2595
b4012b75 2596@deftypefun {char *} l64a (long int @var{n})
d08a7e4c 2597@standards{XPG, stdlib.h}
11087373 2598@safety{@prelim{}@mtunsafe{@mtasurace{:l64a}}@asunsafe{}@acsafe{}}
2599This function encodes a 32-bit input value using bytes from the
2600basic character set. It returns a pointer to a 7 byte buffer which
2601contains an encoded version of @var{n}. To encode a series of bytes the
2602user must copy the returned string to a destination buffer. It returns
2603the empty string if @var{n} is zero, which is somewhat bizarre but
2604mandated by the standard.@*
2605@strong{Warning:} Since a static buffer is used this function should not
5649a1d6 2606be used in multi-threaded programs. There is no thread-safe alternative
2607to this function in the C library.@*
2608@strong{Compatibility Note:} The XPG standard states that the return
2609value of @code{l64a} is undefined if @var{n} is negative. In the GNU
2610implementation, @code{l64a} treats its argument as unsigned, so it will
2611return a sensible encoding for any nonzero @var{n}; however, portable
2612programs should not rely on this.
b4012b75 2613
2614To encode a large buffer @code{l64a} must be called in a loop, once for
2615each 32-bit word of the buffer. For example, one could do something
2616like this:
2617
2618@smallexample
2619char *
2620encode (const void *buf, size_t len)
2621@{
2622 /* @r{We know in advance how long the buffer has to be.} */
2623 unsigned char *in = (unsigned char *) buf;
2624 char *out = malloc (6 + ((len + 3) / 4) * 6 + 1);
290639c3 2625 char *cp = out, *p;
2626
2627 /* @r{Encode the length.} */
dd7d45e8 2628 /* @r{Using `htonl' is necessary so that the data can be}
2629 @r{decoded even on machines with different byte order.}
2630 @r{`l64a' can return a string shorter than 6 bytes, so }
2631 @r{we pad it with encoding of 0 (}'.'@r{) at the end by }
2632 @r{hand.} */
dd7d45e8 2633
2634 p = stpcpy (cp, l64a (htonl (len)));
2635 cp = mempcpy (p, "......", 6 - (p - cp));
2636
2637 while (len > 3)
2638 @{
2639 unsigned long int n = *in++;
2640 n = (n << 8) | *in++;
2641 n = (n << 8) | *in++;
2642 n = (n << 8) | *in++;
2643 len -= 4;
2644 p = stpcpy (cp, l64a (htonl (n)));
2645 cp = mempcpy (p, "......", 6 - (p - cp));
2646 @}
2647 if (len > 0)
2648 @{
2649 unsigned long int n = *in++;
2650 if (--len > 0)
2651 @{
2652 n = (n << 8) | *in++;
2653 if (--len > 0)
2654 n = (n << 8) | *in;
2655 @}
290639c3 2656 cp = stpcpy (cp, l64a (htonl (n)));
2657 @}
2658 *cp = '\0';
2659 return out;
2660@}
2661@end smallexample
2662
It is strange that the library does not provide the complete
functionality needed, but so be it.
2665
2666@end deftypefun
5649a1d6 2667
2668To decode data produced with @code{l64a} the following function should be
2669used.
2670
2671@deftypefun {long int} a64l (const char *@var{string})
d08a7e4c 2672@standards{XPG, stdlib.h}
11087373 2673@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The parameter @var{string} should contain a string which was produced by
a call to @code{l64a}.  The function processes at most 6 bytes of
this string, and decodes the bytes it finds according to the table
below.  It stops decoding when it finds a byte not in the table,
rather like @code{atoi}; if you have a buffer which has been broken into
lines, you must be careful to skip over the end-of-line bytes.
2680
2681The decoded number is returned as a @code{long int} value.
b4012b75 2682@end deftypefun
b13927da 2683
dd7d45e8 2684The @code{l64a} and @code{a64l} functions use a base 64 encoding, in
2cc4b9cc 2685which each byte of an encoded string represents six bits of an
2686input word. These symbols are used for the base 64 digits:
2687
2688@multitable {xxxxx} {xxx} {xxx} {xxx} {xxx} {xxx} {xxx} {xxx} {xxx}
2689@item @tab 0 @tab 1 @tab 2 @tab 3 @tab 4 @tab 5 @tab 6 @tab 7
2690@item 0 @tab @code{.} @tab @code{/} @tab @code{0} @tab @code{1}
2691 @tab @code{2} @tab @code{3} @tab @code{4} @tab @code{5}
2692@item 8 @tab @code{6} @tab @code{7} @tab @code{8} @tab @code{9}
2693 @tab @code{A} @tab @code{B} @tab @code{C} @tab @code{D}
2694@item 16 @tab @code{E} @tab @code{F} @tab @code{G} @tab @code{H}
2695 @tab @code{I} @tab @code{J} @tab @code{K} @tab @code{L}
2696@item 24 @tab @code{M} @tab @code{N} @tab @code{O} @tab @code{P}
2697 @tab @code{Q} @tab @code{R} @tab @code{S} @tab @code{T}
2698@item 32 @tab @code{U} @tab @code{V} @tab @code{W} @tab @code{X}
2699 @tab @code{Y} @tab @code{Z} @tab @code{a} @tab @code{b}
2700@item 40 @tab @code{c} @tab @code{d} @tab @code{e} @tab @code{f}
2701 @tab @code{g} @tab @code{h} @tab @code{i} @tab @code{j}
2702@item 48 @tab @code{k} @tab @code{l} @tab @code{m} @tab @code{n}
2703 @tab @code{o} @tab @code{p} @tab @code{q} @tab @code{r}
2704@item 56 @tab @code{s} @tab @code{t} @tab @code{u} @tab @code{v}
2705 @tab @code{w} @tab @code{x} @tab @code{y} @tab @code{z}
2706@end multitable
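
As an illustration of the table, the following sketch pads the output of
@code{l64a} to a fixed six digits, much as the buffer encoder earlier in
this section does.  The helper name @code{encode_word} is hypothetical
and not part of the library, and the sketch assumes values that fit in
32 bits.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: encode VALUE with l64a, copy the digits out of
   l64a's static buffer, and pad them to exactly six digits with '.'
   (the digit for zero).  OUT must have room for 7 bytes.  */
void
encode_word (long int value, char out[7])
{
  const char *digits = l64a (value);
  size_t n = strlen (digits);

  /* l64a only encodes 32 bits, so it produces at most six digits.  */
  assert (n <= 6);
  memcpy (out, digits, n);
  memset (out + n, '.', 6 - n);
  out[6] = '\0';
}
```

With this helper, @code{encode_word (64, buf)} should store
@code{"./...."} in @code{buf}: the first digit is the least significant
one, and passing the result to @code{a64l} gives back 64.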
2707
2708This encoding scheme is not standard. There are some other encoding
2709methods which are much more widely used (UU encoding, MIME encoding).
2710Generally, it is better to use one of these encodings.
2711
2712@node Argz and Envz Vectors
2713@section Argz and Envz Vectors
2714
5649a1d6 2715@cindex argz vectors (string vectors)
2716@cindex string vectors, null-byte separated
2717@cindex argument vectors, null-byte separated
@dfn{Argz vectors} are vectors of strings in a contiguous block of
memory, each element separated from its neighbors by null bytes
(@code{'\0'}).
2721
5649a1d6 2722@cindex envz vectors (environment vectors)
2cc4b9cc 2723@cindex environment vectors, null-byte separated
b13927da 2724@dfn{Envz vectors} are an extension of argz vectors where each element is a
2cc4b9cc 2725name-value pair, separated by a @code{'='} byte (as in a Unix
2726environment).
2727
2728@menu
2729* Argz Functions:: Operations on argz vectors.
2730* Envz Functions:: Additional operations on environment vectors.
2731@end menu
2732
2733@node Argz Functions, Envz Functions, , Argz and Envz Vectors
2734@subsection Argz Functions
2735
2736Each argz vector is represented by a pointer to the first element, of
2737type @code{char *}, and a size, of type @code{size_t}, both of which can
2738be initialized to @code{0} to represent an empty argz vector. All argz
2739functions accept either a pointer and a size argument, or pointers to
2740them, if they will be modified.
2741
2742The argz functions use @code{malloc}/@code{realloc} to allocate/grow
f0f308c1 2743argz vectors, and so any argz vector created using these functions may
2744be freed by using @code{free}; conversely, any argz function that may
2745grow a string expects that string to have been allocated using
2746@code{malloc} (those argz functions that only examine their arguments or
2747modify them in place will work on any sort of memory).
2748@xref{Unconstrained Allocation}.
2749
2750All argz functions that do memory allocation have a return type of
2751@code{error_t}, and return @code{0} for success, and @code{ENOMEM} if an
2752allocation error occurs.
2753
2754@pindex argz.h
2755These functions are declared in the standard include file @file{argz.h}.
2756
2757@deftypefun {error_t} argz_create (char *const @var{argv}[], char **@var{argz}, size_t *@var{argz_len})
d08a7e4c 2758@standards{GNU, argz.h}
11087373 2759@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
5649a1d6 2760The @code{argz_create} function converts the Unix-style argument vector
2761@var{argv} (a vector of pointers to normal C strings, terminated by
2762@code{(char *)0}; @pxref{Program Arguments}) into an argz vector with
2763the same elements, which is returned in @var{argz} and @var{argz_len}.
2764@end deftypefun
2765
2766@deftypefun {error_t} argz_create_sep (const char *@var{string}, int @var{sep}, char **@var{argz}, size_t *@var{argz_len})
d08a7e4c 2767@standards{GNU, argz.h}
11087373 2768@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2cc4b9cc 2769The @code{argz_create_sep} function converts the string
b13927da 2770@var{string} into an argz vector (returned in @var{argz} and
49c091e5 2771@var{argz_len}) by splitting it into elements at every occurrence of the
2cc4b9cc 2772byte @var{sep}.
2773@end deftypefun
2774
f0f308c1 2775@deftypefun {size_t} argz_count (const char *@var{argz}, size_t @var{argz_len})
d08a7e4c 2776@standards{GNU, argz.h}
11087373 2777@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2778Returns the number of elements in the argz vector @var{argz} and
2779@var{argz_len}.
2780@end deftypefun
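
As a minimal sketch of how @code{argz_create_sep} and @code{argz_count}
fit together, the following function counts the elements of a separated
string such as a search path.  The name @code{count_elements} is
hypothetical, and the @code{(size_t) -1} error value is just a
convention chosen for this example.

```c
#include <argz.h>
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: return the number of SEP-separated elements in
   STRING, or (size_t) -1 if memory could not be allocated.  */
size_t
count_elements (const char *string, int sep)
{
  char *argz = NULL;
  size_t argz_len = 0;
  size_t count;

  assert (string != NULL);
  if (argz_create_sep (string, sep, &argz, &argz_len) != 0)
    return (size_t) -1;

  count = argz_count (argz, argz_len);
  /* The vector was allocated with malloc, so free releases it.  */
  free (argz);
  return count;
}
```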
2781
8ded91fb 2782@deftypefun {void} argz_extract (const char *@var{argz}, size_t @var{argz_len}, char **@var{argv})
d08a7e4c 2783@standards{GNU, argz.h}
11087373 2784@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
b13927da 2785The @code{argz_extract} function converts the argz vector @var{argz} and
5649a1d6 2786@var{argz_len} into a Unix-style argument vector stored in @var{argv},
2787by putting pointers to every element in @var{argz} into successive
2788positions in @var{argv}, followed by a terminator of @code{0}.
2789@var{Argv} must be pre-allocated with enough space to hold all the
2790elements in @var{argz} plus the terminating @code{(char *)0}
2791(@code{(argz_count (@var{argz}, @var{argz_len}) + 1) * sizeof (char *)}
2792bytes should be enough). Note that the string pointers stored into
2793@var{argv} point into @var{argz}---they are not copies---and so
2794@var{argz} must be copied if it will be changed while @var{argv} is
2795still active. This function is useful for passing the elements in
2796@var{argz} to an exec function (@pxref{Executing a File}).
2797@end deftypefun
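
The sizing rule given above can be sketched as follows.  The helper name
@code{argz_to_argv} is hypothetical; the function only allocates the
pointer vector, so the returned pointers still refer to the bytes of
@var{argz} itself.

```c
#include <argz.h>
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate and fill a null-terminated pointer
   vector for ARGZ.  The pointers refer to ARGZ itself, so ARGZ must
   outlive the result.  Returns NULL on allocation failure; the caller
   releases the result with free.  */
char **
argz_to_argv (const char *argz, size_t argz_len)
{
  size_t count;
  char **argv;

  assert (argz != NULL || argz_len == 0);
  count = argz_count (argz, argz_len);
  /* One extra slot for the terminating null pointer.  */
  argv = malloc ((count + 1) * sizeof (char *));
  if (argv != NULL)
    argz_extract (argz, argz_len, argv);
  return argv;
}
```

Because nothing is copied, freeing or reallocating the argz vector
invalidates every pointer in the returned vector.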
2798
2799@deftypefun {void} argz_stringify (char *@var{argz}, size_t @var{len}, int @var{sep})
d08a7e4c 2800@standards{GNU, argz.h}
11087373 2801@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{argz_stringify} function converts @var{argz} into a normal
string with the elements separated by the byte @var{sep}, by replacing
each @code{'\0'} inside @var{argz} (except the last one, which
terminates the string) with @var{sep}.  This is handy for printing
@var{argz} in a readable manner.
2807@end deftypefun
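
Since @code{argz_stringify} modifies its argument in place, a program
that still needs the vector afterwards may prefer to work on a copy.
The following is one possible sketch; @code{argz_join} is a hypothetical
name, not a library function.

```c
#include <argz.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd string holding the elements
   of ARGZ joined by SEP, leaving ARGZ itself unchanged.  Returns NULL
   if ARGZ is empty or memory runs out.  */
char *
argz_join (const char *argz, size_t argz_len, int sep)
{
  char *copy;

  if (argz_len == 0)
    return NULL;
  /* A non-empty argz vector always ends with a null byte.  */
  assert (argz[argz_len - 1] == '\0');

  copy = malloc (argz_len);
  if (copy != NULL)
    {
      memcpy (copy, argz, argz_len);
      /* Overwrites every '\0' except the final one.  */
      argz_stringify (copy, argz_len, sep);
    }
  return copy;
}
```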
2808
2809@deftypefun {error_t} argz_add (char **@var{argz}, size_t *@var{argz_len}, const char *@var{str})
d08a7e4c 2810@standards{GNU, argz.h}
2811@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2812@c Calls strlen and argz_append.
2813The @code{argz_add} function adds the string @var{str} to the end of the
2814argz vector @code{*@var{argz}}, and updates @code{*@var{argz}} and
2815@code{*@var{argz_len}} accordingly.
2816@end deftypefun
2817
2818@deftypefun {error_t} argz_add_sep (char **@var{argz}, size_t *@var{argz_len}, const char *@var{str}, int @var{delim})
d08a7e4c 2819@standards{GNU, argz.h}
11087373 2820@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
b13927da 2821The @code{argz_add_sep} function is similar to @code{argz_add}, but
49c091e5 2822@var{str} is split into separate elements in the result at occurrences of
2cc4b9cc 2823the byte @var{delim}. This is useful, for instance, for
5649a1d6 2824adding the components of a Unix search path to an argz vector, by using
2825a value of @code{':'} for @var{delim}.
2826@end deftypefun
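
As a sketch of the search-path use mentioned above, the following
function builds an argz vector from one fixed element followed by the
components of a colon-separated path.  The name
@code{build_search_list} and its error handling are assumptions made for
this example.

```c
#include <argz.h>
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: store in *ARGZ and *ARGZ_LEN a vector holding
   FIRST followed by the ':'-separated components of PATH.  Returns 0
   on success or an error code such as ENOMEM; on failure the vector is
   reset to empty.  */
int
build_search_list (const char *first, const char *path,
                   char **argz, size_t *argz_len)
{
  int err;

  assert (argz != NULL && argz_len != NULL);
  /* A null pointer and a zero length represent the empty vector.  */
  *argz = NULL;
  *argz_len = 0;

  err = argz_add (argz, argz_len, first);
  if (err == 0)
    err = argz_add_sep (argz, argz_len, path, ':');
  if (err != 0)
    {
      free (*argz);
      *argz = NULL;
      *argz_len = 0;
    }
  return err;
}
```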
2827
2828@deftypefun {error_t} argz_append (char **@var{argz}, size_t *@var{argz_len}, const char *@var{buf}, size_t @var{buf_len})
d08a7e4c 2829@standards{GNU, argz.h}
11087373 2830@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2831The @code{argz_append} function appends @var{buf_len} bytes starting at
2832@var{buf} to the argz vector @code{*@var{argz}}, reallocating
2833@code{*@var{argz}} to accommodate it, and adding @var{buf_len} to
2834@code{*@var{argz_len}}.
2835@end deftypefun
2836
30aa5785 2837@deftypefun {void} argz_delete (char **@var{argz}, size_t *@var{argz_len}, char *@var{entry})
d08a7e4c 2838@standards{GNU, argz.h}
2839@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2840@c Calls free if no argument is left.
2841If @var{entry} points to the beginning of one of the elements in the
2842argz vector @code{*@var{argz}}, the @code{argz_delete} function will
2843remove this entry and reallocate @code{*@var{argz}}, modifying
2844@code{*@var{argz}} and @code{*@var{argz_len}} accordingly. Note that as
2845destructive argz functions usually reallocate their argz argument,
2846pointers into argz vectors such as @var{entry} will then become invalid.
2847@end deftypefun
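
Because @code{argz_delete} takes a pointer to the element rather than
its contents, deleting by value needs a search first.  The following
sketch uses @code{argz_next} for the search and stops right after the
deletion, since @var{entry} must not be used afterwards.  The name
@code{argz_remove_string} is hypothetical.

```c
#include <argz.h>
#include <assert.h>
#include <string.h>

/* Hypothetical helper: delete the first element of the vector that is
   equal to STR.  Returns 1 if an element was deleted, 0 otherwise.  */
int
argz_remove_string (char **argz, size_t *argz_len, const char *str)
{
  char *entry = NULL;

  assert (argz != NULL && argz_len != NULL && str != NULL);
  while ((entry = argz_next (*argz, *argz_len, entry)) != NULL)
    if (strcmp (entry, str) == 0)
      {
        argz_delete (argz, argz_len, entry);
        /* ENTRY is no longer valid, so do not continue iterating.  */
        return 1;
      }
  return 0;
}
```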
2848
2849@deftypefun {error_t} argz_insert (char **@var{argz}, size_t *@var{argz_len}, char *@var{before}, const char *@var{entry})
d08a7e4c 2850@standards{GNU, argz.h}
2851@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2852@c Calls argz_add or realloc and memmove.
2853The @code{argz_insert} function inserts the string @var{entry} into the
2854argz vector @code{*@var{argz}} at a point just before the existing
2855element pointed to by @var{before}, reallocating @code{*@var{argz}} and
2856updating @code{*@var{argz}} and @code{*@var{argz_len}}. If @var{before}
2857is @code{0}, @var{entry} is added to the end instead (as if by
2858@code{argz_add}). Since the first element is in fact the same as
2859@code{*@var{argz}}, passing in @code{*@var{argz}} as the value of
2860@var{before} will result in @var{entry} being inserted at the beginning.
2861@end deftypefun
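
The remark about inserting at the beginning can be sketched as a small
wrapper.  The name @code{argz_prepend} is hypothetical; the wrapper
relies on the behavior described above, where a @var{before} value of
@code{0} means ``append''.

```c
#include <argz.h>
#include <assert.h>

/* Hypothetical helper: insert ENTRY in front of all existing elements.
   Returns 0 on success or an error code such as ENOMEM.  */
int
argz_prepend (char **argz, size_t *argz_len, const char *entry)
{
  assert (argz != NULL && argz_len != NULL && entry != NULL);
  /* *ARGZ is the address of the first element.  For an empty vector it
     is a null pointer, in which case argz_insert appends, which gives
     the same result.  */
  return argz_insert (argz, argz_len, *argz, entry);
}
```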
2862
8ded91fb 2863@deftypefun {char *} argz_next (const char *@var{argz}, size_t @var{argz_len}, const char *@var{entry})
d08a7e4c 2864@standards{GNU, argz.h}
11087373 2865@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2866The @code{argz_next} function provides a convenient way of iterating
2867over the elements in the argz vector @var{argz}. It returns a pointer
2868to the next element in @var{argz} after the element @var{entry}, or
2869@code{0} if there are no elements following @var{entry}. If @var{entry}
2870is @code{0}, the first element of @var{argz} is returned.
2871
2872This behavior suggests two styles of iteration:
2873
2874@smallexample
2875 char *entry = 0;
2876 while ((entry = argz_next (@var{argz}, @var{argz_len}, entry)))
2877 @var{action};
2878@end smallexample
2879
(the double parentheses suppress a warning that some C compilers issue
for an assignment used as a @code{while} condition) and:
2882
2883@smallexample
2884 char *entry;
2885 for (entry = @var{argz};
2886 entry;
2887 entry = argz_next (@var{argz}, @var{argz_len}, entry))
2888 @var{action};
2889@end smallexample
2890
2891Note that the latter depends on @var{argz} having a value of @code{0} if
2892it is empty (rather than a pointer to an empty block of memory); this
2893invariant is maintained for argz vectors created by the functions here.
2894@end deftypefun
2895
d705269e 2896@deftypefun error_t argz_replace (@w{char **@var{argz}, size_t *@var{argz_len}}, @w{const char *@var{str}, const char *@var{with}}, @w{unsigned *@var{replace_count}})
d08a7e4c 2897@standards{GNU, argz.h}
11087373 2898@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
49c091e5 2899Replace any occurrences of the string @var{str} in @var{argz} with
2900@var{with}, reallocating @var{argz} as necessary. If
2901@var{replace_count} is non-zero, @code{*@var{replace_count}} will be
f0f308c1 2902incremented by the number of replacements performed.
2903@end deftypefun
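
Since @code{argz_replace} only adds to @code{*@var{replace_count}}, a
caller that wants the count for a single call has to clear the counter
first.  The following wrapper is a sketch of that; the name
@code{argz_substitute} is hypothetical, and the example assumes that
@var{str} is matched anywhere inside an element, not only against whole
elements.

```c
#include <argz.h>
#include <assert.h>

/* Hypothetical helper: replace STR by WITH in the vector and store the
   count reported by argz_replace for this call alone in *COUNT.
   Returns 0 on success or ENOMEM.  */
int
argz_substitute (char **argz, size_t *argz_len,
                 const char *str, const char *with, unsigned *count)
{
  assert (argz != NULL && argz_len != NULL && count != NULL);
  /* argz_replace increments the counter, so start from zero.  */
  *count = 0;
  return argz_replace (argz, argz_len, str, with, count);
}
```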
2904
2905@node Envz Functions, , Argz Functions, Argz and Envz Vectors
2906@subsection Envz Functions
2907
2908Envz vectors are just argz vectors with additional constraints on the form
2909of each element; as such, argz functions can also be used on them, where it
2910makes sense.
2911
2912Each element in an envz vector is a name-value pair, separated by a @code{'='}
2cc4b9cc 2913byte; if multiple @code{'='} bytes are present in an element, those
b13927da 2914after the first are considered part of the value, and treated like all other
2cc4b9cc 2915non-@code{'\0'} bytes.
b13927da 2916
If @emph{no} @code{'='} bytes are present in an element, that element is
considered the name of a ``null'' entry, as distinct from an entry with an
empty value: @code{envz_get} will return @code{0} if given the name of a null
entry, whereas an entry with an empty value would result in a value of
@code{""}; @code{envz_entry} will still find such entries, however.  Null
entries can be removed with the @code{envz_strip} function.
2923
2924As with argz functions, envz functions that may allocate memory (and thus
2925fail) have a return type of @code{error_t}, and return either @code{0} or
2926@code{ENOMEM}.
2927
2928@pindex envz.h
2929These functions are declared in the standard include file @file{envz.h}.
2930
2931@deftypefun {char *} envz_entry (const char *@var{envz}, size_t @var{envz_len}, const char *@var{name})
d08a7e4c 2932@standards{GNU, envz.h}
11087373 2933@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2934The @code{envz_entry} function finds the entry in @var{envz} with the name
2935@var{name}, and returns a pointer to the whole entry---that is, the argz
2cc4b9cc 2936element which begins with @var{name} followed by a @code{'='} byte. If
2937there is no entry with that name, @code{0} is returned.
2938@end deftypefun
2939
2940@deftypefun {char *} envz_get (const char *@var{envz}, size_t @var{envz_len}, const char *@var{name})
d08a7e4c 2941@standards{GNU, envz.h}
11087373 2942@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2943The @code{envz_get} function finds the entry in @var{envz} with the name
2944@var{name} (like @code{envz_entry}), and returns a pointer to the value
2945portion of that entry (following the @code{'='}). If there is no entry with
2946that name (or only a null entry), @code{0} is returned.
2947@end deftypefun
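
The difference between a missing entry, a null entry, and an entry with
a value (possibly empty) can be told apart by combining the two lookup
functions.  The following is a sketch only; the @code{entry_kind}
enumeration and the name @code{classify_entry} are hypothetical.

```c
#include <assert.h>
#include <envz.h>
#include <stddef.h>

/* Hypothetical classification of a lookup result.  */
enum entry_kind { ENTRY_MISSING, ENTRY_NULL, ENTRY_VALUE };

/* Hypothetical helper: report what kind of entry NAME has in ENVZ.  */
enum entry_kind
classify_entry (const char *envz, size_t envz_len, const char *name)
{
  assert (name != NULL);
  /* envz_get returns a pointer for any entry containing '=', even if
     the value after it is the empty string.  */
  if (envz_get (envz, envz_len, name) != NULL)
    return ENTRY_VALUE;
  /* envz_entry also finds null entries, which have no '='.  */
  if (envz_entry (envz, envz_len, name) != NULL)
    return ENTRY_NULL;
  return ENTRY_MISSING;
}
```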
2948
2949@deftypefun {error_t} envz_add (char **@var{envz}, size_t *@var{envz_len}, const char *@var{name}, const char *@var{value})
d08a7e4c 2950@standards{GNU, envz.h}
@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
@c Calls envz_remove, which calls envz_entry and argz_delete, and then
@c argz_add or equivalent code that reallocs and appends name=value.
2954The @code{envz_add} function adds an entry to @code{*@var{envz}}
2955(updating @code{*@var{envz}} and @code{*@var{envz_len}}) with the name
2956@var{name}, and value @var{value}. If an entry with the same name
2957already exists in @var{envz}, it is removed first. If @var{value} is
f0f308c1 2958@code{0}, then the new entry will be the special null type of entry
2959(mentioned above).
2960@end deftypefun
2961
2962@deftypefun {error_t} envz_merge (char **@var{envz}, size_t *@var{envz_len}, const char *@var{envz2}, size_t @var{envz2_len}, int @var{override})
d08a7e4c 2963@standards{GNU, envz.h}
11087373 2964@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2965The @code{envz_merge} function adds each entry in @var{envz2} to @var{envz},
2966as if with @code{envz_add}, updating @code{*@var{envz}} and
2967@code{*@var{envz_len}}. If @var{override} is true, then values in @var{envz2}
2968will supersede those with the same name in @var{envz}, otherwise not.
2969
2970Null entries are treated just like other entries in this respect, so a null
2971entry in @var{envz} can prevent an entry of the same name in @var{envz2} from
2972being added to @var{envz}, if @var{override} is false.
2973@end deftypefun
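
One use of a false @var{override} is to fill in defaults without
disturbing settings that are already present.  The following wrapper is
a sketch of that; the name @code{apply_defaults} is hypothetical.

```c
#include <assert.h>
#include <envz.h>
#include <stddef.h>

/* Hypothetical helper: add to the vector every entry of DEFAULTS whose
   name is not already present.  Returns 0 on success or ENOMEM.  */
int
apply_defaults (char **envz, size_t *envz_len,
                const char *defaults, size_t defaults_len)
{
  assert (envz != NULL && envz_len != NULL);
  /* OVERRIDE is 0, so entries already in *ENVZ take precedence.  */
  return envz_merge (envz, envz_len, defaults, defaults_len, 0);
}
```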
2974
2975@deftypefun {void} envz_strip (char **@var{envz}, size_t *@var{envz_len})
d08a7e4c 2976@standards{GNU, envz.h}
11087373 2977@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2978The @code{envz_strip} function removes any null entries from @var{envz},
2979updating @code{*@var{envz}} and @code{*@var{envz_len}}.
2980@end deftypefun
11087373 2981
920d7012 2982@deftypefun {void} envz_remove (char **@var{envz}, size_t *@var{envz_len}, const char *@var{name})
d08a7e4c 2983@standards{GNU, envz.h}
654055e0 2984@safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2985The @code{envz_remove} function removes an entry named @var{name} from
2986@var{envz}, updating @code{*@var{envz}} and @code{*@var{envz_len}}.
2987@end deftypefun
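
As a brief sketch of how the two removal functions combine, the
following function removes one named entry and then drops all null
entries.  The name @code{envz_unset_and_strip} is hypothetical.

```c
#include <argz.h>
#include <assert.h>
#include <envz.h>

/* Hypothetical helper: remove the entry called NAME, then remove every
   null entry.  Returns the number of entries that remain.  */
size_t
envz_unset_and_strip (char **envz, size_t *envz_len, const char *name)
{
  assert (envz != NULL && envz_len != NULL && name != NULL);
  envz_remove (envz, envz_len, name);
  envz_strip (envz, envz_len);
  /* An envz vector is also an argz vector, so argz_count applies.  */
  return argz_count (*envz, *envz_len);
}
```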
2988
@c FIXME these are undocumented:
@c strcasecmp_l @safety{@mtsafe{}@assafe{}@acsafe{}} see strcasecmp