This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[PATCH 06/37] Manual typos: Character Set Handling


2016-05-06  Rical Jasan  <ricaljasan@pacific.net>

	* manual/charset.texi: Fix typos in the manual.
---
 manual/charset.texi |   68 +++++++++++++++++++++++++--------------------------
 1 file changed, 34 insertions(+), 34 deletions(-)

diff --git a/manual/charset.texi b/manual/charset.texi
index 68aecd3..147d9c5 100644
--- a/manual/charset.texi
+++ b/manual/charset.texi
@@ -31,7 +31,7 @@ library to support multiple character sets.
 @node Extended Char Intro
 @section Introduction to Extended Characters
 
-A variety of solutions is available to overcome the differences between
+A variety of solutions are available to overcome the differences between
 character sets with a 1:1 relation between bytes and characters and
 character sets with ratios of 2:1 or 4:1.  The remainder of this
 section gives a few examples to help understand the design decisions
@@ -202,7 +202,7 @@ defined in @file{wchar.h}.
 @end deftypevr
 
 
-These internal representations present problems when it comes to storing
+These internal representations present problems when it comes to storage
 and transmittal.  Because each single wide character consists of more
 than one byte, they are affected by byte-ordering.  Thus, machines with
 different endianesses would see different values when accessing the same
@@ -389,7 +389,7 @@ the conversion is necessary take a look at the @code{iconv} functions
 @subsection Selecting the conversion and its properties
 
 We already said above that the currently selected locale for the
-@code{LC_CTYPE} category decides about the conversion that is performed
+@code{LC_CTYPE} category decides the conversion that is performed
 by the functions we are about to describe.  Each locale uses its own
 character set (given as an argument to @code{localedef}) and this is the
 one assumed as the external multibyte encoding.  The wide character
@@ -549,7 +549,7 @@ necessary output code (@pxref{Converting Strings}).  Please note that with
 @theglibc{} it is not necessary to perform this extra action for the
 conversion from multibyte text to wide character text since the wide
 character encoding is not stateful.  But there is nothing mentioned in
-any standard that prohibits making @code{wchar_t} using a stateful
+any standard that prohibits making @code{wchar_t} use a stateful
 encoding.
 
 @node Converting a Character
@@ -559,7 +559,7 @@ The most fundamental of the conversion functions are those dealing with
 single characters.  Please note that this does not always mean single
 bytes.  But since there is very often a subset of the multibyte
 character set that consists of single byte sequences, there are
-functions to help with converting bytes.  Frequently, ASCII is a subpart
+functions to help with converting bytes.  Frequently, ASCII is a subset
 of the multibyte character set.  In such a scenario, each ASCII character
 stands for itself, and all other characters have at least a first byte
 that is beyond the range @math{0} to @math{127}.
@@ -596,7 +596,7 @@ and is declared in @file{wchar.h}.
 Despite the limitation that the single byte value is always interpreted
 in the initial state, this function is actually useful most of the time.
 Most characters are either entirely single-byte character sets or they
-are extension to ASCII.  But then it is possible to write code like this
+are extensions to ASCII.  But then it is possible to write code like this
 (not that this specific example is very useful):
 
 @smallexample
@@ -643,7 +643,7 @@ value of this function is this character.  Otherwise the return value is
 is declared in @file{wchar.h}.
 @end deftypefun
 
-There are more general functions to convert single character from
+There are more general functions to convert single characters from
 multibyte representation to wide characters and vice versa.  These
 functions pose no limit on the length of the multibyte representation
 and they also do not require it to be in the initial state.
@@ -731,7 +731,7 @@ bytes is adjusted.
 
 The only non-obvious thing about @code{mbrtowc} might be the way memory
 is allocated for the result.  The above code uses the fact that there
-can never be more wide characters in the converted results than there are
+can never be more wide characters in the converted result than there are
 bytes in the multibyte input string.  This method yields a pessimistic
 guess about the size of the result, and if many wide character strings
 have to be constructed this way or if the strings are long, the extra
@@ -813,7 +813,7 @@ Therefore, the @code{mbrlen} function will never read invalid memory.
 
 Now that this function is available (just to make this clear, this
 function is @emph{not} part of @theglibc{}) we can compute the
-number of wide character required to store the converted multibyte
+number of wide characters required to store the converted multibyte
 character string @var{s} using
 
 @smallexample
@@ -879,7 +879,7 @@ multibyte'') converts a single wide character into a multibyte string
 corresponding to that wide character.
 
 If @var{s} is a null pointer, the function resets the state stored in
-the objects pointed to by @var{ps} (or the internal @code{mbstate_t}
+the object pointed to by @var{ps} (or the internal @code{mbstate_t}
 object) to the initial state.  This can also be achieved by a call like
 this:
 
@@ -1020,7 +1020,7 @@ extensions that can help in some important situations.
 @deftypefun size_t mbsrtowcs (wchar_t *restrict @var{dst}, const char **restrict @var{src}, size_t @var{len}, mbstate_t *restrict @var{ps})
 @safety{@prelim{}@mtunsafe{@mtasurace{:mbsrtowcs/!ps}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
 The @code{mbsrtowcs} function (``multibyte string restartable to wide
-character string'') converts a NUL-terminated multibyte character
+character string'') converts the NUL-terminated multibyte character
 string at @code{*@var{src}} into an equivalent wide character string,
 including the NUL wide character at the end.  The conversion is started
 using the state information from the object pointed to by @var{ps} or
@@ -1061,7 +1061,7 @@ declared in @file{wchar.h}.
 The definition of the @code{mbsrtowcs} function has one important
 limitation.  The requirement that @var{dst} has to be a NUL-terminated
 string provides problems if one wants to convert buffers with text.  A
-buffer is normally no collection of NUL-terminated strings but instead a
+buffer is not normally a collection of NUL-terminated strings but instead a
 continuous collection of lines, separated by newline characters.  Now
 assume that a function to convert one line from a buffer is needed.  Since
 the line is not NUL-terminated, the source pointer cannot directly point
@@ -1078,7 +1078,7 @@ guess.
 @cindex stateful
 There is still a problem with the method of NUL-terminating a line right
 after the newline character, which could lead to very strange results.
-As said in the description of the @code{mbsrtowcs} function above the
+As said in the description of the @code{mbsrtowcs} function above, the
 conversion state is guaranteed to be in the initial shift state after
 processing the NUL byte at the end of the input string.  But this NUL
 byte is not really part of the text (i.e., the conversion state after
@@ -1110,7 +1110,7 @@ multibyte string'') converts the NUL-terminated wide character string at
 stores the result in the array pointed to by @var{dst}.  The NUL wide
 character is also converted.  The conversion starts in the state
 described in the object pointed to by @var{ps} or by a state object
-locally to @code{wcsrtombs} in case @var{ps} is a null pointer.  If
+local to @code{wcsrtombs} in case @var{ps} is a null pointer.  If
 @var{dst} is a null pointer, the conversion is performed as usual but the
 result is not available.  If all characters of the input string were
 successfully converted and if @var{dst} is not a null pointer, the
@@ -1123,13 +1123,13 @@ variable @code{errno} to @code{EILSEQ}, and returns @code{(size_t) -1}.
 Another reason for a premature stop is if @var{dst} is not a null
 pointer and the next converted character would require more than
 @var{len} bytes in total to the array @var{dst}.  In this case (and if
-@var{dest} is not a null pointer) the pointer pointed to by @var{src} is
+@var{dst} is not a null pointer) the pointer pointed to by @var{src} is
 assigned a value pointing to the wide character right after the last one
 successfully converted.
 
 Except in the case of an encoding error the return value of the
 @code{wcsrtombs} function is the number of bytes in all the multibyte
-character sequences stored in @var{dst}.  Before returning the state in
+character sequences stored in @var{dst}.  Before returning, the state in
 the object pointed to by @var{ps} (or the internal object in case
 @var{ps} is a null pointer) is updated to reflect the state after the
 last conversion.  The state is the initial shift state in case the
@@ -1158,11 +1158,11 @@ This new parameter specifies how many bytes at most can be used from the
 multibyte character string.  In other words, the multibyte character
 string @code{*@var{src}} need not be NUL-terminated.  But if a NUL byte
 is found within the @var{nmc} first bytes of the string, the conversion
-stops here.
+stops there.
 
 This function is a GNU extension.  It is meant to work around the
 problems mentioned above.  Now it is possible to convert a buffer with
-multibyte character text piece for piece without having to care about
+multibyte character text piece by piece without having to care about
 inserting NUL bytes and the effect of NUL bytes on the conversion state.
 @end deftypefun
 
@@ -1603,7 +1603,7 @@ common that they operate on character sets that are not directly
 specified by the functions.  The multibyte encoding used is specified by
 the currently selected locale for the @code{LC_CTYPE} category.  The
 wide character set is fixed by the implementation (in the case of @theglibc{}
-it is always UCS-4 encoded @w{ISO 10646}.
+it is always UCS-4 encoded @w{ISO 10646}).
 
 This has of course several problems when it comes to general character
 conversion:
@@ -1681,7 +1681,7 @@ This data type is an abstract type defined in @file{iconv.h}.  The user
 must not assume anything about the definition of this type; it must be
 completely opaque.
 
-Objects of this type can get assigned handles for the conversions using
+Objects of this type can be assigned handles for the conversions using
 the @code{iconv} functions.  The objects themselves need not be freed, but
 the conversions for which the handles stand for have to.
 @end deftp
@@ -1716,7 +1716,7 @@ returns @code{(iconv_t) -1}.  In this case the global variable
 @item EMFILE
 The process already has @code{OPEN_MAX} file descriptors open.
 @item ENFILE
-The system limit of open file is reached.
+The system limit of open files is reached.
 @item ENOMEM
 Not enough memory to carry out the operation.
 @item EINVAL
@@ -1778,7 +1778,7 @@ the @code{iconv_open} function.
 
 If the function call was successful the return value is @math{0}.
 Otherwise it is @math{-1} and @code{errno} is set appropriately.
-Defined error are:
+Defined errors are:
 
 @table @code
 @item EBADF
@@ -1847,7 +1847,7 @@ stop is that the output buffer is full.  And the third reason is that
 the input contains invalid characters.
 
 In all of these cases the buffer pointers after the last successful
-conversion, for input and output buffer, are stored in @var{inbuf} and
+conversion, for the input and output buffers, are stored in @var{inbuf} and
 @var{outbuf}, and the available room in each buffer is stored in
 @var{inbytesleft} and @var{outbytesleft}.
 
@@ -2087,7 +2087,7 @@ possibilities.  This does not mean 200 different character sets are
 supported; for example, conversions from one character set to a set of 10
 others might count as 10 conversions.  Together with the other direction
 this makes 20 conversion possibilities used up by one character set.  One
-can imagine the thin coverage these platform provide.  Some Unix vendors
+can imagine the thin coverage these platforms provide.  Some Unix vendors
 even provide only a handful of conversions, which renders them useless for
 almost all uses.
 
@@ -2133,7 +2133,7 @@ will succeed, but how to find @math{@cal{B}}?
 
 Unfortunately, the answer is: there is no general solution.  On some
 systems guessing might help.  On those systems most character sets can
-convert to and from UTF-8 encoded @w{ISO 10646} or Unicode text.  Beside
+convert to and from UTF-8 encoded @w{ISO 10646} or Unicode text.  Besides
 this only some very system-specific methods can help.  Since the
 conversion functions come from loadable modules and these modules must
 be stored somewhere in the filesystem, one @emph{could} try to find them
@@ -2143,7 +2143,7 @@ and whether there is an indirect route from @math{@cal{A}} to
 
 This example shows one of the design errors of @code{iconv} mentioned
 above.  It should at least be possible to determine the list of available
-conversion programmatically so that if @code{iconv_open} says there is no
+conversions programmatically so that if @code{iconv_open} says there is no
 such conversion, one could make sure this also is true for indirect
 routes.
 
@@ -2235,7 +2235,7 @@ achieve the same result as when using the real character set name.
 
 This is quite important as a character set has often many different
 names.  There is normally an official name but this need not correspond to
-the most popular name.  Beside this many character sets have special
+the most popular name.  Besides this many character sets have special
 names that are somehow constructed.  For example, all character sets
 specified by the ISO have an alias of the form @code{ISO-IR-@var{nnn}}
 where @var{nnn} is the registration number.  This allows programs that
@@ -2371,7 +2371,7 @@ itself).
 @itemx const char *__modname
 @itemx int __counter
 All these elements of the structure are used internally in the C library
-to coordinate loading and unloading the shared.  One must not expect any
+to coordinate loading and unloading the shared object.  One must not expect any
 of the other elements to be available or initialized.
 
 @item const char *__from_name
@@ -2438,7 +2438,7 @@ These elements specify the output buffer for the conversion step.  The
 @code{__outbuf} element points to the beginning of the buffer, and
 @code{__outbufend} points to the byte following the last byte in the
 buffer.  The conversion function must not assume anything about the size
-of the buffer but it can be safely assumed the there is room for at
+of the buffer but it can be safely assumed there is room for at
 least one complete character in the output buffer.
 
 Once the conversion is finished, if the conversion is the last step, the
@@ -2673,7 +2673,7 @@ Next, a data structure, which contains the necessary information about
 which conversion is selected, is allocated.  The data structure
 @code{struct iso2022jp_data} is locally defined since, outside the
 module, this data is not used at all.  Please note that if all four
-conversions this modules supports are requested there are four data
+conversions this module supports are requested there are four data
 blocks.
 
 One interesting thing is the initialization of the @code{__min_} and
@@ -2686,7 +2686,7 @@ the conversion from @code{INTERNAL} to ISO-2022-JP we have to take into
 account that escape sequences might be necessary to switch the character
 sets.  Therefore the @code{__max_needed_to} element for this direction
 gets assigned @code{MAX_NEEDED_FROM + 2}.  This takes into account the
-two bytes needed for the escape sequences to single the switching.  The
+two bytes needed for the escape sequences to signal the switching.  The
 asymmetry in the maximum values for the two directions can be explained
 easily: when reading ISO-2022-JP text, escape sequences can be handled
 alone (i.e., it is not necessary to process a real character since the
@@ -2694,7 +2694,7 @@ effect of the escape sequence can be recorded in the state information).
 The situation is different for the other direction.  Since it is in
 general not known which character comes next, one cannot emit escape
 sequences to change the state in advance.  This means the escape
-sequences that have to be emitted together with the next character.
+sequences have to be emitted together with the next character.
 Therefore one needs more room than only for the character itself.
 
 The possible return values of the initialization function are:
@@ -2740,7 +2740,7 @@ conversion function.
 @comment gconv.h
 @comment GNU
 @deftypevr {Data type} int {(*__gconv_fct)} (struct __gconv_step *, struct __gconv_step_data *, const char **, const char *, size_t *, int)
-The conversion function can be called for two basic reason: to convert
+The conversion function can be called for two basic reasons: to convert
 text or to reset the state.  From the description of the @code{iconv}
 function it can be seen why the flushing mode is necessary.  What mode
 is selected is determined by the sixth argument, an integer.  This
@@ -2817,7 +2817,7 @@ therefore will look similar to this:
 But this is not yet all.  Once the function call returns the conversion
 function might have some more to do.  If the return value of the function
 is @code{__GCONV_EMPTY_INPUT}, more room is available in the output
-buffer.  Unless the input buffer is empty the conversion, functions start
+buffer.  Unless the input buffer is empty, the conversion functions start
 all over again and process the rest of the input buffer.  If the return
 value is not @code{__GCONV_EMPTY_INPUT}, something went wrong and we have
 to recover from this.
