Commit | Line | Data |
---|---|---|
1 | @node Introduction, Error Reporting, Top, Top | |
2 | @chapter Introduction | |
3 | @c %MENU% Purpose of the GNU C Library | |
4 | ||
5 | The C language provides no built-in facilities for performing such | |
6 | common operations as input/output, memory management, string | |
7 | manipulation, and the like. Instead, these facilities are defined | |
8 | in a standard @dfn{library}, which you compile and link with your | |
9 | programs. | |
10 | @cindex library | |
11 | ||
12 | @Theglibc{}, described in this document, defines all of the | |
13 | library functions that are specified by the @w{ISO C} standard, as well as | |
14 | additional features specific to POSIX and other derivatives of the Unix | |
15 | operating system, and extensions specific to @gnusystems{}. | |
16 | ||
17 | The purpose of this manual is to tell you how to use the facilities | |
18 | of @theglibc{}. We have mentioned which features belong to which | |
19 | standards to help you identify things that are potentially non-portable | |
20 | to other systems. But the emphasis in this manual is not on strict | |
21 | portability. | |
22 | ||
23 | @menu | |
24 | * Getting Started:: What this manual is for and how to use it. | |
25 | * Standards and Portability:: Standards and sources upon which the GNU | |
26 | C library is based. | |
27 | * Using the Library:: Some practical uses for the library. | |
28 | * Roadmap to the Manual:: Overview of the remaining chapters in | |
29 | this manual. | |
30 | @end menu | |
31 | ||
32 | @node Getting Started, Standards and Portability, , Introduction | |
33 | @section Getting Started | |
34 | ||
35 | This manual is written with the assumption that you are at least | |
36 | somewhat familiar with the C programming language and basic programming | |
37 | concepts. Specifically, familiarity with ISO standard C | |
38 | (@pxref{ISO C}), rather than ``traditional'' pre-ISO C dialects, is | |
39 | assumed. | |
40 | ||
41 | @Theglibc{} includes several @dfn{header files}, each of which | |
42 | provides definitions and declarations for a group of related facilities; | |
43 | this information is used by the C compiler when processing your program. | |
44 | For example, the header file @file{stdio.h} declares facilities for | |
45 | performing input and output, and the header file @file{string.h} | |
46 | declares string processing utilities. The organization of this manual | |
47 | generally follows the same division as the header files. | |
48 | ||
49 | If you are reading this manual for the first time, you should read all | |
50 | of the introductory material and skim the remaining chapters. There are | |
51 | a @emph{lot} of functions in @theglibc{} and it's not realistic to | |
52 | expect that you will be able to remember exactly @emph{how} to use each | |
53 | and every one of them. It's more important to become generally familiar | |
54 | with the kinds of facilities that the library provides, so that when you | |
55 | are writing your programs you can recognize @emph{when} to make use of | |
56 | library functions, and @emph{where} in this manual you can find more | |
57 | specific information about them. | |
58 | ||
59 | ||
60 | @node Standards and Portability, Using the Library, Getting Started, Introduction | |
61 | @section Standards and Portability | |
62 | @cindex standards | |
63 | ||
64 | This section discusses the various standards and other sources that @theglibc{} | |
65 | is based upon. These sources include the @w{ISO C} and | |
66 | POSIX standards, and the System V and Berkeley Unix implementations. | |
67 | ||
68 | The primary focus of this manual is to tell you how to make effective | |
69 | use of the @glibcadj{} facilities. But if you are concerned about | |
70 | making your programs compatible with these standards, or portable to | |
71 | operating systems other than GNU, this can affect how you use the | |
72 | library. This section gives you an overview of these standards, so that | |
73 | you will know what they are when they are mentioned in other parts of | |
74 | the manual. | |
75 | ||
76 | @xref{Library Summary}, for an alphabetical list of the functions and | |
77 | other symbols provided by the library. This list also states which | |
78 | standards each function or symbol comes from. | |
79 | ||
80 | @menu | |
81 | * ISO C:: The international standard for the C | |
82 | programming language. | |
83 | * POSIX:: The ISO/IEC 9945 (aka IEEE 1003) standards | |
84 | for operating systems. | |
85 | * Berkeley Unix:: BSD and SunOS. | |
86 | * SVID:: The System V Interface Description. | |
87 | * XPG:: The X/Open Portability Guide. | |
88 | * Linux Kernel:: The Linux kernel. | |
89 | @end menu | |
90 | ||
91 | @node ISO C, POSIX, , Standards and Portability | |
92 | @subsection ISO C | |
93 | @cindex ISO C | |
94 | ||
95 | @Theglibc{} is compatible with the C standard adopted by the | |
96 | American National Standards Institute (ANSI): | |
97 | @cite{American National Standard X3.159-1989---``ANSI C''} and later | |
98 | by the International Standardization Organization (ISO): | |
99 | @cite{ISO/IEC 9899:1990, ``Programming languages---C''}. | |
100 | This manual refers to the standard as @w{ISO C}, since the ISO standard | |
101 | is the more widely ratified of the two. | |
102 | The header files and library facilities that make up @theglibc{} are | |
103 | a superset of those specified by the @w{ISO C} standard. | |
104 | ||
105 | @pindex gcc | |
106 | If you are concerned about strict adherence to the @w{ISO C} standard, you | |
107 | should use the @samp{-ansi} option when you compile your programs with | |
108 | the GNU C compiler. This tells the compiler to define @emph{only} ISO | |
109 | standard features from the library header files, unless you explicitly | |
110 | ask for additional features. @xref{Feature Test Macros}, for | |
111 | information on how to do this. | |
112 | ||
113 | Being able to restrict the library to include only @w{ISO C} features is | |
114 | important because @w{ISO C} puts limitations on what names can be defined | |
115 | by the library implementation, and the GNU extensions don't fit these | |
116 | limitations. @xref{Reserved Names}, for more information about these | |
117 | restrictions. | |
118 | ||
119 | This manual does not attempt to give you complete details on the | |
120 | differences between @w{ISO C} and older dialects. It gives advice on how | |
121 | to write programs to work portably under multiple C dialects, but does | |
122 | not aim for completeness. | |
123 | ||
124 | ||
125 | @node POSIX, Berkeley Unix, ISO C, Standards and Portability | |
126 | @subsection POSIX (The Portable Operating System Interface) | |
127 | @cindex POSIX | |
128 | @cindex POSIX.1 | |
129 | @cindex IEEE Std 1003.1 | |
130 | @cindex ISO/IEC 9945-1 | |
131 | @cindex POSIX.2 | |
132 | @cindex IEEE Std 1003.2 | |
133 | @cindex ISO/IEC 9945-2 | |
134 | ||
135 | @Theglibc{} is also compatible with the ISO @dfn{POSIX} family of | |
136 | standards, known more formally as the @dfn{Portable Operating System | |
137 | Interface for Computer Environments} (ISO/IEC 9945). They were also | |
138 | published as ANSI/IEEE Std 1003. POSIX is derived mostly from various | |
139 | versions of the Unix operating system. | |
140 | ||
141 | The library facilities specified by the POSIX standards are a superset | |
142 | of those required by @w{ISO C}; POSIX specifies additional features for | |
143 | @w{ISO C} functions, as well as specifying new additional functions. In | |
144 | general, the additional requirements and functionality defined by the | |
145 | POSIX standards are aimed at providing lower-level support for a | |
146 | particular kind of operating system environment, rather than general | |
147 | programming language support which can run in many diverse operating | |
148 | system environments. | |
149 | ||
150 | @Theglibc{} implements all of the functions specified in | |
151 | @cite{ISO/IEC 9945-1:1996, the POSIX System Application Program | |
152 | Interface}, commonly referred to as POSIX.1. The primary extensions to | |
153 | the @w{ISO C} facilities specified by this standard include file system | |
154 | interface primitives (@pxref{File System Interface}), device-specific | |
155 | terminal control functions (@pxref{Low-Level Terminal Interface}), and | |
156 | process control functions (@pxref{Processes}). | |
157 | ||
158 | Some facilities from @cite{ISO/IEC 9945-2:1993, the POSIX Shell and | |
159 | Utilities standard} (POSIX.2) are also implemented in @theglibc{}. | |
160 | These include utilities for dealing with regular expressions and other | |
161 | pattern matching facilities (@pxref{Pattern Matching}). | |
162 | ||
163 | @menu | |
164 | * POSIX Safety Concepts:: Safety concepts from POSIX. | |
165 | * Unsafe Features:: Features that make functions unsafe. | |
166 | * Conditionally Safe Features:: Features that make functions unsafe | |
167 | in the absence of workarounds. | |
168 | * Other Safety Remarks:: Additional safety features and remarks. | |
169 | @end menu | |
170 | ||
171 | @comment Roland sez: | |
172 | @comment The GNU C library as it stands conforms to 1003.2 draft 11, which | |
173 | @comment specifies: | |
174 | @comment | |
175 | @comment Several new macros in <limits.h>. | |
176 | @comment popen, pclose | |
177 | @comment <regex.h> (which is not yet fully implemented--wait on this) | |
178 | @comment fnmatch | |
179 | @comment getopt | |
180 | @comment <glob.h> | |
181 | @comment <wordexp.h> (not yet implemented) | |
182 | @comment confstr | |
183 | ||
184 | @node POSIX Safety Concepts, Unsafe Features, , POSIX | |
185 | @subsubsection POSIX Safety Concepts | |
186 | @cindex POSIX Safety Concepts | |
187 | ||
188 | This manual documents various safety properties of @glibcadj{} | |
189 | functions, in lines that follow their prototypes and look like: | |
190 | ||
191 | @sampsafety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
192 | ||
193 | The properties are assessed according to the criteria set forth in the | |
194 | POSIX standard for such safety contexts as Thread-, Async-Signal- and | |
195 | Async-Cancel-Safety. Intuitive definitions of these properties, | |
196 | attempting to capture the meaning of the standard definitions, follow. | |
197 | ||
198 | @itemize @bullet | |
199 | ||
200 | @item | |
201 | @cindex MT-Safe | |
202 | @cindex Thread-Safe | |
203 | @code{MT-Safe} or Thread-Safe functions are safe to call in the presence | |
204 | of other threads. MT, in MT-Safe, stands for Multi Thread. | |
205 | ||
206 | Being MT-Safe does not imply a function is atomic, nor that it uses any | |
207 | of the memory synchronization mechanisms POSIX exposes to users. It is | |
208 | even possible that calling MT-Safe functions in sequence does not yield | |
209 | an MT-Safe combination. For example, having a thread call two MT-Safe | |
210 | functions one right after the other does not guarantee behavior | |
211 | equivalent to atomic execution of a combination of both functions, since | |
212 | concurrent calls in other threads may interfere in a destructive way. | |
213 | ||
214 | Whole-program optimizations that could inline functions across library | |
215 | interfaces may expose unsafe reordering, and so performing inlining | |
216 | across the @glibcadj{} interface is not recommended. The documented | |
217 | MT-Safety status is not guaranteed under whole-program optimization. | |
218 | However, functions defined in user-visible headers are designed to be | |
219 | safe for inlining. | |
220 | ||
221 | ||
222 | @item | |
223 | @cindex AS-Safe | |
224 | @cindex Async-Signal-Safe | |
225 | @code{AS-Safe} or Async-Signal-Safe functions are safe to call from | |
226 | asynchronous signal handlers. AS, in AS-Safe, stands for Asynchronous | |
227 | Signal. | |
228 | ||
229 | Many functions that are AS-Safe may set @code{errno}, or modify the | |
230 | floating-point environment, because their doing so does not make them | |
231 | unsuitable for use in signal handlers. However, programs could | |
232 | misbehave should asynchronous signal handlers modify this thread-local | |
233 | state, and the signal handling machinery cannot be counted on to | |
234 | preserve it. Therefore, signal handlers that call functions that may | |
235 | set @code{errno} or modify the floating-point environment @emph{must} | |
236 | save their original values, and restore them before returning. | |
237 | ||
238 | ||
239 | @item | |
240 | @cindex AC-Safe | |
241 | @cindex Async-Cancel-Safe | |
242 | @code{AC-Safe} or Async-Cancel-Safe functions are safe to call when | |
243 | asynchronous cancellation is enabled. AC in AC-Safe stands for | |
244 | Asynchronous Cancellation. | |
245 | ||
246 | The POSIX standard defines only three functions to be AC-Safe, namely | |
247 | @code{pthread_cancel}, @code{pthread_setcancelstate}, and | |
248 | @code{pthread_setcanceltype}. At present @theglibc{} provides no | |
249 | guarantees beyond these three functions, but does document which | |
250 | functions are presently AC-Safe. This documentation is provided for use | |
251 | by @theglibc{} developers. | |
252 | ||
253 | Just like signal handlers, cancellation cleanup routines must configure | |
254 | the floating point environment they require. The routines cannot assume | |
255 | a floating point environment, particularly when asynchronous | |
256 | cancellation is enabled. If the configuration of the floating point | |
257 | environment cannot be performed atomically then it is also possible that | |
258 | the environment encountered is internally inconsistent. | |
259 | ||
260 | ||
261 | @item | |
262 | @cindex MT-Unsafe | |
263 | @cindex Thread-Unsafe | |
264 | @cindex AS-Unsafe | |
265 | @cindex Async-Signal-Unsafe | |
266 | @cindex AC-Unsafe | |
267 | @cindex Async-Cancel-Unsafe | |
268 | @code{MT-Unsafe}, @code{AS-Unsafe}, @code{AC-Unsafe} functions are not | |
269 | safe to call within the safety contexts described above. Calling them | |
270 | within such contexts invokes undefined behavior. | |
271 | ||
272 | Functions not explicitly documented as safe in a safety context should | |
273 | be regarded as Unsafe. | |
274 | ||
275 | ||
276 | @item | |
277 | @cindex Preliminary | |
278 | @code{Preliminary} safety properties are documented, indicating these | |
279 | properties may @emph{not} be counted on in future releases of | |
280 | @theglibc{}. | |
281 | ||
282 | Such preliminary properties are the result of an assessment of the | |
283 | properties of our current implementation, rather than of what is | |
284 | mandated and permitted by current and future standards. | |
285 | ||
286 | Although we strive to abide by the standards, in some cases our | |
287 | implementation is safe even when the standard does not demand safety, | |
288 | and in other cases our implementation does not meet the standard safety | |
289 | requirements. The latter are most likely bugs; the former, when marked | |
290 | as @code{Preliminary}, should not be counted on: future standards may | |
291 | require changes that are not compatible with the additional safety | |
292 | properties afforded by the current implementation. | |
293 | ||
294 | Furthermore, the POSIX standard does not offer a detailed definition of | |
295 | safety. We assume that, by ``safe to call'', POSIX means that, as long | |
296 | as the program does not invoke undefined behavior, the ``safe to call'' | |
297 | function behaves as specified, and does not cause other functions to | |
298 | deviate from their specified behavior. We have chosen to use its loose | |
299 | definitions of safety, not because they are the best definitions to use, | |
300 | but because choosing them harmonizes this manual with POSIX. | |
301 | ||
302 | Please keep in mind that these are preliminary definitions and | |
303 | annotations, and certain aspects of the definitions are still under | |
304 | discussion and might be subject to clarification or change. | |
305 | ||
306 | Over time, we envision evolving the preliminary safety notes into stable | |
307 | commitments, as stable as those of our interfaces. As we do, we will | |
308 | remove the @code{Preliminary} keyword from safety notes. As long as the | |
309 | keyword remains, however, they are not to be regarded as a promise of | |
310 | future behavior. | |
311 | ||
312 | ||
313 | @end itemize | |
314 | ||
315 | Other keywords that appear in safety notes are defined in subsequent | |
316 | sections. | |
317 | ||
318 | ||
319 | @node Unsafe Features, Conditionally Safe Features, POSIX Safety Concepts, POSIX | |
320 | @subsubsection Unsafe Features | |
321 | @cindex Unsafe Features | |
322 | ||
323 | Functions that are unsafe to call in certain contexts are annotated with | |
324 | keywords that document the features that make them unsafe to call. | |
325 | AS-Unsafe features in this section indicate the functions are never safe | |
326 | to call when asynchronous signals are enabled. AC-Unsafe features | |
327 | indicate they are never safe to call when asynchronous cancellation is | |
328 | enabled. There are no MT-Unsafe marks in this section. | |
329 | ||
330 | @itemize @bullet | |
331 | ||
332 | @item @code{lock} | |
333 | @cindex lock | |
334 | ||
335 | Functions marked with @code{lock} as an AS-Unsafe feature may be | |
336 | interrupted by a signal while holding a non-recursive lock. If the | |
337 | signal handler calls another such function that takes the same lock, the | |
338 | result is a deadlock. | |
339 | ||
340 | Functions annotated with @code{lock} as an AC-Unsafe feature may, if | |
341 | cancelled asynchronously, fail to release a lock that would have been | |
342 | released if their execution had not been interrupted by asynchronous | |
343 | thread cancellation. Once a lock is left taken, attempts to take that | |
344 | lock will block indefinitely. | |
345 | ||
346 | ||
347 | @item @code{corrupt} | |
348 | @cindex corrupt | |
349 | ||
350 | Functions marked with @code{corrupt} as an AS-Unsafe feature may corrupt | |
351 | data structures and misbehave when they interrupt, or are interrupted | |
352 | by, another such function. Unlike functions marked with @code{lock}, | |
353 | these take recursive locks to avoid MT-Safety problems, but this is not | |
354 | enough to stop a signal handler from observing a partially-updated data | |
355 | structure. Further corruption may arise from the interrupted function's | |
356 | failure to notice updates made by signal handlers. | |
357 | ||
358 | Functions marked with @code{corrupt} as an AC-Unsafe feature may leave | |
359 | data structures in a corrupt, partially updated state. Subsequent uses | |
360 | of the data structure may misbehave. | |
361 | ||
362 | @c A special case, probably not worth documenting separately, involves | |
363 | @c reallocing, or even freeing pointers. Any case involving free could | |
364 | @c be easily turned into an ac-safe leak by resetting the pointer before | |
365 | @c releasing it; I don't think we have any case that calls for this sort | |
366 | @c of fixing. Fixing the realloc cases would require a new interface: | |
367 | @c instead of @code{ptr=realloc(ptr,size)} we'd have to introduce | |
368 | @c @code{acsafe_realloc(&ptr,size)} that would modify ptr before | |
369 | @c releasing the old memory. The ac-unsafe realloc could be implemented | |
370 | @c in terms of an internal interface with this semantics (say | |
371 | @c __acsafe_realloc), but since realloc can be overridden, the function | |
372 | @c we call to implement realloc should not be this internal interface, | |
373 | @c but another internal interface that calls __acsafe_realloc if realloc | |
374 | @c was not overridden, and calls the overridden realloc with async | |
375 | @c cancel disabled. --lxoliva | |
376 | ||
377 | ||
378 | @item @code{heap} | |
379 | @cindex heap | |
380 | ||
381 | Functions marked with @code{heap} may call heap memory management | |
382 | functions from the @code{malloc}/@code{free} family of functions and are | |
383 | only as safe as those functions. This note is thus equivalent to: | |
384 | ||
385 | @sampsafety{@asunsafe{@asulock{}}@acunsafe{@aculock{} @acsfd{} @acsmem{}}} | |
386 | ||
387 | ||
388 | @c Check for cases that should have used plugin instead of or in | |
389 | @c addition to this. Then, after rechecking gettext, adjust i18n if | |
390 | @c needed. | |
391 | @item @code{dlopen} | |
392 | @cindex dlopen | |
393 | ||
394 | Functions marked with @code{dlopen} use the dynamic loader to load | |
395 | shared libraries into the current execution image. This involves | |
396 | opening files, mapping them into memory, allocating additional memory, | |
397 | resolving symbols, applying relocations and more, all of this while | |
398 | holding internal dynamic loader locks. | |
399 | ||
400 | The locks are enough for these functions to be AS- and AC-Unsafe, but | |
401 | other issues may arise. At present this is a placeholder for all | |
402 | potential safety issues raised by @code{dlopen}. | |
403 | ||
404 | @c dlopen runs init and fini sections of the module; does this mean | |
405 | @c dlopen always implies plugin? | |
406 | ||
407 | ||
408 | @item @code{plugin} | |
409 | @cindex plugin | |
410 | ||
411 | Functions annotated with @code{plugin} may run code from plugins that | |
412 | may be external to @theglibc{}. Such plugin functions are assumed to be | |
413 | MT-Safe, AS-Unsafe and AC-Unsafe. Examples of such plugins are stack | |
414 | @cindex NSS | |
415 | unwinding libraries, name service switch (NSS) and character set | |
416 | @cindex iconv | |
417 | conversion (iconv) back-ends. | |
418 | ||
419 | Although the plugins mentioned as examples are all brought in by means | |
420 | of dlopen, the @code{plugin} keyword does not imply any direct | |
421 | involvement of the dynamic loader or the @code{libdl} interfaces; those | |
422 | are covered by @code{dlopen}. For example, if one function loads a | |
423 | module and finds the addresses of some of its functions, while another | |
424 | just calls those already-resolved functions, the former will be marked | |
425 | with @code{dlopen}, whereas the latter will get the @code{plugin}. When | |
426 | a single function takes all of these actions, then it gets both marks. | |
427 | ||
428 | ||
429 | @item @code{i18n} | |
430 | @cindex i18n | |
431 | ||
432 | Functions marked with @code{i18n} may call internationalization | |
433 | functions of the @code{gettext} family and will be only as safe as those | |
434 | functions. This note is thus equivalent to: | |
435 | ||
436 | @sampsafety{@mtsafe{@mtsenv{}}@asunsafe{@asucorrupt{} @ascuheap{} @ascudlopen{}}@acunsafe{@acucorrupt{}}} | |
437 | ||
438 | ||
439 | @item @code{timer} | |
440 | @cindex timer | |
441 | ||
442 | Functions marked with @code{timer} use the @code{alarm} function or | |
443 | similar to set a time-out for a system call or a long-running operation. | |
444 | In a multi-threaded program, there is a risk that the time-out signal | |
445 | will be delivered to a different thread, thus failing to interrupt the | |
446 | intended thread. Besides being MT-Unsafe, such functions are always | |
447 | AS-Unsafe, because calling them in signal handlers may interfere with | |
448 | timers set in the interrupted code, and AC-Unsafe, because there is no | |
449 | safe way to guarantee an earlier timer will be reset in case of | |
450 | asynchronous cancellation. | |
451 | ||
452 | @end itemize | |
453 | ||
454 | ||
455 | @node Conditionally Safe Features, Other Safety Remarks, Unsafe Features, POSIX | |
456 | @subsubsection Conditionally Safe Features | |
457 | @cindex Conditionally Safe Features | |
458 | ||
459 | For some features that make functions unsafe to call in certain | |
460 | contexts, there are known ways to avoid the safety problem other than | |
461 | refraining from calling the function altogether. The keywords that | |
462 | follow refer to such features, and each of their definitions indicates | |
463 | how the whole program needs to be constrained in order to remove the | |
464 | safety problem indicated by the keyword. Only when all the reasons that | |
465 | make a function unsafe are observed and addressed, by applying the | |
466 | documented constraints, does the function become safe to call in a | |
467 | context. | |
468 | ||
469 | @itemize @bullet | |
470 | ||
471 | @item @code{init} | |
472 | @cindex init | |
473 | ||
474 | Functions marked with @code{init} as an MT-Unsafe feature perform | |
475 | MT-Unsafe initialization when they are first called. | |
476 | ||
477 | Calling such a function at least once in single-threaded mode removes | |
478 | this specific cause for the function to be regarded as MT-Unsafe. If no | |
479 | other cause for that remains, the function can then be safely called | |
480 | after other threads are started. | |
481 | ||
482 | Functions marked with @code{init} as an AS- or AC-Unsafe feature use the | |
483 | internal @code{libc_once} machinery or similar to initialize internal | |
484 | data structures. | |
485 | ||
486 | If a signal handler interrupts such an initializer, and calls any | |
487 | function that also performs @code{libc_once} initialization, it will | |
488 | deadlock if the thread library has been loaded. | |
489 | ||
490 | Furthermore, if an initializer is partially complete before it is | |
491 | canceled or interrupted by a signal whose handler requires the same | |
492 | initialization, some or all of the initialization may be performed more | |
493 | than once, leaking resources or even resulting in corrupt internal data. | |
494 | ||
495 | Applications that need to call functions marked with @code{init} as an | |
496 | AS- or AC-Unsafe feature should ensure the initialization is performed | |
497 | before configuring signal handlers or enabling cancellation, so that the | |
498 | AS- and AC-Safety issues related to @code{libc_once} do not arise. | |
499 | ||
500 | @c We may have to extend the annotations to cover conditions in which | |
501 | @c initialization may or may not occur, since an initial call in a safe | |
502 | @c context is no use if the initialization doesn't take place at that | |
503 | @c time: it doesn't remove the risk for later calls. | |
504 | ||
505 | ||
506 | @item @code{race} | |
507 | @cindex race | |
508 | ||
509 | Functions annotated with @code{race} as an MT-Safety issue operate on | |
510 | objects in ways that may cause data races or similar forms of | |
511 | destructive interference out of concurrent execution. In some cases, | |
512 | the objects are passed to the functions by users; in others, they are | |
513 | used by the functions to return values to users; in others, they are not | |
514 | even exposed to users. | |
515 | ||
516 | We consider access to objects passed as (indirect) arguments to | |
517 | functions to be data race free. The assurance of data race free objects | |
518 | is the caller's responsibility. We will not mark a function as | |
519 | MT-Unsafe or AS-Unsafe if it misbehaves when users fail to take the | |
520 | measures required by POSIX to avoid data races when dealing with such | |
521 | objects. As a general rule, if a function is documented as reading from | |
522 | an object passed (by reference) to it, or modifying it, users ought to | |
523 | use memory synchronization primitives to avoid data races just as they | |
524 | would should they perform the accesses themselves rather than by calling | |
525 | the library function. @code{FILE} streams are the exception to the | |
526 | general rule, in that POSIX mandates the library to guard against data | |
527 | races in many functions that manipulate objects of this specific opaque | |
528 | type. We regard this as a convenience provided to users, rather than as | |
529 | a general requirement whose expectations should extend to other types. | |
530 | ||
531 | In order to remind users that guarding certain arguments is their | |
532 | responsibility, we will annotate functions that take objects of certain | |
533 | types as arguments. We draw the line for objects passed by users as | |
534 | follows: objects whose types are exposed to users, and that users are | |
535 | expected to access directly, such as memory buffers, strings, and | |
536 | various user-visible @code{struct} types, do @emph{not} give reason for | |
537 | functions to be annotated with @code{race}. It would be noisy and | |
538 | redundant with the general requirement, and not many would be surprised | |
539 | by the library's lack of internal guards when accessing objects that can | |
540 | be accessed directly by users. | |
541 | ||
542 | As for objects that are opaque or opaque-like, in that they are to be | |
543 | manipulated only by passing them to library functions (e.g., | |
544 | @code{FILE}, @code{DIR}, @code{obstack}, @code{iconv_t}), there might be | |
545 | additional expectations as to internal coordination of access by the | |
546 | library. We will annotate, with @code{race} followed by a colon and the | |
547 | argument name, functions that take such objects but that do not take | |
548 | care of synchronizing access to them by default. For example, | |
549 | @code{FILE} stream @code{unlocked} functions will be annotated, but | |
550 | those that perform implicit locking on @code{FILE} streams by default | |
551 | will not, even though the implicit locking may be disabled on a | |
552 | per-stream basis. | |
553 | ||
554 | In either case, we will not regard as MT-Unsafe functions that may | |
555 | access user-supplied objects in unsafe ways should users fail to ensure | |
556 | the accesses are well defined. The notion prevails that users are | |
557 | expected to safeguard against data races any user-supplied objects that | |
558 | the library accesses on their behalf. | |
559 | ||
560 | @c The above describes @mtsrace; @mtasurace is described below. | |
561 | ||
562 | This user responsibility does not apply, however, to objects controlled | |
563 | by the library itself, such as internal objects and static buffers used | |
564 | to return values from certain calls. When the library doesn't guard | |
565 | them against concurrent uses, these cases are regarded as MT-Unsafe and | |
566 | AS-Unsafe (although the @code{race} mark under AS-Unsafe will be omitted | |
567 | as redundant with the one under MT-Unsafe). As in the case of | |
568 | user-exposed objects, the mark may be followed by a colon and an | |
569 | identifier. The identifier groups all functions that operate on a | |
570 | certain unguarded object; users may avoid the MT-Safety issues related | |
571 | to unguarded concurrent access to such internal objects by creating a | |
572 | non-recursive mutex associated with the identifier, and always holding the | |
573 | mutex when calling any function marked as racy on that identifier, as | |
574 | they would have to should the identifier be an object under user | |
575 | control. The non-recursive mutex avoids the MT-Safety issue, but it | |
576 | trades one AS-Safety issue for another, so use in asynchronous signals | |
577 | remains undefined. | |
578 | ||
579 | When the identifier relates to a static buffer used to hold return | |
580 | values, the mutex must be held for as long as the buffer remains in use | |
581 | by the caller. Many functions that return pointers to static buffers | |
582 | offer reentrant variants that store return values in caller-supplied | |
583 | buffers instead. In some cases, such as @code{tmpnam}, the variant is | |
584 | chosen not by calling an alternate entry point, but by passing a | |
585 | non-@code{NULL} pointer to the buffer in which the returned values are | |
586 | to be stored. These variants are generally preferable in multi-threaded | |
587 | programs, although some of them are not MT-Safe because of other | |
588 | internal buffers, also documented with @code{race} notes. | |
589 | ||
590 | ||
591 | @item @code{const} | |
592 | @cindex const | |
593 | ||
594 | Functions marked with @code{const} as an MT-Safety issue non-atomically | |
595 | modify internal objects that are better regarded as constant, because a | |
596 | substantial portion of @theglibc{} accesses them without | |
597 | synchronization. Unlike @code{race}, which causes both readers and | |
598 | writers of internal objects to be regarded as MT-Unsafe and AS-Unsafe, | |
599 | this mark is applied to writers only. Writers remain equally MT- and | |
600 | AS-Unsafe to call, but the then-mandatory constness of objects they | |
601 | modify enables readers to be regarded as MT-Safe and AS-Safe (as long as | |
602 | no other reasons for them to be unsafe remain), since the lack of | |
603 | synchronization is not a problem when the objects are effectively | |
604 | constant. | |
605 | ||
606 | The identifier that follows the @code{const} mark will appear by itself | |
607 | as a safety note in readers. Programs that wish to work around this | |
608 | safety issue, so as to call writers, may use a non-recursive | |
609 | @code{rwlock} associated with the identifier, and guard @emph{all} calls | |
610 | to functions marked with @code{const} followed by the identifier with a | |
611 | write lock, and @emph{all} calls to functions marked with the identifier | |
612 | by itself with a read lock. The non-recursive locking removes the | |
613 | MT-Safety problem, but it trades one AS-Safety problem for another, so | |
614 | use in asynchronous signals remains undefined. | |
615 | ||
616 | @c But what if, instead of marking modifiers with const:id and readers | |
617 | @c with just id, we marked writers with race:id and readers with ro:id? | |
618 | @c Instead of having to define each instance of “id”, we'd have a | |
619 | @c general pattern governing all such “id”s, wherein race:id would | |
620 | @c suggest the need for an exclusive/write lock to make the function | |
621 | @c safe, whereas ro:id would indicate “id” is expected to be read-only, | |
622 | @c but if any modifiers are called (while holding an exclusive lock), | |
623 | @c then ro:id-marked functions ought to be guarded with a read lock for | |
624 | @c safe operation. ro:env or ro:locale, for example, seems to convey | |
625 | @c more clearly the expectations and the meaning, than just env or | |
626 | @c locale. | |
627 | ||
628 | ||
629 | @item @code{sig} | |
630 | @cindex sig | |
631 | ||
632 | Functions marked with @code{sig} as an MT-Safety issue (which implies an | |
633 | identical AS-Safety issue, omitted for brevity) may temporarily install | |
634 | a signal handler for internal purposes, which may interfere with other | |
635 | uses of the signal, identified after a colon. | |
636 | ||
637 | This safety problem can be worked around by ensuring that no other uses | |
638 | of the signal will take place for the duration of the call. Holding a | |
639 | non-recursive mutex while calling all functions that use the same | |
640 | temporary signal, blocking that signal before the call, and resetting its | |
641 | handler afterwards are recommended. | |
642 | ||
643 | There is no safe way to guarantee the original signal handler is | |
644 | restored in case of asynchronous cancellation, therefore so-marked | |
645 | functions are also AC-Unsafe. | |
646 | ||
647 | @c fixme: at least deferred cancellation should get it right, and would | |
648 | @c obviate the restoring bit below, and the qualifier above. | |
649 | ||
650 | Besides the measures recommended to work around the MT- and AS-Safety | |
651 | problem, in order to avert the cancellation problem, disabling | |
652 | asynchronous cancellation @emph{and} installing a cleanup handler to | |
653 | restore the signal to the desired state and to release the mutex are | |
654 | recommended. | |
655 | ||
656 | ||
657 | @item @code{term} | |
658 | @cindex term | |
659 | ||
660 | Functions marked with @code{term} as an MT-Safety issue may change the | |
661 | terminal settings in the recommended way, namely: call @code{tcgetattr}, | |
662 | modify some flags, and then call @code{tcsetattr}; this creates a window | |
663 | in which changes made by other threads are lost. Thus, functions marked | |
664 | with @code{term} are MT-Unsafe. The same window enables changes made by | |
665 | asynchronous signals to be lost. These functions are also AS-Unsafe, | |
666 | but the corresponding mark is omitted as redundant. | |
667 | ||
668 | It is thus advisable for applications using the terminal to avoid | |
669 | concurrent and reentrant interactions with it, by not using it in signal | |
670 | handlers or blocking signals that might use it, and holding a lock while | |
671 | calling these functions and interacting with the terminal. This lock | |
672 | should also be used for mutual exclusion with functions marked with | |
673 | @code{@mtasurace{:tcattr(fd)}}, where @var{fd} is a file descriptor for | |
674 | the controlling terminal. The caller may use a single mutex for | |
675 | simplicity, or use one mutex per terminal, even if referenced by | |
676 | different file descriptors. | |
677 | ||
678 | Functions marked with @code{term} as an AC-Safety issue are supposed to | |
679 | restore terminal settings to their original state, after temporarily | |
680 | changing them, but they may fail to do so if cancelled. | |
681 | ||
682 | @c fixme: at least deferred cancellation should get it right, and would | |
683 | @c obviate the restoring bit below, and the qualifier above. | |
684 | ||
685 | Besides the measures recommended to work around the MT- and AS-Safety | |
686 | problem, in order to avert the cancellation problem, disabling | |
687 | asynchronous cancellation @emph{and} installing a cleanup handler to | |
688 | restore the terminal settings to the original state and to release the | |
689 | mutex are recommended. | |
690 | ||
691 | ||
692 | @end itemize | |
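The mutex-based workaround described under @code{race} for static return buffers can be sketched as follows. This is only an illustration under stated assumptions, not code from @theglibc{}: the names @code{tmbuf_lock} and @code{locked_localtime} are hypothetical, and @code{localtime} is used on the assumption that it returns a pointer to a shared static buffer. The wrapper copies the result out before unlocking, because the mutex has to be held for as long as the buffer remains in use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical mutex standing for the identifier that groups all
   functions sharing the static buffer.  */
static pthread_mutex_t tmbuf_lock = PTHREAD_MUTEX_INITIALIZER;

/* Convert *T to local time and copy the result into *OUT while the
   mutex is held.  Returns 1 on success, 0 on failure.  */
int
locked_localtime (const time_t *t, struct tm *out)
{
  int ok = 0;

  assert (t != NULL && out != NULL);
  pthread_mutex_lock (&tmbuf_lock);
  struct tm *p = localtime (t);   /* Points into the static buffer.  */
  if (p != NULL)
    {
      *out = *p;                  /* Copy before releasing the lock.  */
      ok = 1;
    }
  pthread_mutex_unlock (&tmbuf_lock);
  return ok;
}
```

Every other call that may overwrite the same buffer would have to take the same mutex for this to help. Where a reentrant variant such as @code{localtime_r} exists, using it is usually simpler than locking.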
693 | ||
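The read/write locking strategy described under @code{const} might look like the sketch below for the environment, where writers such as @code{setenv} are the ones marked with @code{const:env} and readers such as @code{getenv} carry the plain @code{env} note. The lock @code{env_lock} and the wrappers @code{guarded_setenv} and @code{guarded_getenv} are hypothetical names introduced for illustration; the scheme only works if @emph{all} environment accesses in the program go through such wrappers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical rwlock associated with the "env" identifier.  */
static pthread_rwlock_t env_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Writer: modifies the environment under the write lock.  */
int
guarded_setenv (const char *name, const char *value)
{
  assert (name != NULL && value != NULL);
  pthread_rwlock_wrlock (&env_lock);
  int rc = setenv (name, value, 1);
  pthread_rwlock_unlock (&env_lock);
  return rc;
}

/* Reader: copies the value of NAME into BUF under the read lock.
   Returns 1 if the variable exists and fits in SIZE bytes, else 0.  */
int
guarded_getenv (const char *name, char *buf, size_t size)
{
  int found = 0;

  assert (name != NULL && buf != NULL);
  pthread_rwlock_rdlock (&env_lock);
  const char *v = getenv (name);
  if (v != NULL && strlen (v) < size)
    {
      strcpy (buf, v);            /* Copy while the lock is held.  */
      found = 1;
    }
  pthread_rwlock_unlock (&env_lock);
  return found;
}
```

The reader copies the string because the pointer returned by @code{getenv} may become stale once a writer runs. As noted above, this addresses only the MT-Safety problem; calling either wrapper from a signal handler remains undefined.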
694 | ||
695 | @node Other Safety Remarks, , Conditionally Safe Features, POSIX | |
696 | @subsubsection Other Safety Remarks | |
697 | @cindex Other Safety Remarks | |
698 | ||
699 | Additional keywords may be attached to functions, indicating features | |
700 | that do not make a function unsafe to call, but that may need to be | |
701 | taken into account in certain classes of programs: | |
702 | ||
703 | @itemize @bullet | |
704 | ||
705 | @item @code{locale} | |
706 | @cindex locale | |
707 | ||
708 | Functions annotated with @code{locale} as an MT-Safety issue read from | |
709 | the locale object without any form of synchronization. Functions | |
710 | annotated with @code{locale} called concurrently with locale changes may | |
711 | behave in ways that do not correspond to any of the locales active | |
712 | during their execution, but an unpredictable mix thereof. | |
713 | ||
714 | We do not mark these functions as MT- or AS-Unsafe, however, because | |
715 | functions that modify the locale object are marked with | |
716 | @code{const:locale} and regarded as unsafe. Being unsafe, the latter | |
717 | are not to be called when multiple threads are running or asynchronous | |
718 | signals are enabled, and so the locale can be considered effectively | |
719 | constant in these contexts, which makes the former safe. | |
720 | ||
721 | @c Should the locking strategy suggested under @code{const} be used, | |
722 | @c failure to guard locale uses is not as fatal as data races in | |
723 | @c general: unguarded uses will @emph{not} follow dangling pointers or | |
724 | @c access uninitialized, unmapped or recycled memory. Each access will | |
725 | @c read from a consistent locale object that is or was active at some | |
726 | @c point during its execution. Without synchronization, however, it | |
727 | @c cannot even be assumed that, after a change in locale, earlier | |
728 | @c locales will no longer be used, even after the newly-chosen one is | |
729 | @c used in the thread. Nevertheless, even though unguarded reads from | |
730 | @c the locale will not violate type safety, functions that access the | |
731 | @c locale multiple times may invoke all sorts of undefined behavior | |
732 | @c because of the unexpected locale changes. | |
733 | ||
734 | ||
735 | @item @code{env} | |
736 | @cindex env | |
737 | ||
738 | Functions marked with @code{env} as an MT-Safety issue access the | |
739 | environment with @code{getenv} or similar, without any guards to ensure | |
740 | safety in the presence of concurrent modifications. | |
741 | ||
742 | We do not mark these functions as MT- or AS-Unsafe, however, because | |
743 | functions that modify the environment are all marked with | |
744 | @code{const:env} and regarded as unsafe. Being unsafe, the latter are | |
745 | not to be called when multiple threads are running or asynchronous | |
746 | signals are enabled, and so the environment can be considered | |
747 | effectively constant in these contexts, which makes the former safe. | |
748 | ||
749 | ||
750 | @item @code{hostid} | |
751 | @cindex hostid | |
752 | ||
753 | The function marked with @code{hostid} as an MT-Safety issue reads from | |
754 | the system-wide data structures that hold the ``host ID'' of the | |
755 | machine. These data structures cannot generally be modified atomically. | |
756 | Since it is expected that the ``host ID'' will not normally change, the | |
757 | function that reads from it (@code{gethostid}) is regarded as safe, | |
758 | whereas the function that modifies it (@code{sethostid}) is marked with | |
759 | @code{@mtasuconst{:@mtshostid{}}}, indicating it may require special | |
760 | care if it is to be called. In this specific case, the special care | |
761 | amounts to system-wide (not merely intra-process) coordination. | |
762 | ||
763 | ||
764 | @item @code{sigintr} | |
765 | @cindex sigintr | |
766 | ||
767 | Functions marked with @code{sigintr} as an MT-Safety issue access the | |
768 | @code{_sigintr} internal data structure without any guards to ensure | |
769 | safety in the presence of concurrent modifications. | |
770 | ||
771 | We do not mark these functions as MT- or AS-Unsafe, however, because | |
772 | functions that modify this data structure are all marked with | |
773 | @code{const:sigintr} and regarded as unsafe. Being unsafe, the latter | |
774 | are not to be called when multiple threads are running or asynchronous | |
775 | signals are enabled, and so the data structure can be considered | |
776 | effectively constant in these contexts, which makes the former safe. | |
777 | ||
778 | ||
779 | @item @code{fd} | |
780 | @cindex fd | |
781 | ||
782 | Functions annotated with @code{fd} as an AC-Safety issue may leak file | |
783 | descriptors if asynchronous thread cancellation interrupts their | |
784 | execution. | |
785 | ||
786 | Functions that allocate or deallocate file descriptors will generally be | |
787 | marked as such. Even if they attempted to protect the file descriptor | |
788 | allocation and deallocation with cleanup regions, allocating a new | |
789 | descriptor and storing its number where the cleanup region could release | |
790 | it cannot be performed as a single atomic operation. Similarly, | |
791 | releasing the descriptor and taking it out of the data structure | |
792 | normally responsible for releasing it cannot be performed atomically. | |
793 | There will always be a window in which the descriptor cannot be released | |
794 | because it was not stored in the cleanup handler argument yet, or it was | |
795 | already taken out before releasing it. It cannot be taken out after | |
796 | release: an open descriptor could mean either that the descriptor still | |
797 | has to be closed, or that it was already closed but the descriptor was | |
798 | reallocated by another thread or signal handler. | |
799 | ||
800 | Such leaks could be internally avoided, with some performance penalty, | |
801 | by temporarily disabling asynchronous thread cancellation. However, | |
802 | since callers of allocation or deallocation functions would have to do | |
803 | this themselves, to avoid the same sort of leak in their own layer, it | |
804 | makes more sense for the library to assume they are taking care of it | |
805 | than to impose a performance penalty that is redundant when the problem | |
806 | is solved in upper layers, and insufficient when it is not. | |
807 | ||
808 | This remark by itself does not cause a function to be regarded as | |
809 | AC-Unsafe. However, cumulative effects of such leaks may pose a | |
810 | problem for some programs. If this is the case, suspending asynchronous | |
811 | cancellation for the duration of calls to such functions is recommended. | |
812 | ||
813 | ||
814 | @item @code{mem} | |
815 | @cindex mem | |
816 | ||
817 | Functions annotated with @code{mem} as an AC-Safety issue may leak | |
818 | memory if asynchronous thread cancellation interrupts their execution. | |
819 | ||
820 | The problem is similar to that of file descriptors: there is no atomic | |
821 | interface to allocate memory and store its address in the argument to a | |
822 | cleanup handler, or to release it and remove its address from that | |
823 | argument, without at least temporarily disabling asynchronous | |
824 | cancellation, which these functions do not do. | |
825 | ||
826 | This remark does not by itself cause a function to be regarded as | |
827 | generally AC-Unsafe. However, cumulative effects of such leaks may be | |
828 | severe enough for some programs that disabling asynchronous cancellation | |
829 | for the duration of calls to such functions may be required. | |
830 | ||
831 | ||
832 | @item @code{cwd} | |
833 | @cindex cwd | |
834 | ||
835 | Functions marked with @code{cwd} as an MT-Safety issue may temporarily | |
836 | change the current working directory during their execution, which may | |
837 | cause relative pathnames to be resolved in unexpected ways in other | |
838 | threads or within asynchronous signal or cancellation handlers. | |
839 | ||
840 | This is not enough of a reason to mark so-marked functions as MT- or | |
841 | AS-Unsafe, but when this behavior is optional (e.g., @code{nftw} with | |
842 | @code{FTW_CHDIR}), avoiding the option may be a good alternative to | |
843 | using full pathnames or file descriptor-relative (e.g. @code{openat}) | |
844 | system calls. | |
845 | ||
846 | ||
847 | @item @code{!posix} | |
848 | @cindex !posix | |
849 | ||
850 | This remark, as an MT-, AS- or AC-Safety note to a function, indicates | |
851 | the safety status of the function is known to differ from the specified | |
852 | status in the POSIX standard. For example, POSIX does not require a | |
853 | function to be Safe, but our implementation is, or vice versa. | |
854 | ||
855 | For the time being, the absence of this remark does not imply the safety | |
856 | properties we documented are identical to those mandated by POSIX for | |
857 | the corresponding functions. | |
858 | ||
859 | ||
860 | @item @code{:identifier} | |
861 | @cindex :identifier | |
862 | ||
863 | Annotations may sometimes be followed by identifiers, intended to group | |
864 | several functions that e.g. access the data structures in an unsafe way, | |
865 | as in @code{race} and @code{const}, or to provide more specific | |
866 | information, such as naming a signal in a function marked with | |
867 | @code{sig}. It is envisioned that it may be applied to @code{lock} and | |
868 | @code{corrupt} as well in the future. | |
869 | ||
870 | In most cases, the identifier will name a set of functions, but it may | |
871 | name global objects or function arguments, or identifiable properties or | |
872 | logical components associated with them, with a notation such as | |
873 | e.g. @code{:buf(arg)} to denote a buffer associated with the argument | |
874 | @var{arg}, or @code{:tcattr(fd)} to denote the terminal attributes of a | |
875 | file descriptor @var{fd}. | |
876 | ||
877 | The most common use for identifiers is to provide logical groups of | |
878 | functions and arguments that need to be protected by the same | |
879 | synchronization primitive in order to ensure safe operation in a given | |
880 | context. | |
881 | ||
882 | ||
883 | @item @code{/condition} | |
884 | @cindex /condition | |
885 | ||
886 | Some safety annotations may be conditional, in that they only apply if a | |
887 | boolean expression involving arguments, global variables or even the | |
888 | underlying kernel evaluates to true. Such conditions as | |
889 | @code{/hurd} or @code{/!linux!bsd} indicate the preceding marker only | |
890 | applies when the underlying kernel is the HURD, or when it is neither | |
891 | Linux nor a BSD kernel, respectively. @code{/!ps} and | |
892 | @code{/one_per_line} indicate the preceding marker only applies when | |
893 | argument @var{ps} is NULL, or global variable @var{one_per_line} is | |
894 | nonzero, respectively. | |
895 | ||
896 | When all marks that render a function unsafe are adorned with such | |
897 | conditions, and none of the named conditions hold, then the function can | |
898 | be regarded as safe. | |
899 | ||
900 | ||
901 | @end itemize | |
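For the @code{fd} and @code{mem} remarks, the recommendation to suspend asynchronous cancellation around the call can be sketched as below. The helper name @code{open_into} and its @var{slot} parameter are hypothetical; the point of the sketch is that the descriptor is stored where a cleanup handler could find it @emph{before} the previous cancellation type is restored, which closes the window described under @code{fd} for this one call.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>

/* Open PATH read-only and store the descriptor in *SLOT, with
   asynchronous cancellation suspended for the duration.
   Returns 0 on success, -1 if the open failed.  */
int
open_into (const char *path, int *slot)
{
  int old_type, ignored;

  assert (path != NULL && slot != NULL);
  /* Switch to deferred cancellation, remembering the caller's type.  */
  pthread_setcanceltype (PTHREAD_CANCEL_DEFERRED, &old_type);
  /* Allocate the descriptor and record it in one uninterrupted step
     as far as asynchronous cancellation is concerned.  */
  *slot = open (path, O_RDONLY);
  /* Restore whatever cancellation type the caller had selected.  */
  pthread_setcanceltype (old_type, &ignored);
  return *slot >= 0 ? 0 : -1;
}
```

A caller using asynchronous cancellation would typically point @var{slot} at storage that its cleanup handler inspects. The same shape applies to memory allocation, with the returned pointer stored in place of the descriptor.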
902 | ||
903 | ||
904 | @node Berkeley Unix, SVID, POSIX, Standards and Portability | |
905 | @subsection Berkeley Unix | |
906 | @cindex BSD Unix | |
907 | @cindex 4.@var{n} BSD Unix | |
908 | @cindex Berkeley Unix | |
909 | @cindex SunOS | |
910 | @cindex Unix, Berkeley | |
911 | ||
912 | @Theglibc{} defines facilities from some versions of Unix which | |
913 | are not formally standardized, specifically from the 4.2 BSD, 4.3 BSD, | |
914 | and 4.4 BSD Unix systems (also known as @dfn{Berkeley Unix}) and from | |
915 | @dfn{SunOS} (a popular 4.2 BSD derivative that includes some Unix System | |
916 | V functionality). These systems support most of the @w{ISO C} and POSIX | |
917 | facilities, and 4.4 BSD and newer releases of SunOS in fact support them all. | |
918 | ||
919 | The BSD facilities include symbolic links (@pxref{Symbolic Links}), the | |
920 | @code{select} function (@pxref{Waiting for I/O}), the BSD signal | |
921 | functions (@pxref{BSD Signal Handling}), and sockets (@pxref{Sockets}). | |
922 | ||
923 | @node SVID, XPG, Berkeley Unix, Standards and Portability | |
924 | @subsection SVID (The System V Interface Description) | |
925 | @cindex SVID | |
926 | @cindex System V Unix | |
927 | @cindex Unix, System V | |
928 | ||
929 | The @dfn{System V Interface Description} (SVID) is a document describing | |
930 | the AT&T Unix System V operating system. It is to some extent a | |
931 | superset of the POSIX standard (@pxref{POSIX}). | |
932 | ||
933 | @Theglibc{} defines most of the facilities required by the SVID | |
934 | that are not also required by the @w{ISO C} or POSIX standards, for | |
935 | compatibility with System V Unix and other Unix systems (such as | |
936 | SunOS) which include these facilities. However, many of the more | |
937 | obscure and less generally useful facilities required by the SVID are | |
938 | not included. (In fact, Unix System V itself does not provide them all.) | |
939 | ||
940 | The supported facilities from System V include the methods for | |
941 | inter-process communication and shared memory, the @code{hsearch} and | |
942 | @code{drand48} families of functions, @code{fmtmsg} and several of the | |
943 | mathematical functions. | |
944 | ||
945 | @node XPG, Linux Kernel, SVID, Standards and Portability | |
946 | @subsection XPG (The X/Open Portability Guide) | |
947 | ||
948 | The X/Open Portability Guide, published by the X/Open Company, Ltd., is | |
949 | a more general standard than POSIX. X/Open owns the Unix copyright and | |
950 | the XPG specifies the requirements for systems which are intended to be | |
951 | a Unix system. | |
952 | ||
953 | @Theglibc{} complies with the X/Open Portability Guide, Issue 4.2, | |
954 | with all extensions common to XSI (X/Open System Interface) | |
955 | compliant systems and also all X/Open UNIX extensions. | |
956 | ||
957 | The additions on top of POSIX are mainly derived from functionality | |
958 | available in @w{System V} and BSD systems. Some of the really bad | |
959 | mistakes in @w{System V} systems were corrected, though. Since | |
960 | fulfilling the XPG standard with the Unix extensions is a | |
961 | precondition for getting the Unix brand, chances are good that the | |
962 | functionality is available on commercial systems. | |
963 | ||
964 | @node Linux Kernel, , XPG, Standards and Portability | |
965 | @subsection Linux (The Linux Kernel) | |
966 | ||
967 | @Theglibc{} includes by reference the Linux man-pages | |
968 | @value{man_pages_version} documentation to document the listed | |
969 | syscalls for the Linux kernel. For reference purposes only, the latest | |
970 | @uref{https://www.kernel.org/doc/man-pages/,Linux man-pages Project} | |
971 | documentation can be accessed from the | |
972 | @uref{https://www.kernel.org,Linux kernel} website. Where the syscall | |
973 | has more specific documentation in this manual, that more specific | |
974 | documentation is considered authoritative. | |
975 | ||
976 | Additional details on the Linux system call interface can be found in | |
977 | @ref{System Calls}. | |
978 | ||
979 | @node Using the Library, Roadmap to the Manual, Standards and Portability, Introduction | |
980 | @section Using the Library | |
981 | ||
982 | This section describes some of the practical issues involved in using | |
983 | @theglibc{}. | |
984 | ||
985 | @menu | |
986 | * Header Files:: How to include the header files in your | |
987 | programs. | |
988 | * Macro Definitions:: Some functions in the library may really | |
989 | be implemented as macros. | |
990 | * Reserved Names:: The C standard reserves some names for | |
991 | the library, and some for users. | |
992 | * Feature Test Macros:: How to control what names are defined. | |
993 | @end menu | |
994 | ||
995 | @node Header Files, Macro Definitions, , Using the Library | |
996 | @subsection Header Files | |
997 | @cindex header files | |
998 | ||
999 | Libraries for use by C programs really consist of two parts: @dfn{header | |
1000 | files} that define types and macros and declare variables and | |
1001 | functions; and the actual library or @dfn{archive} that contains the | |
1002 | definitions of the variables and functions. | |
1003 | ||
1004 | (Recall that in C, a @dfn{declaration} merely provides information that | |
1005 | a function or variable exists and gives its type. For a function | |
1006 | declaration, information about the types of its arguments might be | |
1007 | provided as well. The purpose of declarations is to allow the compiler | |
1008 | to correctly process references to the declared variables and functions. | |
1009 | A @dfn{definition}, on the other hand, actually allocates storage for a | |
1010 | variable or says what a function does.) | |
1011 | @cindex definition (compared to declaration) | |
1012 | @cindex declaration (compared to definition) | |
1013 | ||
1014 | In order to use the facilities in @theglibc{}, you should be sure | |
1015 | that your program source files include the appropriate header files. | |
1016 | This is so that the compiler has declarations of these facilities | |
1017 | available and can correctly process references to them. Once your | |
1018 | program has been compiled, the linker resolves these references to | |
1019 | the actual definitions provided in the archive file. | |
1020 | ||
1021 | Header files are included into a program source file by the | |
1022 | @samp{#include} preprocessor directive. The C language supports two | |
1023 | forms of this directive; the first, | |
1024 | ||
1025 | @smallexample | |
1026 | #include "@var{header}" | |
1027 | @end smallexample | |
1028 | ||
1029 | @noindent | |
1030 | is typically used to include a header file @var{header} that you write | |
1031 | yourself; this would contain definitions and declarations describing the | |
1032 | interfaces between the different parts of your particular application. | |
1033 | By contrast, | |
1034 | ||
1035 | @smallexample | |
1036 | #include <file.h> | |
1037 | @end smallexample | |
1038 | ||
1039 | @noindent | |
1040 | is typically used to include a header file @file{file.h} that contains | |
1041 | definitions and declarations for a standard library. This file would | |
1042 | normally be installed in a standard place by your system administrator. | |
1043 | You should use this second form for the C library header files. | |
1044 | ||
1045 | Typically, @samp{#include} directives are placed at the top of the C | |
1046 | source file, before any other code. If you begin your source files with | |
1047 | some comments explaining what the code in the file does (a good idea), | |
1048 | put the @samp{#include} directives immediately afterwards, following the | |
1049 | feature test macro definition (@pxref{Feature Test Macros}). | |
1050 | ||
1051 | For more information about the use of header files and @samp{#include} | |
1052 | directives, @pxref{Header Files,,, cpp.info, The GNU C Preprocessor | |
1053 | Manual}. | |
1054 | ||
1055 | @Theglibc{} provides several header files, each of which contains | |
1056 | the type and macro definitions and variable and function declarations | |
1057 | for a group of related facilities. This means that your programs may | |
1058 | need to include several header files, depending on exactly which | |
1059 | facilities you are using. | |
1060 | ||
1061 | Some library header files include other library header files | |
1062 | automatically. However, as a matter of programming style, you should | |
1063 | not rely on this; it is better to explicitly include all the header | |
1064 | files required for the library facilities you are using. The @glibcadj{} | |
1065 | header files have been written in such a way that it doesn't | |
1066 | matter if a header file is accidentally included more than once; | |
1067 | including a header file a second time has no effect. Likewise, if your | |
1068 | program needs to include multiple header files, the order in which they | |
1069 | are included doesn't matter. | |
1070 | ||
1071 | @strong{Compatibility Note:} Inclusion of standard header files in any | |
1072 | order and any number of times works in any @w{ISO C} implementation. | |
1073 | However, this has traditionally not been the case in many older C | |
1074 | implementations. | |
1075 | ||
1076 | Strictly speaking, you don't @emph{have to} include a header file to use | |
1077 | a function it declares; you could declare the function explicitly | |
1078 | yourself, according to the specifications in this manual. But it is | |
1079 | usually better to include the header file because it may define types | |
1080 | and macros that are not otherwise available and because it may define | |
1081 | more efficient macro replacements for some functions. It is also a sure | |
1082 | way to have the correct declaration. | |
1083 | ||
1084 | @node Macro Definitions, Reserved Names, Header Files, Using the Library | |
1085 | @subsection Macro Definitions of Functions | |
1086 | @cindex shadowing functions with macros | |
1087 | @cindex removing macros that shadow functions | |
1088 | @cindex undefining macros that shadow functions | |
1089 | ||
1090 | If we describe something as a function in this manual, it may have a | |
1091 | macro definition as well. This normally has no effect on how your | |
1092 | program runs---the macro definition does the same thing as the function | |
1093 | would. In particular, macro equivalents for library functions evaluate | |
1094 | arguments exactly once, in the same way that a function call would. The | |
1095 | main reason for these macro definitions is that sometimes they can | |
1096 | produce an inline expansion that is considerably faster than an actual | |
1097 | function call. | |
1098 | ||
1099 | Taking the address of a library function works even if it is also | |
1100 | defined as a macro. This is because, in this context, the name of the | |
1101 | function isn't followed by the left parenthesis that is syntactically | |
1102 | necessary to recognize a macro call. | |
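As a small illustration of the point about taking addresses, the sketch below passes @code{abs} as a function pointer. The helper names @code{apply} and @code{abs_through_pointer} are hypothetical. Whether or not @file{stdlib.h} also defines @code{abs} as a function-like macro, the bare name in the argument list is not followed by a left parenthesis, so it designates the function itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Call FN on X through a function pointer.  */
int
apply (int (*fn) (int), int x)
{
  assert (fn != NULL);
  return fn (x);
}

int
abs_through_pointer (int x)
{
  /* Here abs is not followed by '(', so no function-like macro
     can expand; the address of the real function is passed.  */
  return apply (abs, x);
}
```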
1103 | ||
1104 | You might occasionally want to avoid using the macro definition of a | |
1105 | function---perhaps to make your program easier to debug. There are | |
1106 | two ways you can do this: | |
1107 | ||
1108 | @itemize @bullet | |
1109 | @item | |
1110 | You can avoid a macro definition in a specific use by enclosing the name | |
1111 | of the function in parentheses. This works because the name of the | |
1112 | function doesn't appear in a syntactic context where it is recognizable | |
1113 | as a macro call. | |
1114 | ||
1115 | @item | |
1116 | You can suppress any macro definition for a whole source file by using | |
1117 | the @samp{#undef} preprocessor directive, unless otherwise stated | |
1118 | explicitly in the description of that facility. | |
1119 | @end itemize | |
1120 | ||
1121 | For example, suppose the header file @file{stdlib.h} declares a function | |
1122 | named @code{abs} with | |
1123 | ||
1124 | @smallexample | |
1125 | extern int abs (int); | |
1126 | @end smallexample | |
1127 | ||
1128 | @noindent | |
1129 | and also provides a macro definition for @code{abs}. Then, in: | |
1130 | ||
1131 | @smallexample | |
1132 | #include <stdlib.h> | |
1133 | int f (int *i) @{ return abs (++*i); @} | |
1134 | @end smallexample | |
1135 | ||
1136 | @noindent | |
1137 | the reference to @code{abs} might refer to either a macro or a function. | |
1138 | On the other hand, in each of the following examples the reference is | |
1139 | to a function and not a macro. | |
1140 | ||
1141 | @smallexample | |
1142 | #include <stdlib.h> | |
1143 | int g (int *i) @{ return (abs) (++*i); @} | |
1144 | ||
1145 | #undef abs | |
1146 | int h (int *i) @{ return abs (++*i); @} | |
1147 | @end smallexample | |
1148 | ||
1149 | Since macro definitions that double for a function behave in | |
1150 | exactly the same way as the actual function version, there is usually no | |
1151 | need for any of these methods. In fact, removing macro definitions usually | |
1152 | just makes your program slower. | |
1153 | ||
1154 | ||
1155 | @node Reserved Names, Feature Test Macros, Macro Definitions, Using the Library | |
1156 | @subsection Reserved Names | |
1157 | @cindex reserved names | |
1158 | @cindex name space | |
1159 | ||
1160 | The names of all library types, macros, variables and functions that | |
1161 | come from the @w{ISO C} standard are reserved unconditionally; your program | |
1162 | @strong{may not} redefine these names. All other library names are | |
1163 | reserved if your program explicitly includes the header file that | |
1164 | defines or declares them. There are several reasons for these | |
1165 | restrictions: | |
1166 | ||
1167 | @itemize @bullet | |
1168 | @item | |
1169 | Other people reading your code could get very confused if you were using | |
1170 | a function named @code{exit} to do something completely different from | |
1171 | what the standard @code{exit} function does, for example. Preventing | |
1172 | this situation helps to make your programs easier to understand and | |
1173 | contributes to modularity and maintainability. | |
1174 | ||
1175 | @item | |
1176 | It avoids the possibility of a user accidentally redefining a library | |
1177 | function that is called by other library functions. If redefinition | |
1178 | were allowed, those other functions would not work properly. | |
1179 | ||
1180 | @item | |
1181 | It allows the compiler to do whatever special optimizations it pleases | |
1182 | on calls to these functions, without the possibility that they may have | |
1183 | been redefined by the user. Some library facilities, such as those for | |
1184 | dealing with variadic arguments (@pxref{Variadic Functions}) | |
1185 | and non-local exits (@pxref{Non-Local Exits}), actually require a | |
1186 | considerable amount of cooperation on the part of the C compiler, and | |
1187 | with respect to the implementation, it might be easier for the compiler | |
1188 | to treat these as built-in parts of the language. | |
1189 | @end itemize | |
1190 | ||
1191 | In addition to the names documented in this manual, reserved names | |
1192 | include all external identifiers (global functions and variables) that | |
1193 | begin with an underscore (@samp{_}) and all identifiers regardless of | |
1194 | use that begin with either two underscores or an underscore followed by | |
1195 | a capital letter. This is so that the library and | |
1196 | header files can define functions, variables, and macros for internal | |
1197 | purposes without risk of conflict with names in user programs. | |
1198 | ||
1199 | Some additional classes of identifier names are reserved for future | |
1200 | extensions to the C language or the POSIX.1 environment. While using these | |
1201 | names for your own purposes right now might not cause a problem, they do | |
1202 | raise the possibility of conflict with future versions of the C | |
1203 | or POSIX standards, so you should avoid these names. | |
1204 | ||
1205 | @itemize @bullet | |
1206 | @item | |
1207 | Names beginning with a capital @samp{E} followed by a digit or uppercase | |
1208 | letter may be used for additional error code names. @xref{Error | |
1209 | Reporting}. | |
1210 | ||
1211 | @item | |
1212 | Names that begin with either @samp{is} or @samp{to} followed by a | |
1213 | lowercase letter may be used for additional character testing and | |
1214 | conversion functions. @xref{Character Handling}. | |
1215 | ||
1216 | @item | |
1217 | Names that begin with @samp{LC_} followed by an uppercase letter may be | |
1218 | used for additional macros specifying locale attributes. | |
1219 | @xref{Locales}. | |
1220 | ||
1221 | @item | |
1222 | Names of all existing mathematics functions (@pxref{Mathematics}) | |
1223 | suffixed with @samp{f} or @samp{l} are reserved for corresponding | |
1224 | functions that operate on @code{float} and @code{long double} arguments, | |
1225 | respectively. | |
1226 | ||
1227 | @item | |
1228 | Names that begin with @samp{SIG} followed by an uppercase letter are | |
1229 | reserved for additional signal names. @xref{Standard Signals}. | |
1230 | ||
1231 | @item | |
1232 | Names that begin with @samp{SIG_} followed by an uppercase letter are | |
1233 | reserved for additional signal actions. @xref{Basic Signal Handling}. | |
1234 | ||
1235 | @item | |
1236 | Names beginning with @samp{str}, @samp{mem}, or @samp{wcs} followed by a | |
1237 | lowercase letter are reserved for additional string and array functions. | |
1238 | @xref{String and Array Utilities}. | |
1239 | ||
1240 | @item | |
1241 | Names that end with @samp{_t} are reserved for additional type names. | |
1242 | @end itemize | |
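
As a rough illustration of the name classes listed above, the following
sketch checks whether a candidate identifier falls into one of them.
The function names (@code{prefix_then}, @code{upper_or_digit}, and
@code{name_is_reserved_for_future}) are hypothetical and are not part of
the library. The rule for math function suffixes is left out because it
would need a list of the existing math function names, and character
classification is assumed to happen in the default @code{"C"} locale.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: return nonzero if NAME begins with PREFIX and
   the character right after the prefix satisfies CLASSIFY.  */
static int
prefix_then (const char *name, const char *prefix, int (*classify) (int))
{
  size_t len = strlen (prefix);
  return strncmp (name, prefix, len) == 0
         && classify ((unsigned char) name[len]) != 0;
}

/* Hypothetical helper: nonzero for an uppercase letter or a digit.  */
static int
upper_or_digit (int c)
{
  return isupper (c) || isdigit (c);
}

/* Hypothetical checker: return 1 if NAME matches one of the classes
   reserved for future extensions, 0 otherwise.  The math suffix rule
   (f and l) is deliberately not covered.  */
int
name_is_reserved_for_future (const char *name)
{
  size_t len;

  assert (name != NULL);
  len = strlen (name);

  /* Error codes: E followed by a digit or uppercase letter.  */
  if (prefix_then (name, "E", upper_or_digit))
    return 1;
  /* Character testing and conversion: is or to, then lowercase.  */
  if (prefix_then (name, "is", islower) || prefix_then (name, "to", islower))
    return 1;
  /* Locale attributes: LC_ followed by an uppercase letter.  */
  if (prefix_then (name, "LC_", isupper))
    return 1;
  /* Signal names and signal actions.  */
  if (prefix_then (name, "SIG", isupper) || prefix_then (name, "SIG_", isupper))
    return 1;
  /* String and array functions: str, mem or wcs, then lowercase.  */
  if (prefix_then (name, "str", islower)
      || prefix_then (name, "mem", islower)
      || prefix_then (name, "wcs", islower))
    return 1;
  /* Type names: anything ending in _t.  */
  if (len >= 2 && strcmp (name + len - 2, "_t") == 0)
    return 1;

  return 0;
}
```

Under these assumptions, a call such as
@code{name_is_reserved_for_future ("string_length")} reports a conflict,
because the name begins with @samp{str} followed by a lowercase letter,
while a name such as @code{buffer_length} passes. The helper names were
chosen so that they do not themselves fall into any of the reserved
classes.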
1243 | ||
1244 | In addition, some individual header files reserve names beyond | |
1245 | those that they actually define. You only need to worry about these | |
1246 | restrictions if your program includes that particular header file. | |
1247 | ||
1248 | @itemize @bullet | |
1249 | @item | |
1250 | The header file @file{dirent.h} reserves names prefixed with | |
1251 | @samp{d_}. | |
1252 | @pindex dirent.h | |
1253 | ||
1254 | @item | |
1255 | The header file @file{fcntl.h} reserves names prefixed with | |
1256 | @samp{l_}, @samp{F_}, @samp{O_}, and @samp{S_}. | |
1257 | @pindex fcntl.h | |
1258 | ||
1259 | @item | |
1260 | The header file @file{grp.h} reserves names prefixed with @samp{gr_}. | |
1261 | @pindex grp.h | |
1262 | ||
1263 | @item | |
1264 | The header file @file{limits.h} reserves names suffixed with @samp{_MAX}. | |
1265 | @pindex limits.h | |
1266 | ||
1267 | @item | |
1268 | The header file @file{pwd.h} reserves names prefixed with @samp{pw_}. | |
1269 | @pindex pwd.h | |
1270 | ||
1271 | @item | |
1272 | The header file @file{signal.h} reserves names prefixed with @samp{sa_} | |
1273 | and @samp{SA_}. | |
1274 | @pindex signal.h | |
1275 | ||
1276 | @item | |
1277 | The header file @file{sys/stat.h} reserves names prefixed with @samp{st_} | |
1278 | and @samp{S_}. | |
1279 | @pindex sys/stat.h | |
1280 | ||
1281 | @item | |
1282 | The header file @file{sys/times.h} reserves names prefixed with @samp{tms_}. | |
1283 | @pindex sys/times.h | |
1284 | ||
1285 | @item | |
1286 | The header file @file{termios.h} reserves names prefixed with @samp{c_}, | |
1287 | @samp{V}, @samp{I}, @samp{O}, and @samp{TC}; and names prefixed with | |
1288 | @samp{B} followed by a digit. | |
1289 | @pindex termios.h | |
1290 | @end itemize | |
1291 | ||
1292 | @comment Include the section on Creature Nest Macros. | |
1293 | @include creature.texi | |
1294 | ||
1295 | @node Roadmap to the Manual, , Using the Library, Introduction | |
1296 | @section Roadmap to the Manual | |
1297 | ||
1298 | Here is an overview of the contents of the remaining chapters of | |
1299 | this manual. | |
1300 | ||
1301 | @c The chapter overview ordering is: | |
1302 | @c Error Reporting (2) | |
1303 | @c Virtual Memory Allocation and Paging (3) | |
1304 | @c Character Handling (4) | |
1305 | @c String and Array Utilities (5) | |
1306 | @c Character Set Handling (6) | |
1307 | @c Locales and Internationalization (7) | |
1308 | @c Searching and Sorting (9) | |
1309 | @c Pattern Matching (10) | |
1310 | @c Input/Output Overview (11) | |
1311 | @c Input/Output on Streams (12) | |
1312 | @c Low-level Input/Output (13) | |
1313 | @c File System Interface (14) | |
1314 | @c Pipes and FIFOs (15) | |
1315 | @c Sockets (16) | |
1316 | @c Low-Level Terminal Interface (17) | |
1317 | @c Syslog (18) | |
1318 | @c Mathematics (19) | |
1319 | @c Arithmetic Functions (20) | |
1320 | @c Date and Time (21) | |
1321 | @c Non-Local Exits (23) | |
1322 | @c Signal Handling (24) | |
1323 | @c The Basic Program/System Interface (25) | |
1324 | @c Processes (26) | |
1325 | @c Job Control (28) | |
1326 | @c System Databases and Name Service Switch (29) | |
1327 | @c Users and Groups (30) -- References `User Database' and `Group Database' | |
1328 | @c System Management (31) | |
1329 | @c System Configuration Parameters (32) | |
1330 | @c C Language Facilities in the Library (AA) | |
1331 | @c Summary of Library Facilities (AB) | |
1332 | @c Installing (AC) | |
1333 | @c Library Maintenance (AD) | |
1334 | ||
1335 | @c The following chapters need overview text to be added: | |
1336 | @c Message Translation (8) | |
1337 | @c Resource Usage And Limitations (22) | |
1338 | @c Inter-Process Communication (27) | |
1339 | @c Debugging support (34) | |
1340 | @c POSIX Threads (35) | |
1341 | @c Internal Probes (36) | |
1342 | @c Platform-specific facilities (AE) | |
1343 | @c Contributors to (AF) | |
1344 | @c Free Software Needs Free Documentation (AG) | |
1345 | @c GNU Lesser General Public License (AH) | |
1346 | @c GNU Free Documentation License (AI) | |
1347 | ||
1348 | @itemize @bullet | |
1349 | @item | |
1350 | @ref{Error Reporting}, describes how errors detected by the library | |
1351 | are reported. | |
1352 | ||
1353 | ||
1354 | @item | |
1355 | @ref{Memory}, describes @theglibc{}'s facilities for managing and | |
1356 | using virtual and real memory, including dynamic allocation of virtual | |
1357 | memory. If you do not know in advance how much memory your program | |
1358 | needs, you can allocate it dynamically instead, and manipulate it via | |
1359 | pointers. | |
1360 | ||
1361 | @item | |
1362 | @ref{Character Handling}, contains information about character | |
1363 | classification functions (such as @code{isspace}) and functions for | |
1364 | performing case conversion. | |
1365 | ||
1366 | @item | |
1367 | @ref{String and Array Utilities}, has descriptions of functions for | |
1368 | manipulating strings (null-terminated character arrays) and general | |
1369 | byte arrays, including operations such as copying and comparison. | |
1370 | ||
1371 | @item | |
1372 | @ref{Character Set Handling}, contains information about manipulating | |
1373 | characters and strings using character sets larger than will fit in | |
1374 | the usual @code{char} data type. | |
1375 | ||
1376 | @item | |
1377 | @ref{Locales}, describes how selecting a particular country | |
1378 | or language affects the behavior of the library. For example, the locale | |
1379 | affects collation sequences for strings and how monetary values are | |
1380 | formatted. | |
1381 | ||
1382 | @item | |
1383 | @ref{Searching and Sorting}, contains information about functions | |
1384 | for searching and sorting arrays. You can use these functions on any | |
1385 | kind of array by providing an appropriate comparison function. | |
1386 | ||
1387 | @item | |
1388 | @ref{Pattern Matching}, presents functions for matching regular expressions | |
1389 | and shell file name patterns, and for expanding words as the shell does. | |
1390 | ||
1391 | @item | |
1392 | @ref{I/O Overview}, gives an overall look at the input and output | |
1393 | facilities in the library, and contains information about basic concepts | |
1394 | such as file names. | |
1395 | ||
1396 | @item | |
1397 | @ref{I/O on Streams}, describes I/O operations involving streams (or | |
1398 | @w{@code{FILE *}} objects). These are the normal C library functions | |
1399 | from @file{stdio.h}. | |
1400 | ||
1401 | @item | |
1402 | @ref{Low-Level I/O}, contains information about I/O operations | |
1403 | on file descriptors. File descriptors are a lower-level mechanism | |
1404 | specific to the Unix family of operating systems. | |
1405 | ||
1406 | @item | |
1407 | @ref{File System Interface}, has descriptions of operations on entire | |
1408 | files, such as functions for deleting and renaming them and for creating | |
1409 | new directories. This chapter also contains information about how you | |
1410 | can access the attributes of a file, such as its owner and file protection | |
1411 | modes. | |
1412 | ||
1413 | @item | |
1414 | @ref{Pipes and FIFOs}, contains information about simple interprocess | |
1415 | communication mechanisms. Pipes allow communication between two related | |
1416 | processes (such as between a parent and child), while FIFOs allow | |
1417 | communication between processes sharing a common file system on the same | |
1418 | machine. | |
1419 | ||
1420 | @item | |
1421 | @ref{Sockets}, describes a more complicated interprocess communication | |
1422 | mechanism that allows processes running on different machines to | |
1423 | communicate over a network. This chapter also contains information about | |
1424 | Internet host addressing and how to use the system network databases. | |
1425 | ||
1426 | @item | |
1427 | @ref{Low-Level Terminal Interface}, describes how you can change the | |
1428 | attributes of a terminal device. If you want to disable echo of | |
1429 | characters typed by the user, for example, read this chapter. | |
1430 | ||
1431 | @item | |
1432 | @ref{Mathematics}, contains information about the math library | |
1433 | functions. These include things like random-number generators and | |
1434 | remainder functions on integers as well as the usual trigonometric and | |
1435 | exponential functions on floating-point numbers. | |
1436 | ||
1437 | @item | |
1438 | @ref{Arithmetic,, Low-Level Arithmetic Functions}, describes functions | |
1439 | for simple arithmetic, analysis of floating-point values, and reading | |
1440 | numbers from strings. | |
1441 | ||
1442 | @item | |
1443 | @ref{Date and Time}, describes functions for measuring both calendar time | |
1444 | and CPU time, as well as functions for setting alarms and timers. | |
1445 | ||
1446 | @item | |
1447 | @ref{Non-Local Exits}, contains descriptions of the @code{setjmp} and | |
1448 | @code{longjmp} functions. These functions provide a facility for | |
1449 | @code{goto}-like jumps which can jump from one function to another. | |
1450 | ||
1451 | @item | |
1452 | @ref{Signal Handling}, tells you all about signals---what they are, | |
1453 | how to establish a handler that is called when a particular kind of | |
1454 | signal is delivered, and how to prevent signals from arriving during | |
1455 | critical sections of your program. | |
1456 | ||
1457 | @item | |
1458 | @ref{Program Basics}, tells how your programs can access their | |
1459 | command-line arguments and environment variables. | |
1460 | ||
1461 | @item | |
1462 | @ref{Processes}, contains information about how to start new processes | |
1463 | and run programs. | |
1464 | ||
1465 | @item | |
1466 | @ref{Job Control}, describes functions for manipulating process groups | |
1467 | and the controlling terminal. This material is probably only of | |
1468 | interest if you are writing a shell or other program which handles job | |
1469 | control specially. | |
1470 | ||
1471 | @item | |
1472 | @ref{Name Service Switch}, describes the services which are available | |
1473 | for looking up names in the system databases, how to determine which | |
1474 | service is used for which database, and how these services are | |
1475 | implemented so that contributors can design their own services. | |
1476 | ||
1477 | @item | |
1478 | @ref{User Database}, and @ref{Group Database}, tell you how to access | |
1479 | the system user and group databases. | |
1480 | ||
1481 | @item | |
1482 | @ref{System Management}, describes functions for controlling and getting | |
1483 | information about the hardware and software configuration your program | |
1484 | is executing under. | |
1485 | ||
1486 | @item | |
1487 | @ref{System Configuration}, tells you how you can get information about | |
1488 | various operating system limits. Most of these parameters are provided for | |
1489 | compatibility with POSIX. | |
1490 | ||
1491 | @item | |
1492 | @ref{Language Features}, contains information about library support for | |
1493 | standard parts of the C language, including things like the @code{sizeof} | |
1494 | operator and the symbolic constant @code{NULL}, how to write functions | |
1495 | accepting variable numbers of arguments, and constants describing the | |
1496 | ranges and other properties of the numerical types. There is also a simple | |
1497 | debugging mechanism which allows you to put assertions in your code, and | |
1498 | have diagnostic messages printed if the tests fail. | |
1499 | ||
1500 | @item | |
1501 | @ref{Library Summary}, gives a summary of all the functions, variables, and | |
1502 | macros in the library, with complete data types and function prototypes, | |
1503 | and says what standard or system each is derived from. | |
1504 | ||
1505 | @item | |
1506 | @ref{Installation}, explains how to build and install @theglibc{} on | |
1507 | your system, and how to report any bugs you might find. | |
1508 | ||
1509 | @item | |
1510 | @ref{Maintenance}, explains how to add new functions or port the | |
1511 | library to a new system. | |
1512 | @end itemize | |
1513 | ||
1514 | If you already know the name of the facility you are interested in, you | |
1515 | can look it up in @ref{Library Summary}. This gives you a summary of | |
1516 | its syntax and a pointer to where you can find a more detailed | |
1517 | description. This appendix is particularly useful if you just want to | |
1518 | verify the order and type of arguments to a function, for example. It | |
1519 | also tells you what standard or system each function, variable, or macro | |
1520 | is derived from. |