Differences between revisions 27 and 28
Revision 27 as of 2013-10-09 14:51:05
Size: 12069
Comment:
Revision 28 as of 2013-10-09 14:53:33
Size: 10927
Comment:
Deletions are marked like this. Additions are marked like this.
Line 241: Line 241:

---- /!\ '''Edit conflict - other version:''' ----
Line 245: Line 243:
---- /!\ '''Edit conflict - your version:''' ----
If you're going to check for NULL pointer arguments where you have not entered into a contract to accept and interpret them, do so with an assert, not a conditional error return. This way the bugs in the caller will be immediately detected and can be fixed, and it makes it easy to disable the overhead in production builds. The assert can be valuable as code documentation. However, a segfault from dereferencing the NULL pointer is just as effective for debugging. If you return an error code to a caller which has already proven itself buggy, the most likely result is that the caller will ignore the error, and bad things will happen much later down the line when the original cause of the error has become difficult or impossible to track down. Why is it reasonable to assume the caller will ignore the error you return? Because the caller already ignored the error return of malloc or fopen or some other library-specific allocation function which returned NULL to indicate an error.

---- /!\ '''End of edit conflict''' ----

Glibc Coding Style and Conventions

This document is not meant to be a reiteration of the GNU coding style document (see here). It is a clarification when the GLIBC policy differs or expands upon the GNU standard.

This is a work in progress and not yet definitive.


1. Code Formatting

1.1. Symbols and Parenthesis

When invoking functions make sure there is a space between the symbol and the parenthesis, e.g.,

retval = foo (bar);

This rule does not apply to preprocessor function definitions where having a space between the symbol and the parenthesis would cause improper expansion, for example, the following is correct:

#define BAT(x) do { \
  bat = FOO (x);      \
} while (0)

Whereas, the following is incorrect:

#define BAT (x) do { \
  bat = FOO (x);      \
} while (0)


1.2. 79-Column Lines

All source files in glibc must use lines of fewer than 80 characters. The only exceptions are when it's syntactically impossible to split a line for some reason.


1.3. Nested C Preprocessor Directives

Nested preprocessor directives need spaces after the '#'.

Example 1: One level of nesting

#if __FP_FAST_FMA
# define FP_FAST_FMA 1
#endif

#if __FP_FAST_FMAF
# define FP_FAST_FMAF 1
#endif

#if __FP_FAST_FMAL
# define FP_FAST_FMAL 1
#endif

Reference: http://sourceware.org/ml/libc-alpha/2010-10/msg00024.html

Example 2: Several levels of nesting

#ifdef HAVE_ASM_GLOBAL_DOT_NAME
# ifndef C_SYMBOL_DOT_NAME
#  if defined __GNUC__ && defined __GNUC_MINOR__ \
      && (__GNUC__ << 16) + __GNUC_MINOR__ >= (3 << 16) + 1
#   define C_SYMBOL_DOT_NAME(name) .name
#  else
#   define C_SYMBOL_DOT_NAME(name) .##name
#  endif
# endif
#endif

Note that in a header file, the outer #ifndef _FILE_H/#endif pair does not increase the indentation level.

Example 3: Outer #ifndef

#ifndef _FILE_H
#if FOO
# define BAR
#endif
#endif

1.4. Files not formatted according to the GNU standard

Some files (e.g. malloc/arena.c) and have a different, consistent coding style since the origin of the file inside glibc due to being imported from a different project or source. The rule for such files is to stick to the code formatting convention in that file.

Reference: http://sourceware.org/ml/libc-alpha/2012-08/msg00182.html


2. Use of GCC Compiler Attributes

2.1. inline

We should eschew the inline keyword entirely when used alone. If it really matters that something be inlined, it needs always_inline. Otherwise we should leave optimization decisions to the compiler unless there is a particular strong reason in an individual case. Any such cases should have clear comments saying why the explicit inline is desireable.

2.2. __unused__

Use __attribute__ ((__unused__)) with static inline


3. Creating files

  • Don't create empty files
  • Find new versions of the copyright headers to use as a template.
  • Make sure the top line is descriptive.
  • "Contributed by" statements are no longer used.

3.1. Proper sysdeps Location

3.1.1. Default ENOTSUP Implementation Location

3.1.2. OS Specific Implementation

sysdeps/unix/sysv/linux/<foo>.[ch]

3.1.3. OS and Platform Specific Implementation

sysdeps/unix/sysv/linux/powerpc/<foo>.[ch]

3.1.4. Wordsize Specific Implementation

sysdeps/unix/sysv/linux/powerpc/powerpc[32|64]/<foo>.[ch]

3.1.5. Platform Specific Implementation

sysdeps/powerpc/<foo>.[ch]

3.1.6. Floating-Point Unit Implementation

sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/<foo>.[ch] sysdeps/powerpc/powerpc32/fpu/<foo>.[ch]


4. Reusing Existing Code

When possible pick up existing code via #include directives rather than copying code. For whole files, this may also be done automatically via Implies files.

We strive to reduce the number of duplicate copies of code, for example by consolidating all copies of an architecture-specific sysdeps/unix/sysv/linux/<arch> header into an architecture-independent one plus a set of small architecture-specific ones for the architecture-specific bits.


5. Macros vs. Static Inlines

Static inline functions are preferred over macros, when possible, because the compiler can more adequately schedule static inlines.


6. Header Files

bits/<foo>.h not a place for an API, just for OS specific definitions.


7. Alloca vs. Malloc

Here are some things to consider when deciding whether to use alloca or malloc:

  • Do not use alloca to create an array whose size S is such that ! libc_use_alloca (S), as large arrays like that may bypass stack-overflow checking.

  • If the storage may need to outlive the current function, then obviously alloca cannot be used.

  • If the API does not allow returning a memory-allocation failure indication such as ENOMEM, then alloca may be preferable, as malloc can fail.

  • If this is a hot path with a small allocation, prefer alloca, as it is typically much faster.

  • When growing a buffer, either on the stack or on the heap, watch out for integer overflow when calculating the new size. Such overflow should be treated as allocation failure than letting the integer wrap around.
  • If the size of the buffer is directly or indirectly under user control, consider imposing a maximum to help make denial-of-service attacks more difficult.
  • If this is a hot path and the allocation size is typically small but may be large, and is known in advance, you can use the following pattern:

    bool use_alloca = __libc_use_alloca (bufsize);
    struct foo *buf = use_alloca ? alloca (bufsize) : malloc (bufsize);
    if (buf)
      do_work_with (buf, bufsize);
    if (! use_alloca)
      free (buf);
  • Use of alloca is a memory optimization compared to having a local array on stack. That is, the above example is close in behavior to the following, except that the alloca version consumes only the stack space needed, rather than always consuming approximately 4000 bytes on the stack.

    struct foo buffer[4000 / sizeof (struct foo)];
    struct foo *buf = bufsize <= sizeof buffer ? buffer : malloc (bufsize);
    if (buf)
      do_work_with (buf, bufsize);
    if (buf != buffer)
      free (buf);
  • If the amount of storage is not known in advance but may grow without bound, you can start with a small buffer on the stack and switch to malloc if it gets to be too large for the stack. While the storage is on the stack, you can grow it by using extend_alloca. For example:

    struct foo buffer[10];
    struct foo *buf = buffer;
    size_t bufsize = sizeof buffer;
    void *allocated = NULL;
    size_t needed;
    while (bufsize < (needed = do_work_with (buf, bufsize)))
      {
        if (__libc_use_alloca (needed))
          {
            size_t size = bufsize;
            void *newbuf = extend_alloca (buf, bufsize, needed);
            buf = memmove (newbuf, buf, size);
          }
        else
          {
            void *newbuf = realloc (allocated, needed);
            if (! newbuf)
              {
                needed = 0;
                break;
              }
            if (! allocated)
              memcpy (newbuf, buf, bufsize);
            buf = allocated = newbuf;
            bufsize = needed;
          }
      }
    free (allocated);
    return needed; /* This is zero on allocation failure.  */
  • To boost performance a bit in the typical case of the above examples, you can use __glibc_likely or __glibc_unlikely, e.g., if (__glibc_likely (use_alloca)) instead of just if (use_alloca).

At present there is no magic bullet of special procedure for selecting alloca vs. malloc; if there was then we could encode it into this wiki or into a macro.


8. Branch Prediction

glibc has the __glibc_likely and __glibc_unlikely macros that wrap around __builtin_expect. Use those instead of using __builtin_expect for branch prediction since they're nicer to read.

9. Invalid pointers

The GNU C library considers it a QoI feature not to mask user bugs by detecting invalid pointers and returning EINVAL (unless the API is standardized and says it does that). If passing a bad pointer has undefined behavior, it is far more useful in the long run if it crashes quickly rather than diagnosing an error that is probably ignored by the flaky caller.

9.1. NULL pointers

If you're going to check for NULL pointer arguments where you have not entered into a contract to accept and interpret them, do so with an assert, not a conditional error return. This way the bugs in the caller will be immediately detected and can be fixed, and it makes it easy to disable the overhead in production builds. The assert can be valuable as code documentation. However, a segfault from dereferencing the NULL pointer is just as effective for debugging. If you return an error code to a caller which has already proven itself buggy, the most likely result is that the caller will ignore the error, and bad things will happen much later down the line when the original cause of the error has become difficult or impossible to track down. Why is it reasonable to assume the caller will ignore the error you return? Because the caller already ignored the error return of malloc or fopen or some other library-specific allocation function which returned NULL to indicate an error.

In summary:

  • If you have no contract to accept NULL and you don't immediately dereference the pointer then use an assert to raise an error when NULL is passed as an invalid argument.
  • If you have no contract to accept NULL and immediately dereference the pointer then the segfault is sufficient to indicate the error.
  • If you have a contract to accept NULL then do so.

10. Assertions

Assertions are for internal consistency checking only.

External conditions are governed by the API and if user code violates the API then the library behaviour is undefined.

However, in scenarios where user input is recorded into internal structures for later use it is useful to assert in these cases to catch the first occurrence of the error.

None: Style_and_Conventions (last edited 2014-09-12 18:11:49 by CarlosODonell)