Tuning Library Runtime Behavior

WORK IN PROGRESS

The following material is a work in progress and should not be considered complete or ready for public use.

1. Why?

No set of library defaults is appropriate for all workloads.

The GNU C Library makes assumptions on behalf of the user and provides a specific runtime behaviour that may not match the user workload or requirements.

For example the NPTL implementation sets a fixed cache size of 40MB for the re-use of thread stacks. Is it possible that this is correct under all workloads? Average workloads? This default was set 10 years ago and has not been revisited.

I propose we expose some of the library internals as tunable runtime parameters that our users and developers can use to tune the library. Developers would use them to achieve optimal mean performance for all users, while a single advanced user might use them to get the best performance from their application.

To reiterate:

  • Advanced users can do their own performance measurements and work with the community to discuss what works and doesn't work on certain workloads or hardware configurations.
  • Developers can use the knobs to test ideas, or experiment with dynamic tuning and ensure that average case performance of the default parameters works for a broad audience.
  • Normal users accept the defaults and those defaults work well.

We have immediate short-term needs today to expose library internals as tunable parameters, in particular:

  • When and if to use PI-aware locks for the library internals.
  • Default thread stack sizes.
  • Lock elision parameters for performance testing.
  • Size of thread stack cache, both maximums, minimums, and defaults.
  • XDR max request size. Limited to 1024 bytes for legacy servers, but Linux imposes no such limit. You could have a huge group map and it should work. Unfortunately large XDR requests can consume large amounts of memory on the server, so it's up to the admin to select a reasonable value. The library can enforce a maximum, but eventually that will not be enough for certain uses.
  • Memory allocator, malloc() et al., behaviour.
  • Dynamic loader behaviour.
  • NSCD group order behaviour e.g. default gid listed first like other UNIX? Fastest order? Sorted order? (https://bugzilla.redhat.com/show_bug.cgi?id=959980)

  • User selectable amount of static TLS to reserve for dlopen'd modules that could then use this static TLS for optimal access (http://sourceware.org/ml/libc-alpha/2013-05/msg01088.html)

  • User selectable buffering schemes for stdio (http://sourceware.org/bugzilla/show_bug.cgi?id=4099).

  • Initial size of group list for initgroups.
  • Disable RFC 3484 IPv4 address sorting for legacy applications.
  • Size of buffer reads in stream implementation. When using NFS with very large block sizes, say 1MB, the glibc stream implementation will buffer using those block sizes, and this leads to huge latencies in buffer fills. It would be better to be able to tune this manually per stream. Perhaps the best option is to have a "max buffer size" tunable that is queried when creating the stream and used as the upper limit regardless of the filesystem block size.
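
The stream buffer item can already be approximated per stream by an application today. The following is a minimal sketch, not a proposed glibc interface: `capped_stream`, `capped_open`, `capped_close` and the cap value are hypothetical names chosen for illustration. It clamps the filesystem's preferred block size to a caller-supplied maximum and installs a buffer of that size with setvbuf.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical wrapper: the buffer must outlive the stream, so keep
   both together.  */
struct capped_stream
{
  FILE *fp;
  char *buffer;
};

/* Pick the buffer size: the filesystem block size (or BUFSIZ when it is
   unknown), never larger than MAX_SIZE.  */
static size_t
capped_buffer_size (size_t fs_block_size, size_t max_size)
{
  assert (max_size > 0);
  size_t wanted = fs_block_size > 0 ? fs_block_size : (size_t) BUFSIZ;
  return wanted < max_size ? wanted : max_size;
}

/* Open PATH for reading with a buffer of at most MAX_SIZE bytes.
   Returns 0 on success, -1 on failure.  */
static int
capped_open (struct capped_stream *s, const char *path, size_t max_size)
{
  s->buffer = NULL;
  s->fp = fopen (path, "r");
  if (s->fp == NULL)
    return -1;

  struct stat st;
  size_t block = 0;
  if (fstat (fileno (s->fp), &st) == 0 && st.st_blksize > 0)
    block = (size_t) st.st_blksize;

  /* Supply the buffer explicitly; an implementation may ignore the size
     argument when the buffer pointer is NULL.  setvbuf must be called
     before any other operation on the stream.  */
  size_t size = capped_buffer_size (block, max_size);
  s->buffer = malloc (size);
  if (s->buffer == NULL || setvbuf (s->fp, s->buffer, _IOFBF, size) != 0)
    {
      fclose (s->fp);
      free (s->buffer);
      s->fp = NULL;
      s->buffer = NULL;
      return -1;
    }
  return 0;
}

/* Close the stream first, then release the buffer it was using.  */
static void
capped_close (struct capped_stream *s)
{
  if (s->fp != NULL)
    fclose (s->fp);
  free (s->buffer);
  s->fp = NULL;
  s->buffer = NULL;
}
```

With a 1MB NFS block size and a 64KB cap, `capped_buffer_size` yields 64KB, which is the behaviour the "max buffer size" item asks the library to provide by default.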

2. How?

  • Tunables are a tradeoff.
    • If it is clear which choice is best, adding a tunable is a mistake.
  • Tunables never make the implementation non-conforming.
    • Variables or other tunables should merely transform the library from one conforming implementation to a different conforming implementation. No settings should make it non-conforming.
  • Tunables whose non-default values could break an application expecting the default values should be ignored for AT_SECURE.
    • Any settings which could cause a conforming application which works correctly with the default settings to stop working correctly should be ignored completely when the program is suid or AT_SECURE is set in the aux vector.
  • Tunable namespace should be clearly defined.
    • The namespace for glibc tuning variables should be clearly defined in such a way that they can be mechanically removed from the environment without having to worry that future additions will be missed by the stripping code.
  • Tunables never change semantics.
    • Changing a tunable must never cause the semantics of any library interface to violate the standard the library implements. The tunable adjusts internal implementation details all within the guiding envelope of the standard that defines the function. The tunable might lessen the promise of a function but only if that lessening is still within the bounds of the standard.

  • Tunables are thread safe.
    • Setting the tunables shall be thread safe.
  • Declare the tunables stable only in a given release e.g. 2.17.
    • The tunables expose internal implementation details of the library and should not be considered a stable ABI. The library must be able to evolve internal implementation from release to release.
  • Define tunable settings in terms of a "context."
    • Each change to a tunable matters only in the context of the tunable's use. For example, the global context would set a tunable for every use of that tunable in the process, while a function-level context might set a default for all functions called from the current function e.g. lock elision.
  • Allow the use of environment variables to set tunables.
    • Easy for programmer experimentation. Shall be thread safe. Read only once at process startup, so changing any of the env vars that control runtime tuning has no effect on the currently executing process. An application with AT_SECURE set will ignore all environment variable tunables and will not pass them automatically to its children (that doesn't preclude the AT_SECURE application setting an env var for the child or using the API to tune performance for itself).
  • Create a stable API for manipulating tunable runtime parameters.
    • Easy for automation. The API must provide a way for tunables to be reset to default values (used before forking a new process or execing).
  • Provide a shared-memory API for tuning.
    • Allows for performance experiments and the developing of auto-tuning algorithms on live running programs.
  • Debugging
    • Provide a way to dump all of the tunables for debugging. Provide a way to easily inspect all the tunable values from a debugger, or reset all tunables directly from the debugger e.g. inferior function call.
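
The namespace and AT_SECURE rules above can be sketched as follows. This is only an illustration under stated assumptions: the `GNU_LIBC_` prefix is taken from the env var examples later in this document and is not an existing glibc namespace, and `tunables_env_allowed` and `strip_tunables` are hypothetical helper names. getauxval and AT_SECURE are the existing Linux/glibc interfaces from <sys/auxv.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>

/* Assumed prefix for every tunable env var (not an existing glibc
   namespace).  */
#define TUNABLE_PREFIX "GNU_LIBC_"

/* Env var tunables are honoured only when the kernel did not mark the
   process as secure (e.g. setuid).  getauxval returns 0 when the entry
   is absent.  */
static bool
tunables_env_allowed (void)
{
  return getauxval (AT_SECURE) == 0;
}

/* Remove every entry in the tunable namespace from the NULL-terminated
   vector ENVP, in place, keeping the order of the remaining entries.
   Because the test is a pure prefix match, tunables added in future
   releases are stripped without changing this code.  Returns the number
   of entries removed.  */
static size_t
strip_tunables (char **envp)
{
  assert (envp != NULL);
  const size_t prefix_len = strlen (TUNABLE_PREFIX);
  size_t removed = 0;
  char **out = envp;

  for (char **in = envp; *in != NULL; ++in)
    {
      if (strncmp (*in, TUNABLE_PREFIX, prefix_len) == 0)
        ++removed;
      else
        *out++ = *in;
    }
  *out = NULL;
  return removed;
}
```

A secure program would skip env var parsing when `tunables_env_allowed ()` is false and call `strip_tunables` on the environment it hands to its children.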

3. Design examples

3.1. Example: Some properties read at startup others continually via a global pointer

The only feasible design today is to create a global pointer that points to a structure that contains all tunables for the entire library. At startup certain values of this structure are used for IFUNC selection and to initialize library-wide values that need early initialization. Later, some values which can be dynamically changed may also be read via this global pointer e.g. default thread stack size. We document each property and whether it is applied at startup or read at every use. Startup properties could only be set via env vars or an admin sysconfig file read at startup.
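
A minimal sketch of this design, assuming a two-field structure and decimal byte counts as values. The structure, field and function names are hypothetical; the env var names are the ones used in the next example, the 40MB value is the stack cache default quoted earlier, and the 2MB stack size is only a placeholder.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* All tunables for the library live in one structure.  */
struct tunables
{
  size_t default_stack_size;   /* Read at every use.  */
  size_t stack_cache_size;     /* Read at every use.  */
};

static struct tunables tunables_storage =
  { (size_t) 2 << 20, (size_t) 40 << 20 };

/* The single global pointer the rest of the library would read.  */
static struct tunables *global_tunables = &tunables_storage;

/* Parse TEXT as a decimal size.  On any error return false and leave
   *OUT untouched, so a bad setting falls back to the default.  */
static bool
parse_size (const char *text, size_t *out)
{
  assert (out != NULL);
  /* Require a leading digit: strtoull would otherwise accept
     whitespace and a sign.  */
  if (text == NULL || !isdigit ((unsigned char) *text))
    return false;
  errno = 0;
  char *end;
  unsigned long long v = strtoull (text, &end, 10);
  if (errno != 0 || *end != '\0' || v > SIZE_MAX)
    return false;
  *out = (size_t) v;
  return true;
}

/* Apply the serialized settings; NULL or invalid text keeps the
   default.  Split out from tunables_init so it can be exercised
   without touching the environment.  */
static void
tunables_init_from (const char *stack_text, const char *cache_text)
{
  assert (global_tunables != NULL);
  parse_size (stack_text, &global_tunables->default_stack_size);
  parse_size (cache_text, &global_tunables->stack_cache_size);
}

/* Called once at startup; later changes to the environment are not
   seen.  */
static void
tunables_init (void)
{
  tunables_init_from (getenv ("GNU_LIBC_PTHREAD_DEFAULT_STACKSIZE"),
                      getenv ("GNU_LIBC_PTHREAD_STACK_CACHESIZE"));
}
```

Library code would then read `global_tunables->default_stack_size` at each use, while startup-only properties would be consumed once inside the init function.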

3.2. Example: Fully dynamic properties via a global pointer

This is only a toy example of how one might use a global pointer, and a lockless algorithm, to push and pop tunable contexts for the entire library to use. The entire library would need to reference tunables via some levels of indirection through the global pointer (the previous example referenced the global pointer directly).

For example:

/* A definition of a tunable is a name/value tuple (for now).  */
struct __tunable {
  char *tunable;
  char *value;
};
typedef struct __tunable tunable;

/* The tunables have IDs that we use to index into the tunable table
   for each context.  */
enum {
  GNU_LIBC_PTHREAD_DEFAULT_STACKSIZE = 0,
  GNU_LIBC_PTHREAD_STACK_CACHESIZE = 1,
  ...
  GNU_LIBC_MAX_TUNABLE = 100
};

/* A context contains a set of tunables.  */
struct __tunable_context {
  char *id;
  tunable tlist[GNU_LIBC_MAX_TUNABLE];
  struct __tunable_context *previous;
};
typedef struct __tunable_context tunable_context;

/* Hidden pointer to active context in the library.  */
tunable_context *__default_tunable_context attribute_hidden;

/* Create a context from the current active context and call it ID.  */
tunable_context *create_tunable_context_np (const char *id);
int destroy_tunable_context_np (tunable_context *context);

/* Set a tunable for a context.  */
int set_tunable_np (tunable_context *context, const char *tunable, const char *value);
const char *get_tunable_np (tunable_context *context, const char *tunable);

/* Push or pop a context. Overrides the previous context.  */
int push_tunable_context_np (tunable_context *context);
tunable_context *pop_tunable_context_np (void);

/* Get the list of all tunables currently available.  */
int list_tunables_np (char **tunables, int *size);

e.g.

tunable_context *ctx = create_tunable_context_np (NULL);
if (set_tunable_np (ctx, "GNU_LIBC_PTHREAD_DEFAULT_STACKSIZE", "1048576") != 0)
  {
    /* Error handling.  */
  }
if (push_tunable_context_np (ctx) != 0)
  {
    /* Error handling.  */
  }
/* Do work with context active.  */
if (pop_tunable_context_np () == NULL)
  {
    /* Error handling.  */
  }
/* Restores previous context.  */
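
To make the push/pop semantics concrete, here is a self-contained toy version. It is a sketch under several assumptions: every `toy_` name is hypothetical, only two tunables exist, values are borrowed strings, and it is single-threaded (it does not attempt the lockless algorithm mentioned above).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { TOY_STACKSIZE, TOY_CACHESIZE, TOY_MAX_TUNABLE };

static const char *const toy_names[TOY_MAX_TUNABLE] =
  {
    "GNU_LIBC_PTHREAD_DEFAULT_STACKSIZE",
    "GNU_LIBC_PTHREAD_STACK_CACHESIZE"
  };

struct toy_context
{
  const char *values[TOY_MAX_TUNABLE];   /* NULL means "use default".  */
  struct toy_context *previous;          /* Set while pushed.  */
};

static struct toy_context toy_root;                /* All defaults.  */
static struct toy_context *toy_active = &toy_root;

/* Map a tunable name to its table index, or -1 if unknown.  */
static int
toy_index (const char *name)
{
  if (name != NULL)
    for (int i = 0; i < TOY_MAX_TUNABLE; ++i)
      if (strcmp (name, toy_names[i]) == 0)
        return i;
  return -1;
}

/* Create a context that starts as a copy of the active one.  */
static struct toy_context *
toy_create (void)
{
  struct toy_context *ctx = malloc (sizeof *ctx);
  if (ctx != NULL)
    {
      memcpy (ctx->values, toy_active->values, sizeof ctx->values);
      ctx->previous = NULL;
    }
  return ctx;
}

/* Set a tunable in CTX.  VALUE is borrowed, not copied.  */
static int
toy_set (struct toy_context *ctx, const char *name, const char *value)
{
  int i = toy_index (name);
  if (ctx == NULL || i < 0)
    return -1;
  ctx->values[i] = value;
  return 0;
}

/* Read a tunable from the active context.  */
static const char *
toy_get (const char *name)
{
  int i = toy_index (name);
  return i < 0 ? NULL : toy_active->values[i];
}

/* Make CTX the active context, remembering the one it overrides.  */
static int
toy_push (struct toy_context *ctx)
{
  if (ctx == NULL || ctx == &toy_root || ctx->previous != NULL)
    return -1;
  ctx->previous = toy_active;
  toy_active = ctx;
  return 0;
}

/* Restore the previous context and return the one that was active, or
   NULL if only the root context is left.  */
static struct toy_context *
toy_pop (void)
{
  assert (toy_active != NULL);
  struct toy_context *ctx = toy_active;
  if (ctx->previous == NULL)
    return NULL;
  toy_active = ctx->previous;
  ctx->previous = NULL;
  return ctx;
}
```

After a push, reads through `toy_get` see the new value; after the matching pop they see the earlier one again, which is the "Restores previous context" behaviour in the usage above.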

Per-process as an env var:

export GNU_LIBC_$tunable=$value

Equivalent to calling the following at startup:

tunable_context *ctx = create_tunable_context_np (NULL);
set_tunable_np (ctx, "GNU_LIBC_$tunable", "$value");
push_tunable_context_np (ctx);

Per-named-context as an env var:

export GNU_LIBC_${tunable}_${id}=$value

Equivalent to calling the following at startup:

tunable_context *ctx = create_tunable_context_np ("$id");
set_tunable_np (ctx, "GNU_LIBC_$tunable", "$value");
push_tunable_context_np (ctx);

Where:

  • `id' is a user chosen identifier for the context.
  • `tunable' is the serialized name for the tunable.
  • `value' is the serialized value of the tunable which will be interpreted by the tunable code as required.
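
For the per-process form, splitting one environment entry into its serialized name and value might look like the sketch below. `split_tunable` is a hypothetical helper, and it handles only the `GNU_LIBC_$tunable=$value` form; it does not try to separate a trailing context id from the tunable name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TUNABLE_PREFIX "GNU_LIBC_"

/* Split ENTRY, of the form "GNU_LIBC_<tunable>=<value>", into its two
   halves.  The full name (prefix included) is copied into NAME, which
   has room for NAME_SIZE bytes; *VALUE is set to point at the value
   inside ENTRY.  Returns false if ENTRY is not in the tunable
   namespace, has an empty tunable name, has no '=', or does not fit.  */
static bool
split_tunable (const char *entry, char *name, size_t name_size,
               const char **value)
{
  assert (entry != NULL && name != NULL && value != NULL);
  const size_t prefix_len = strlen (TUNABLE_PREFIX);

  if (strncmp (entry, TUNABLE_PREFIX, prefix_len) != 0)
    return false;

  const char *eq = strchr (entry, '=');
  if (eq == NULL)
    return false;

  size_t len = (size_t) (eq - entry);
  if (len == prefix_len || len >= name_size)
    return false;

  memcpy (name, entry, len);
  name[len] = '\0';
  *value = eq + 1;   /* Interpreted later by the tunable's own code.  */
  return true;
}
```

The value is left as text on purpose, matching the note above that it "will be interpreted by the tunable code as required".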

Notes:

  • A shared memory interface would allow you to attach to a program and manipulate the runtime settings in realtime.

4. Next steps

4.1. Collect all globals

As recommended at Cauldron 2013, we first need to bring together a global internal private structure that contains all of the globals one might want to modify. That way we can see what is actually tunable.

4.2. Analyze env vars currently in use

Analysis of the current use of glibc env vars. The list is not yet complete and currently includes env vars from auxiliary libraries.

  • ARGP_HELP_FMT
  • LANG
  • LD_BIND_NOW
  • LD_LIBRARY_PATH
  • LD_PRELOAD
  • LD_TRACE_LOADED_OBJECTS
  • LD_AOUT_LIBRARY_PATH
  • LD_AOUT_PRELOAD
  • LD_AUDIT
  • LD_BIND_NOT
  • LD_DEBUG
  • LD_DEBUG_OUTPUT
  • LD_DYNAMIC_WEAK
  • LD_HWCAP_MASK
  • LD_KEEPDIR
  • LD_NOWARN
  • LD_ORIGIN_PATH
  • LD_POINTER_GUARD
  • LD_PROFILE
  • LD_PROFILE_OUTPUT
  • LD_SHOW_AUXV
  • LD_USE_LOAD_BIAS
  • LD_VERBOSE
  • LD_WARN
  • LDD_ARGV0
  • MALLOC_CHECK_
  • NLSPATH
  • HZ
  • SEGFAULT_SIGNALS
  • SEGFAULT_USE_ALTSTACK
  • SEGFAULT_OUTPUT_NAME
  • PCPROFILE_OUTPUT
  • SOTRUSS_FROMLIST
  • SOTRUSS_TOLIST
  • SOTRUSS_EXIT
  • SOTRUSS_WHICH
  • SOTRUSS_OUTNAME
  • GMON_OUT_PREFIX
  • HESIOD_CONFIG s
  • HES_DOMAIN s
  • CRASHSERVER
  • COREFILE
  • GCONV_PATH
  • HOME s
  • LANGUAGE
  • OUTPUT_CHARSET
  • CHARSET
  • LOCPATH
  • LC_ALL
  • I18NPATH
  • POSIXLY_CORRECT
  • MEMUSAGE_PROG_NAME
  • MEMUSAGE_OUTPUT
  • MEMUSAGE_BUFFER_SIZE
  • MEMUSAGE_NO_TIMER
  • MEMUSAGE_TRACE_MMAP
  • NIS_PATH
  • NIS_DEFAULTS
  • NIS_GROUP
  • LOCALDOMAIN
  • IFS
  • TMPDIR
  • GETCONF_DIR
  • ENV_HOSTCONF
  • ENV_SPOOF
  • ENV_MULTI
  • ENV_REORDER
  • ENV_TRIM_ADD
  • ENV_TRIM_OVERR
  • HOSTALIASES
  • RES_OPTIONS
  • MSGVERB
  • SEV_LEVEL
  • NLSPROVIER
  • LIBC_FATAL_STDERR_
  • LD_ASSUME_KERNEL
  • TZ
  • TZDIR
  • DATEMSK

None: TuningLibraryRuntimeBehavior (last edited 2014-11-28 16:15:08 by CarlosODonell)