@node Tunables
@c @node Tunables, , Internal Probes, Top
@c %MENU% Tunable switches to alter libc internal behavior
@chapter Tunables
@cindex tunables

@dfn{Tunables} are a feature in @theglibc{} that allows application authors and
distribution maintainers to alter the runtime library behavior to match
their workload. These are implemented as a set of switches that may be
modified in different ways. The current default method to do this is via
the @env{GLIBC_TUNABLES} environment variable by setting it to a string
of colon-separated @var{name}=@var{value} pairs. For example, the following
enables malloc checking and sets the malloc trim threshold to 128
bytes:

@example
GLIBC_TUNABLES=glibc.malloc.trim_threshold=128:glibc.malloc.check=3
export GLIBC_TUNABLES
@end example

Tunables are not part of the @glibcadj{} stable ABI, and they are
subject to change or removal across releases. Additionally, the method to
modify tunable values may change between releases and across distributions.
It is possible to implement multiple `frontends' for the tunables allowing
distributions to choose their preferred method at build time.

Finally, the set of tunables available may vary between distributions as
the tunables feature allows distributions to add their own tunables under
their own namespace.

@menu
* Tunable names::  The structure of a tunable name
* Memory Allocation Tunables::  Tunables in the memory allocation subsystem
* Dynamic Linking Tunables::  Tunables in the dynamic linking subsystem
* Elision Tunables::  Tunables in the elision subsystem
* POSIX Thread Tunables::  Tunables in the POSIX thread subsystem
* Hardware Capability Tunables::  Tunables that modify the hardware
                                  capabilities seen by @theglibc{}
@end menu

@node Tunable names
@section Tunable names
@cindex Tunable names
@cindex Tunable namespaces

A tunable name is split into three components: a top namespace, a tunable
namespace and the tunable name. The top namespace for tunables implemented in
@theglibc{} is @code{glibc}. Distributions that choose to add custom tunables
in their maintained versions of @theglibc{} may choose to do so under their own
top namespace.

The tunable namespace is a logical grouping of tunables in a single
module. This currently holds no special significance, although that may
change in the future.

The tunable name is the actual name of the tunable. It is possible that
different tunable namespaces may have tunables within them that have the
same name, likewise for top namespaces. Hence, we only support
identification of tunables by their full name, i.e. with the top
namespace, tunable namespace and tunable name, separated by periods.
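
As an illustration, the full name @code{glibc.malloc.check} used earlier
in this chapter breaks down as follows. The annotation is only a sketch
of the naming scheme, not output produced by any tool:

@example
glibc.malloc.check
|     |      |
|     |      +-- tunable name
|     +--------- tunable namespace
+--------------- top namespace
@end example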

@node Memory Allocation Tunables
@section Memory Allocation Tunables
@cindex memory allocation tunables
@cindex malloc tunables
@cindex tunables, malloc

@deftp {Tunable namespace} glibc.malloc
Memory allocation behavior can be modified by setting any of the
following tunables in the @code{malloc} namespace:
@end deftp

@deftp Tunable glibc.malloc.check
This tunable supersedes the @env{MALLOC_CHECK_} environment variable and is
identical in features.

Setting this tunable to a non-zero value enables a special (less
efficient) memory allocator for the malloc family of functions that is
designed to be tolerant against simple errors such as double calls of
free with the same argument, or overruns of a single byte (off-by-one
bugs). Not all such errors can be protected against, however, and memory
leaks can result. Any detected heap corruption results in immediate
termination of the process.

Like @env{MALLOC_CHECK_}, @code{glibc.malloc.check} has a problem in that it
diverges from normal program behavior by writing to @code{stderr}, which could
be exploited in SUID and SGID binaries. Therefore, @code{glibc.malloc.check}
is disabled by default for SUID and SGID binaries. This can be enabled again
by the system administrator by adding a file @file{/etc/suid-debug}; the
content of the file could be anything or even empty.
@end deftp

@deftp Tunable glibc.malloc.top_pad
This tunable supersedes the @env{MALLOC_TOP_PAD_} environment variable and is
identical in features.

This tunable determines the amount of extra memory in bytes to obtain from the
system when any of the arenas need to be extended. It also specifies the
number of bytes to retain when shrinking any of the arenas. This provides the
necessary hysteresis in heap size such that an excessive number of system calls
can be avoided.

The default value of this tunable is @samp{0}.
@end deftp

@deftp Tunable glibc.malloc.perturb
This tunable supersedes the @env{MALLOC_PERTURB_} environment variable and is
identical in features.

If set to a non-zero value, memory blocks are initialized with values depending
on some low order bits of this tunable when they are allocated (except when
allocated by calloc) and freed. This can be used to debug the use of
uninitialized or freed heap memory. Note that this option does not guarantee
that the freed block will have any specific values. It only guarantees that the
content the block had before it was freed will be overwritten.

The default value of this tunable is @samp{0}.
@end deftp

@deftp Tunable glibc.malloc.mmap_threshold
This tunable supersedes the @env{MALLOC_MMAP_THRESHOLD_} environment variable
and is identical in features.

When this tunable is set, all chunks larger than this value in bytes are
allocated outside the normal heap, using the @code{mmap} system call. This way
it is guaranteed that the memory for these chunks can be returned to the system
on @code{free}. Note that requests smaller than this threshold might still be
allocated via @code{mmap}.

If this tunable is not set, the default value is set to @samp{131072} bytes and
the threshold is adjusted dynamically to suit the allocation patterns of the
program. If the tunable is set, the dynamic adjustment is disabled and the
value is set as static.
@end deftp
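
For instance, the following sketch fixes the @code{mmap} threshold at
1 MiB (@samp{1048576} bytes), which also disables the dynamic adjustment
described above. The value and the program name @file{./myprog} are
placeholders chosen for illustration, not recommendations:

@example
GLIBC_TUNABLES=glibc.malloc.mmap_threshold=1048576 ./myprog
@end example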

@deftp Tunable glibc.malloc.trim_threshold
This tunable supersedes the @env{MALLOC_TRIM_THRESHOLD_} environment variable
and is identical in features.

The value of this tunable is the minimum size (in bytes) of the top-most,
releasable chunk in an arena that will trigger a system call in order to return
memory to the system from that arena.

If this tunable is not set, the default value is set as 128 KB and the
threshold is adjusted dynamically to suit the allocation patterns of the
program. If the tunable is set, the dynamic adjustment is disabled and the
value is set as static.
@end deftp

@deftp Tunable glibc.malloc.mmap_max
This tunable supersedes the @env{MALLOC_MMAP_MAX_} environment variable and is
identical in features.

The value of this tunable is the maximum number of chunks to allocate with
@code{mmap}. Setting this to zero disables all use of @code{mmap}.

The default value of this tunable is @samp{65536}.
@end deftp

@deftp Tunable glibc.malloc.arena_test
This tunable supersedes the @env{MALLOC_ARENA_TEST} environment variable and is
identical in features.

The @code{glibc.malloc.arena_test} tunable specifies the number of arenas that
can be created before the test on the limit to the number of arenas is
conducted. The value is ignored if @code{glibc.malloc.arena_max} is set.

The default value of this tunable is 2 for 32-bit systems and 8 for 64-bit
systems.
@end deftp

@deftp Tunable glibc.malloc.arena_max
This tunable supersedes the @env{MALLOC_ARENA_MAX} environment variable and is
identical in features.

This tunable sets the number of arenas to use in a process regardless of the
number of cores in the system.

The default value of this tunable is @code{0}, meaning that the limit on the
number of arenas is determined by the number of CPU cores online. For 32-bit
systems the limit is twice the number of cores online and on 64-bit systems, it
is 8 times the number of cores online.
@end deftp
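
As a sketch, a process could be limited to two arenas as follows. The
value @samp{2} and the program name @file{./myprog} are assumptions made
for this example; a suitable limit depends on the workload:

@example
GLIBC_TUNABLES=glibc.malloc.arena_max=2 ./myprog
@end example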

@deftp Tunable glibc.malloc.tcache_max
The maximum size of a request (in bytes) which may be met via the
per-thread cache. The default (and maximum) value is 1032 bytes on
64-bit systems and 516 bytes on 32-bit systems.
@end deftp

@deftp Tunable glibc.malloc.tcache_count
The maximum number of chunks of each size to cache. The default is 7.
The upper limit is 65535. If set to zero, the per-thread cache is effectively
disabled.

The approximate maximum overhead of the per-thread cache is thus equal
to the number of bins times the chunk count in each bin times the size
of each chunk. With defaults, the maximum overhead of the
per-thread cache is approximately 236 KB on 64-bit systems and 118 KB
on 32-bit systems.
@end deftp
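
The 64-bit figure can be reproduced with a short shell calculation.
This sketch assumes 64 bins whose request sizes run from 24 to 1032
bytes in steps of 16 bytes; those bin parameters are not stated in this
manual and may differ between releases:

@example
count=7      # default glibc.malloc.tcache_count
total=0
i=0
while [ "$i" -lt 64 ]; do
  # bin i caches up to $count chunks of 24 + 16 * i bytes each
  total=$((total + count * (24 + 16 * i)))
  i=$((i + 1))
done
echo "$total"   # prints 236544, that is, about 236 KB
@end example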

@deftp Tunable glibc.malloc.tcache_unsorted_limit
When the user requests memory and the request cannot be met via the
per-thread cache, the arenas are used to meet the request. At this
time, additional chunks will be moved from existing arena lists to
pre-fill the corresponding cache. While copies from the fastbins,
smallbins, and regular bins are bounded and predictable due to the bin
sizes, copies from the unsorted bin are not bounded, and incur
additional time penalties as they need to be sorted as they're
scanned. To make scanning the unsorted list more predictable and
bounded, the user may set this tunable to limit the number of chunks
that are scanned from the unsorted list while searching for chunks to
pre-fill the per-thread cache with. The default, or when set to zero,
is no limit.
@end deftp

@deftp Tunable glibc.malloc.mxfast
One of the optimizations malloc uses is to maintain a series of ``fast
bins'' that hold chunks up to a specific size. The default and
maximum size which may be held this way is 80 bytes on 32-bit systems
or 160 bytes on 64-bit systems. Applications which value size over
speed may choose to reduce the size of requests which are serviced
from fast bins with this tunable. Note that the value specified
includes malloc's internal overhead, which is normally the size of one
pointer, so add 4 on 32-bit systems or 8 on 64-bit systems to the size
passed to @code{malloc} for the largest bin size to enable.
@end deftp
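
For example, on a 64-bit system, to have requests of up to 64 bytes
served from fast bins, add the 8 bytes of overhead and set the tunable
to 72. The 64-byte request size is an arbitrary choice for this sketch:

@example
GLIBC_TUNABLES=glibc.malloc.mxfast=72
export GLIBC_TUNABLES
@end example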

@node Dynamic Linking Tunables
@section Dynamic Linking Tunables
@cindex dynamic linking tunables
@cindex rtld tunables

@deftp {Tunable namespace} glibc.rtld
Dynamic linker behavior can be modified by setting the
following tunables in the @code{rtld} namespace:
@end deftp

@deftp Tunable glibc.rtld.nns
Sets the number of supported dynamic link namespaces (see @code{dlmopen}).
Currently this limit can be set between 1 and 16 inclusive; the default is 4.
Each link namespace consumes some memory in all threads, and thus raising the
limit will increase the amount of memory each thread uses. Raising the limit
is useful when your application uses more than 4 dynamic link namespaces as
created by @code{dlmopen} with an lmid argument of @code{LM_ID_NEWLM}.
Dynamic linker audit modules are loaded in their own dynamic link namespaces,
but they are not accounted for in @code{glibc.rtld.nns}. They implicitly
increase the per-thread memory usage as necessary, so this tunable does
not need to be changed to allow many audit modules e.g. via @env{LD_AUDIT}.
@end deftp
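
As a sketch, an application that needs up to eight namespaces created
with @code{dlmopen} could be started as follows. The value @samp{8} and
the program name @file{./myprog} are assumptions for illustration:

@example
GLIBC_TUNABLES=glibc.rtld.nns=8 ./myprog
@end example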

@deftp Tunable glibc.rtld.optional_static_tls
Sets the amount of surplus static TLS in bytes to allocate at program
startup. Every thread created allocates this amount of specified surplus
static TLS. This is a minimum value and additional space may be allocated
for internal purposes including alignment. Optional static TLS is used for
optimizing dynamic TLS access for platforms that support such optimizations
e.g. TLS descriptors or optimized TLS access for POWER (@code{DT_PPC64_OPT}
and @code{DT_PPC_OPT}). In order to make the best use of such optimizations
the value should be as many bytes as would be required to hold all TLS
variables in all dynamically loaded shared libraries. The value cannot be known
by the dynamic loader because it doesn't know the expected set of shared
libraries which will be loaded. The existing static TLS space cannot be
changed once allocated at process startup. The default allocation of
optional static TLS is 512 bytes and is allocated in every thread.
@end deftp


@node Elision Tunables
@section Elision Tunables
@cindex elision tunables
@cindex tunables, elision

@deftp {Tunable namespace} glibc.elision
Contended locks are usually slow and can lead to performance and scalability
issues in multithreaded code. Lock elision uses memory transactions, under
certain conditions, to elide locks and improve performance.
Elision behavior can be modified by setting the following tunables in
the @code{elision} namespace:
@end deftp

@deftp Tunable glibc.elision.enable
The @code{glibc.elision.enable} tunable enables lock elision if the feature is
supported by the hardware. If elision is not supported by the hardware this
tunable has no effect.

Elision tunables are supported for 64-bit Intel, IBM POWER, and z System
architectures.
@end deftp
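
A minimal sketch, assuming that the value @samp{1} requests elision (the
accepted values are not spelled out above) and using the placeholder
program name @file{./myprog}:

@example
GLIBC_TUNABLES=glibc.elision.enable=1 ./myprog
@end example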

@deftp Tunable glibc.elision.skip_lock_busy
The @code{glibc.elision.skip_lock_busy} tunable sets how many times to use a
non-transactional lock after a transactional failure has occurred because the
lock is already acquired. The value is expressed as a number of lock
acquisition attempts.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_lock_internal_abort
The @code{glibc.elision.skip_lock_internal_abort} tunable sets how many times
the thread should avoid using elision if a transaction aborted for any reason
other than a different thread's memory accesses. The value is expressed as a
number of lock acquisition attempts.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_lock_after_retries
The @code{glibc.elision.skip_lock_after_retries} tunable sets how many times
to try to elide a lock with transactions that failed only due to a different
thread's memory accesses, before falling back to a regular lock.
The value is expressed as a number of lock elision attempts.

This tunable is supported only on IBM POWER and z System architectures.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.tries
The @code{glibc.elision.tries} tunable sets how many times to retry elision if
there is a chance for the transaction to finish execution, e.g., it wasn't
aborted due to the lock being already acquired. If elision is not supported
by the hardware this tunable is set to @samp{0} to avoid retries.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_trylock_internal_abort
The @code{glibc.elision.skip_trylock_internal_abort} tunable sets how many
times the thread should avoid trying the lock if a transaction aborted due to
reasons other than a different thread's memory accesses. The value is
expressed as a number of try lock attempts.

The default value of this tunable is @samp{3}.
@end deftp

@node POSIX Thread Tunables
@section POSIX Thread Tunables
@cindex pthread mutex tunables
@cindex thread mutex tunables
@cindex mutex tunables
@cindex tunables thread mutex

@deftp {Tunable namespace} glibc.pthread
The behavior of POSIX threads can be tuned to gain performance improvements
according to specific hardware capabilities and workload characteristics by
setting the following tunables in the @code{pthread} namespace:
@end deftp

@deftp Tunable glibc.pthread.mutex_spin_count
The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
a thread should spin on the lock before calling into the kernel to block.
Adaptive spin is used for mutexes initialized with the
@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension. It affects both
@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.

The thread spins until either the maximum spin count is reached or the lock
is acquired.

The default value of this tunable is @samp{100}.
@end deftp
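
For example, the spin count could be raised from the default as follows.
The value @samp{200} is an arbitrary choice for this sketch, and
@file{./myprog} stands for any program that uses adaptive mutexes:

@example
GLIBC_TUNABLES=glibc.pthread.mutex_spin_count=200 ./myprog
@end example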

@node Hardware Capability Tunables
@section Hardware Capability Tunables
@cindex hardware capability tunables
@cindex hwcap tunables
@cindex tunables, hwcap
@cindex hwcaps tunables
@cindex tunables, hwcaps
@cindex data_cache_size tunables
@cindex tunables, data_cache_size
@cindex shared_cache_size tunables
@cindex tunables, shared_cache_size
@cindex non_temporal_threshold tunables
@cindex tunables, non_temporal_threshold

@deftp {Tunable namespace} glibc.cpu
Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
by setting the following tunables in the @code{cpu} namespace:
@end deftp

@deftp Tunable glibc.cpu.hwcap_mask
This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is
identical in features.

The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set
extensions available in the processor at runtime for some architectures. The
@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those
capabilities at runtime, thus disabling use of those extensions.
@end deftp

@deftp Tunable glibc.cpu.hwcaps
The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
enable CPU/ARCH feature @code{yyy} and disable CPU/ARCH features @code{xxx}
and @code{zzz}, where the feature names are case-sensitive and have to match
the ones in @code{sysdeps/x86/cpu-features.h}.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.cached_memopt
The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to
enable optimizations recommended for cacheable memory. If set to
@code{1}, @theglibc{} assumes that the process memory image consists
of cacheable (non-device) memory only. The default, @code{0},
indicates that the process may use device memory.

This tunable is specific to powerpc, powerpc64 and powerpc64le.
@end deftp

@deftp Tunable glibc.cpu.name
The @code{glibc.cpu.name=xxx} tunable allows the user to tell @theglibc{} to
assume that the CPU is @code{xxx} where @code{xxx} may have one of these values:
@code{generic}, @code{falkor}, @code{thunderxt88}, @code{thunderx2t99},
@code{thunderx2t99p1}, @code{ares}, @code{emag}, @code{kunpeng}.

This tunable is specific to aarch64.
@end deftp
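
For example, to have @theglibc{} assume a @code{thunderxt88} CPU on an
aarch64 system, using one of the values listed above (the program name
@file{./myprog} is a placeholder):

@example
GLIBC_TUNABLES=glibc.cpu.name=thunderxt88 ./myprog
@end example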

@deftp Tunable glibc.cpu.x86_data_cache_size
The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set
the data cache size in bytes for use in memory and string routines.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_shared_cache_size
The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to
set the shared cache size in bytes for use in memory and string routines.
@end deftp

@deftp Tunable glibc.cpu.x86_non_temporal_threshold
The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user
to set the threshold in bytes for non-temporal stores.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_rep_movsb_threshold
The @code{glibc.cpu.x86_rep_movsb_threshold} tunable allows the user to
set the threshold in bytes to start using "rep movsb". The value must be
greater than zero, and currently defaults to 2048 bytes.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_rep_stosb_threshold
The @code{glibc.cpu.x86_rep_stosb_threshold} tunable allows the user to
set the threshold in bytes to start using "rep stosb". The value must be
greater than zero, and currently defaults to 2048 bytes.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_ibt
The @code{glibc.cpu.x86_ibt} tunable allows the user to control how
indirect branch tracking (IBT) should be enabled. Accepted values are
@code{on}, @code{off}, and @code{permissive}. @code{on} always turns
on IBT regardless of whether IBT is enabled in the executable and its
dependent shared libraries. @code{off} always turns off IBT regardless
of whether IBT is enabled in the executable and its dependent shared
libraries. @code{permissive} is the same as the default, which disables
IBT on non-CET executables and shared libraries.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_shstk
The @code{glibc.cpu.x86_shstk} tunable allows the user to control how
the shadow stack (SHSTK) should be enabled. Accepted values are
@code{on}, @code{off}, and @code{permissive}. @code{on} always turns on
SHSTK regardless of whether SHSTK is enabled in the executable and its
dependent shared libraries. @code{off} always turns off SHSTK regardless
of whether SHSTK is enabled in the executable and its dependent shared
libraries. @code{permissive} changes how dlopen works on non-CET shared
libraries. By default, when SHSTK is enabled, dlopening a non-CET shared
library returns an error. With @code{permissive}, it turns off SHSTK
instead.

This tunable is specific to i386 and x86-64.
@end deftp
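
As a sketch, the two CET tunables could be combined in one setting,
using the colon-separated syntax described at the start of this chapter.
The chosen values and the program name @file{./myprog} are assumptions
for illustration:

@example
GLIBC_TUNABLES=glibc.cpu.x86_ibt=on:glibc.cpu.x86_shstk=permissive ./myprog
@end example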