This code block calls out to validate_str which calls to memrchr. The to= is the size of the entire section. I believe Address Sanitizer does validate the entire memory range. Passing the entire range to the memrchr null char check makes it very costly. 191 else if (likely (strscn->data_list_rear == NULL)) 192 { 193 // XXX The above is currently correct since elf_newdata will 194 // make sure to convert the rawdata into the datalist if 195 // necessary. But it would be more efficient to keep the rawdata 196 // unconverted and only then iterate over the rest of the (newly 197 // added data) list. Note that when the ELF file is mmapped 198 // rawdata_base can be set while rawdata.d hasn't been 199 // initialized yet (when data_read is zero). So we cannot just 200 // look at the rawdata.d.d_size. 201 202 /* Make sure the string is NUL terminated. Start from the end, 203 which very likely is a NUL char. */ 204 if (likely (validate_str (strscn->rawdata_base, offset, sh_size))) 205 result = &strscn->rawdata_base[offset]; 206 else 207 __libelf_seterrno (ELF_E_INVALID_INDEX); 208 } * libelf compiled with HAVE_DECL_MEMRCHR (default) * recent GCC toolchain (GCC6 and up) * files themselves don't even need to be compiled with asan, just enabling it at link time so the runtime wrapping/intercepting of libc et all fires. Testcase: test.c is attached. Generate a source file with dummy symbols to populate the symbol table. for i in {A..Z}{A..Z}{A..Z}{A..Z}; do echo "int sym$i;"; done > symbols.c gcc -c symbols.c -o symbols.o gcc -c test.c -o test.o -I /tmp/jan-test/elfutils/libelf/ #-fsanitize=address gcc test.o symbols.o -o test -l:libelf.a -L/tmp/jan-test/elfutils/libelf/ -lz -fsanitize=address echo -n "default asan:" time ./test test echo -n "asan disabled" gcc test.o symbols.o -o test -l:libelf.a -L/tmp/jan-test/elfutils/libelf/ -lz time ./test test I don't know of any ASAN_OPTIONS parameter that would change this behavior and make it less strict. On my machine the asan testcase takes 5+ sec, whereas the non-asan 0.04s. Can this code please be optimized to reduce the range passed to memrchr? Is this NUL check even required?
I think it really is a bug/performance issue in asan. But "optimizing" it in libelf by first checking the last char is zero, before calling memrchr wouldn't hurt (and should normally prevent a function call). Does the following help? diff --git a/libelf/elf_strptr.c b/libelf/elf_strptr.c index 76f2caf1..dc9b76c0 100644 --- a/libelf/elf_strptr.c +++ b/libelf/elf_strptr.c @@ -56,7 +56,9 @@ get_zdata (Elf_Scn *strscn) static bool validate_str (const char *str, size_t from, size_t to) { #if HAVE_DECL_MEMRCHR - return memrchr (&str[from], '\0', to - from) != NULL; + // Check end first, which is likely a zero terminator, to prevent function call + return (str[to - 1] == '\0' + || (to - from > 0 && memrchr (&str[from], '\0', to - from - 1) != NULL)); #else do { if (to <= from)
The patch optimizes perfectly and avoids the expensive call. Thanks!
(In reply to Jan Smets from comment #2) > The patch optimizes perfectly and avoids the expensive call. Thanks for testing, I pushed a variant of the fix as: commit 0aed4315b2f6c54f4efcf8a8d22e59a36e6eb30d Author: Mark Wielaard <mark@klomp.org> Date: Mon Jul 19 15:52:51 2021 +0200 libelf: Optimize elf_strptr.c validate_str by checking last char first In most cases the last char of the sectio will be zero. Check that first before calling memrchr. This is a minor optimization in normal cases. But it helps asan a lot by removing the memrchr call in most cases. https://sourceware.org/bugzilla/show_bug.cgi?id=28101 Signed-off-by: Mark Wielaard <mark@klomp.org>