Bug 15786 - ifunc resolver functions can smash function arguments
Summary: ifunc resolver functions can smash function arguments
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link (show other bugs)
Version: 2.17
: P2 normal
Target Milestone: 2.23
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-07-25 22:51 UTC by Carl Worth
Modified: 2015-08-28 19:56 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Test case: library implementation (217 bytes, text/x-csrc)
2013-07-25 22:51 UTC, Carl Worth
Details
Test case: library header (73 bytes, text/x-chdr)
2013-07-25 22:51 UTC, Carl Worth
Details
Test case: main program (145 bytes, text/x-csrc)
2013-07-25 22:51 UTC, Carl Worth
Details
Test case: Makefile (212 bytes, text/plain)
2013-07-25 22:52 UTC, Carl Worth
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carl Worth 2013-07-25 22:51:06 UTC
Created attachment 7118 [details]
Test case: library implementation

In using ifunc resolver functions, I've found that the arguments to
the underlying function are sometimes corrupted by the code in the
resolver functions itself.

By debugging, I've found that the problem is that the xmm0-xmm7
registers (on x864_64) are not being saved prior to the call of the
ifunc resolver, so that if the ifunc resolver writes to these
registers, (and the underlying function is expecting parameters in
these registers), then the parameter values are corrupted.

I will attach to this bug report a small test case demonstrating the
problem.

Can you please provide me some guidance on what I should be doing to
avoid this problem? I can imagine that guidance taking one of several
forms (ordered from least-convenient to most-convenient from my point
of view as a user of ifunc):

1. Don't do that!

   Perhaps I'm calling code that I just shouldn't be calling within an
   ifunc resolver. If there are limitations to what code can reliably
   be used within an ifunc resolver can you please point me to
   documentation describing those limitations?

2. You can do that, but be careful...

   Perhaps I should be manually saving xmm0-xmm7 if I happen to be
   using code that touches these registers? Is there any easy way to
   do this? Something like a gcc attribute that let me decorate my
   function would be really nice here.

3. This is a bug in glibc

   It would be particularly nice if glibc would take the extra care to
   save/restore these registers around calls to ifunc resolvers. I can
   imagine that the dl code in glibc is careful to never touch
   xmm0-xmm7 itself, so that in other cases there's no need to save
   these. But since ifunc resolvers effectively allow users to insert
   arbitrary code into the middle of the dl-resolution code path, it
   seems to me that glibc should be saving/restoring things completely
   here.

For the attached test case, I've written a tiny main program that
calls one function in a tiny library. That tiny library implements a
single function with an ifunc resolver. The resolver calls printf
which ends up smashing the function's argument in xmm0.

The test case includes a simple Makefile. An example session of
compiling the program, and running it to demonstrate the bug is below:

	$ make
	cc -g -Wall -Wextra -Wmissing-declarations -Werror=attributes -c smash-xmm.c -o smash-xmm.o
	cc -g -Wall -Wextra -Wmissing-declarations -Werror=attributes -fPIC -shared -Wl,-Bsymbolic -o libmylib.so mylib.c
	cc -g -Wall -Wextra -Wmissing-declarations -Werror=attributes smash-xmm.o -L . -lmylib -o smash-xmm

	$ LD_LIBRARY_PATH=. ./smash-xmm
	Calling mylib_print_float with value: 42
	This printf (and underlying __strchrnul) smashes xmm registers
	mylib_print_float called with value: 0

The debugging I did to determine the problem is as follows.

My ifunc resolver is called by code from glibc's dl-trampoline.S as follows:

_dl_runtime_resolve:
	cfi_adjust_cfa_offset(16) # Incorporate PLT
	subq $56,%rsp
	cfi_adjust_cfa_offset(56)
	movq %rax,(%rsp)	# Preserve registers otherwise clobbered.
	movq %rcx, 8(%rsp)
	movq %rdx, 16(%rsp)
	movq %rsi, 24(%rsp)
	movq %rdi, 32(%rsp)
	movq %r8, 40(%rsp)
	movq %r9, 48(%rsp)
	movq 64(%rsp), %rsi	# Copy args pushed by PLT in register.
	movq 56(%rsp), %rdi	# %rdi: link_map, %rsi: reloc_index
	call _dl_fixup		# Call resolver.

Note that rcx, rdx, rsi, rdi, r8 and r9 are saved. I checked at
http://en.wikipedia.org/wiki/X86_calling_conventions that the System V
AMD64 ABI for x86-64 requires saving xmm0-xmm7 in addition to these
registers.

I also verified that in my original program, (from which my test case
was distilled), that my function's arguments exist in xmm0, xmm1, and
xmm2.

Finally, I stepped through the printf call in my ifunc resolver to
find the following code which smashes the xmm register values:

ENTRY (__strchrnul)
	movd	%esi, %xmm1
	movq	%rdi, %rcx
	punpcklbw %xmm1, %xmm1
	andq	$~15, %rdi
	pxor	%xmm2, %xmm2
	punpcklbw %xmm1, %xmm1
	orl	$0xffffffff, %esi
	movdqa	(%rdi), %xmm0
	pshufd	$0, %xmm1, %xmm1
	subq	%rdi, %rcx
	movdqa	%xmm0, %xmm3

Now, while I don't strictly need to call printf from within my ifunc
resolver, I was surprised to find my program mysteriously misbehaving
when I did so. And in the meantime, I'm left feeling quite uncertain
about what code I can safely call within the ifunc resolver without
causing similar problems.

Thanks in advance for your time.

-Carl
Comment 1 Carl Worth 2013-07-25 22:51:39 UTC
Created attachment 7119 [details]
Test case: library header
Comment 2 Carl Worth 2013-07-25 22:51:56 UTC
Created attachment 7120 [details]
Test case: main program
Comment 3 Carl Worth 2013-07-25 22:52:17 UTC
Created attachment 7121 [details]
Test case: Makefile
Comment 4 Rich Felker 2013-07-26 07:29:14 UTC
This is definitely a bug in gcc. It is impossible to write "a function that does not clobber register X" without writing pure assembly, because the compiler is free to use any non-call-saved register for any purpose it likes. Even a simple for loop that performs copying might get optimized to use vector registers.

Moreover, I believe this bug is related to existing bug reports (I'm not sure of their status) for the non-ifunc resolver. If I'm not mistaken, right now, it's tiptoeing around the vector registers by avoiding calling certain string functions. This is of course wrong because it's making assumptions about the compiler's choice of register usage in code generation. Both issues would be fixed, and the code would cease to be senselessly fragile, if the asm entry point for the resolver simply saved and restored all call-clobbered registers like it should.
Comment 5 Rich Felker 2013-07-26 07:29:57 UTC
Erm, the beginning of that last comment was supposed to read "a bug in glibc", not "a bug in gcc". Sorry if it caused any confusion.
Comment 6 Ondrej Bilka 2013-07-26 07:38:06 UTC
On Fri, Jul 26, 2013 at 07:29:14AM +0000, bugdal at aerifal dot cx wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=15786
> 
> Rich Felker <bugdal at aerifal dot cx> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |bugdal at aerifal dot cx
> 
> --- Comment #4 from Rich Felker <bugdal at aerifal dot cx> ---
> This is definitely a bug in gcc. It is impossible to write "a function that
> does not clobber register X" without writing pure assembly, because the
> compiler is free to use any non-call-saved register for any purpose it likes.
> Even a simple for loop that performs copying might get optimized to use vector
> registers.
> 
> Moreover, I believe this bug is related to existing bug reports (I'm not sure
> of their status) for the non-ifunc resolver. If I'm not mistaken, right now,
> it's tiptoeing around the vector registers by avoiding calling certain string
> functions. This is of course wrong because it's making assumptions about the
> compiler's choice of register usage in code generation. Both issues would be
> fixed, and the code would cease to be senselessly fragile, if the asm entry
> point for the resolver simply saved and restored all call-clobbered registers
> like it should.
>
Another issue caused by not saving floating point registers. I will
sumbit patch to save fp registers.
Comment 7 H.J. Lu 2015-08-28 19:56:06 UTC
Fixed by

commit f3dcae82d54e5097e18e1d6ef4ff55c2ea4e621e
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 25 04:33:54 2015 -0700

    Save and restore vector registers in x86-64 ld.so