This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH] [BZ #19679] [x86_64] Set DL_RUNIME_UNALIGNED_VEC_SIZE to 8
- From: "H.J. Lu" <hongjiu dot lu at intel dot com>
- To: GNU C Library <libc-alpha at sourceware dot org>
- Date: Fri, 19 Feb 2016 14:09:55 -0800
- Subject: [PATCH] [BZ #19679] [x86_64] Set DL_RUNIME_UNALIGNED_VEC_SIZE to 8
- Authentication-results: sourceware.org; auth=none
- Reply-to: "H.J. Lu" <hjl dot tools at gmail dot com>
Due to GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066
__tls_get_addr may be called with 8-byte stack alignment. Although
this bug has been fixed in GCC 4.9.4, 5.3 and 6, we can't assume
that stack will be always aligned at 16 bytes. Since SSE optimized
memory/string functions with aligned SSE register load and store are
used in the dynamic linker, we must set DL_RUNIME_UNALIGNED_VEC_SIZE
to 8 so that _dl_runtime_resolve_sse will align the stack before
calling _dl_fixup:
Dump of assembler code for function _dl_runtime_resolve_sse:
0x00007ffff7deea90 <+0>: push %rbx
0x00007ffff7deea91 <+1>: mov %rsp,%rbx
0x00007ffff7deea94 <+4>: and $0xfffffffffffffff0,%rsp
^^^^^^^^^^^ Align stack to 16 bytes
0x00007ffff7deea98 <+8>: sub $0x100,%rsp
0x00007ffff7deea9f <+15>: mov %rax,0xc0(%rsp)
0x00007ffff7deeaa7 <+23>: mov %rcx,0xc8(%rsp)
0x00007ffff7deeaaf <+31>: mov %rdx,0xd0(%rsp)
0x00007ffff7deeab7 <+39>: mov %rsi,0xd8(%rsp)
0x00007ffff7deeabf <+47>: mov %rdi,0xe0(%rsp)
0x00007ffff7deeac7 <+55>: mov %r8,0xe8(%rsp)
0x00007ffff7deeacf <+63>: mov %r9,0xf0(%rsp)
0x00007ffff7deead7 <+71>: movaps %xmm0,(%rsp)
0x00007ffff7deeadb <+75>: movaps %xmm1,0x10(%rsp)
0x00007ffff7deeae0 <+80>: movaps %xmm2,0x20(%rsp)
0x00007ffff7deeae5 <+85>: movaps %xmm3,0x30(%rsp)
0x00007ffff7deeaea <+90>: movaps %xmm4,0x40(%rsp)
0x00007ffff7deeaef <+95>: movaps %xmm5,0x50(%rsp)
0x00007ffff7deeaf4 <+100>: movaps %xmm6,0x60(%rsp)
0x00007ffff7deeaf9 <+105>: movaps %xmm7,0x70(%rsp)
Tested on x86-64. Any comments?
H.J.
---
[BZ #19679]
* sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE):
Set to 8.
---
sysdeps/x86_64/dl-trampoline.S | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/sysdeps/x86_64/dl-trampoline.S b/sysdeps/x86_64/dl-trampoline.S
index 9fb6b13..0a480b6 100644
--- a/sysdeps/x86_64/dl-trampoline.S
+++ b/sysdeps/x86_64/dl-trampoline.S
@@ -34,8 +34,11 @@
#endif
#ifndef DL_RUNIME_UNALIGNED_VEC_SIZE
-/* The maximum size of unaligned vector load and store. */
-# define DL_RUNIME_UNALIGNED_VEC_SIZE 16
+/* The maximum size of unaligned vector load and store in the dynamic
+ linker. Since SSE optimized memory/string functions with aligned
+ SSE register load and store are used in the dynamic linker, we must
+ set this to 8. */
+# define DL_RUNIME_UNALIGNED_VEC_SIZE 8
#endif
/* True if _dl_runtime_resolve should align stack to VEC_SIZE bytes. */
--
2.5.0