This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC



On 20/09/2018 09:29, Adhemerval Zanella wrote:
> 
> 
> On 19/09/2018 16:20, Tulio Magno Quites Machado Filho wrote:
>> Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
>>
>>> On 27/08/2018 19:44, Tulio Magno Quites Machado Filho wrote:
>>>> Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
>>>>
>>>>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>>>>> index 906882a..508b917 100644
>>>>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>>>>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>>>>> @@ -127,6 +127,30 @@ elision_init (int argc __attribute__ ((unused)),
>>>>>  	       TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort));
>>>>>  #endif
>>>>>  
>>>>> +  /* Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
>>>
>>> In fact it should be 4.1, 4.2 is the first release which actually abort transactions
>>> on syscalls.
>>
>> Ack.
>>
>>>>> +     instead it suspend and resume it when leaving the kernel.  The
>>>>> +     side-effects of the syscall will always remain visible, even if the
>>>>> +     transaction is aborted.  This is an issue when transaction is used along
>>>>> +     with futex syscall, on pthread_cond_wait for instance, where the futex
>>>>> +     call might succeed but the transaction is rolled back leading the
>>>>> +     pthread_cond object in an inconsistent state.
>>>>> +
>>>>> +     GLIBC used to prevent it by always aborting a transaction before issuing
>>>>> +     a syscall.  Linux 4.2 also decided to abort active transaction in
>>>>> +     syscalls which makes the GLIBC workaround superflours.  Worse, GLIBC
>>>>> +     transaction abortion leads to a performance issue on recent kernels
>>>>> +     where the HTM state is saved/restore lazily.  By aborting a transaction
>>>>> +     on every syscalls, regardless whether a transaction has being initiated
>>>>> +     before, glibc make the kernel always save/restore HTM state (it can not
>>>>> +     even lazily disable it after a certain number of syscall iterations).
>>>>> +
>>>>> +     Because of this shortcoming, Lock Elision is just enabled when it has
>>>>> +     been explictly set (either by tunables of by a configure switch) and if
>>>>> +     kernel aborts HTM transactions on syscalls (PPC_FEATURE2_HTM_NOSC)  */
>>>>> +
>>>>> +  __pthread_force_elision = __pthread_force_elision &&
>>>>> +			    GLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
>>>>
>>>> In other words: this patch is requiring Linux 4.3 for TLE.
>>>>
>>>> Can't ABORT_TRANSACTION check for PPC_FEATURE2_HTM_NOSC in tcbhead_t.hwcap and
>>>> continue to support older Linux versions?
>>>>
>>>
>>> I don't have a strong opinion here, but chatting with Breno I had the impression
>>> that TLE (and HTM in general) is only used on kernel with this recent fix. This
>>> patch is really a hack insertion that penalizes all syscall for an specific
>>> feature which may not be present in the kernel. It is unfortunate that kernel
>>> developers did not realize at feature inclusion this design for transaction 
>>> state is not the best one for library usage.
>>
>> My point was actually related to the fact that INSTALL still mentions Linux 3.2.
>>
>> I completely agree this solution is better in terms of performance and
>> maintenance in the long term.
>> Maybe we (me?) could document this problem somewhere (INSTALL?) pointing
>> users that want to run the latest glibc and use HTM on Linux to use
>> Linux >= 4.2.
> 
> The INSTALL file still lacks to mention that powerpc64le still requires
> at least a 3.10 kernel (which is the minimum version where the ABI is
> actually supported). And there still some other ABIs that does not
> have support for 3.2 (aarch64 for instance).
> 
> At least, I think we start to document powerpc64le requirements: a minimum
> Linux version of 3.10 and 4.2 for TLE.
> 
>>
>> With the removal of ABORT_TRANSACTION, it's also necessary to remove
>> sysdeps/unix/sysv/linux/powerpc/not-errno.h.
>>
>> With that said, my questions have already been answered and this patch looks
>> good to me with the fix in the comment you mentioned and the removal of
>> not-errno.h.> 
> 
> I will update the patch with this removal.
> 

Below is an updated patch, I decided to advertise the TLE support change
on NEWS instead of adding on INSTALL (it have information only for glibc
build/installation instead of expected runtime support).

---

	* NEWS: Add note about new TLE support on powerpc64le.
	* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
	* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
	__ununsed1.
	(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
	(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
	* sysdeps/powerpc/powerpc32/sysdep.h,
	sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
	ABORT_TRANSACTION): Remove macros.
	* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
	__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
	sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
	sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
	usage.
	* sysdeps/unix/sysv/linux/powerpc/not-errno.h: Remove file.

Reported-by: Breno Leitão <leitao@debian.org>

---

diff --git a/NEWS b/NEWS
index 53d7bd09b3..0701d2d10b 100644
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,15 @@ Major new features:
 * The reallocarray function is now declared under _DEFAULT_SOURCE, not just
   for _GNU_SOURCE, to match BSD environments.
 
+* For powercp64le ABI, Transactional Lock Elision is now enabled iff kernel
+  aborts the transaction prior kernel execution (PPC_FEATURE2_HTM_NOSC on
+  hwcap2).  On older kernels the transaction is suspended, and this caused
+  some undefined side-effects issues when transactions relies on other
+  synchronization mechanisms (futex for instance).  GLIBC avoided it by
+  abort transactions manually on each syscall, but it lead to performance
+  issues on newer kernels where the HTM state is saved and restore lazily
+  (the state being saved even when the process actually does not use HTM).
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * The glibc.tune tunable namespace has been renamed to glibc.cpu and the
diff --git a/sysdeps/powerpc/nptl/tcb-offsets.sym b/sysdeps/powerpc/nptl/tcb-offsets.sym
index e5bb2b376d..4c01615ad0 100644
--- a/sysdeps/powerpc/nptl/tcb-offsets.sym
+++ b/sysdeps/powerpc/nptl/tcb-offsets.sym
@@ -21,7 +21,6 @@ DSO_SLOT2			(offsetof (tcbhead_t, dso_slot2) - TLS_TCB_OFFSET - sizeof (tcbhead_
 #ifdef __powerpc64__
 TCB_AT_PLATFORM			(offsetof (tcbhead_t, at_platform) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
 #endif
-TM_CAPABLE			(offsetof (tcbhead_t, tm_capable) - TLS_TCB_OFFSET - sizeof (tcbhead_t))
 #ifndef __powerpc64__
 TCB_AT_PLATFORM			(offsetof (tcbhead_t, at_platform) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
 PADDING				(offsetof (tcbhead_t, padding) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
diff --git a/sysdeps/powerpc/nptl/tls.h b/sysdeps/powerpc/nptl/tls.h
index f88fed5ecf..8317ca7cfa 100644
--- a/sysdeps/powerpc/nptl/tls.h
+++ b/sysdeps/powerpc/nptl/tls.h
@@ -67,8 +67,7 @@ typedef struct
   uint32_t padding;
   uint32_t at_platform;
 #endif
-  /* Indicate if HTM capable (ISA 2.07).  */
-  uint32_t tm_capable;
+  uint32_t __unused;
   /* Reservation for AT_PLATFORM data - powerpc64.  */
 #ifdef __powerpc64__
   uint32_t at_platform;
@@ -142,7 +141,6 @@ register void *__thread_register __asm__ ("r13");
 # define TLS_INIT_TP(tcbp) \
   ({ 									      \
     __thread_register = (void *) (tcbp) + TLS_TCB_OFFSET;		      \
-    THREAD_SET_TM_CAPABLE (__tcb_hwcap & PPC_FEATURE2_HAS_HTM ? 1 : 0);	      \
     THREAD_SET_HWCAP (__tcb_hwcap);					      \
     THREAD_SET_AT_PLATFORM (__tcb_platform);				      \
     NULL;								      \
@@ -151,8 +149,6 @@ register void *__thread_register __asm__ ("r13");
 /* Value passed to 'clone' for initialization of the thread register.  */
 # define TLS_DEFINE_INIT_TP(tp, pd) \
     void *tp = (void *) (pd) + TLS_TCB_OFFSET + TLS_PRE_TCB_SIZE;	      \
-    (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].tm_capable) =	      \
-      THREAD_GET_TM_CAPABLE ();						      \
     (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].hwcap) =	      \
       THREAD_GET_HWCAP ();						      \
     (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].at_platform) =	      \
@@ -210,13 +206,6 @@ register void *__thread_register __asm__ ("r13");
 		     + TLS_PRE_TCB_SIZE))[-1].pointer_guard		      \
      = THREAD_GET_POINTER_GUARD())
 
-/* tm_capable field in TCB head.  */
-# define THREAD_GET_TM_CAPABLE() \
-    (((tcbhead_t *) ((char *) __thread_register				      \
-		     - TLS_TCB_OFFSET))[-1].tm_capable)
-# define THREAD_SET_TM_CAPABLE(value) \
-    (THREAD_GET_TM_CAPABLE () = (value))
-
 /* hwcap field in TCB head.  */
 # define THREAD_GET_HWCAP() \
     (((tcbhead_t *) ((char *) __thread_register				      \
diff --git a/sysdeps/powerpc/powerpc32/sysdep.h b/sysdeps/powerpc/powerpc32/sysdep.h
index 5f1294ead3..93097c5459 100644
--- a/sysdeps/powerpc/powerpc32/sysdep.h
+++ b/sysdeps/powerpc/powerpc32/sysdep.h
@@ -90,24 +90,7 @@ GOT_LABEL:			;					      \
   cfi_endproc;								      \
   ASM_SIZE_DIRECTIVE(name)
 
-#if !IS_IN(rtld) && !defined(__SPE__)
-# define ABORT_TRANSACTION_IMPL \
-    cmpwi    2,0;		\
-    beq      1f;		\
-    lwz      0,TM_CAPABLE(2);	\
-    cmpwi    0,0;		\
-    beq	     1f;		\
-    li       11,_ABORT_SYSCALL;	\
-    tabort.  11;		\
-    .align 4;			\
-1:
-#else
-# define ABORT_TRANSACTION_IMPL
-#endif
-#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL
-
 #define DO_CALL(syscall)						      \
-    ABORT_TRANSACTION							      \
     li 0,syscall;							      \
     sc
 
diff --git a/sysdeps/powerpc/powerpc64/sysdep.h b/sysdeps/powerpc/powerpc64/sysdep.h
index 2df1d9b6e6..50e64f9ce2 100644
--- a/sysdeps/powerpc/powerpc64/sysdep.h
+++ b/sysdeps/powerpc/powerpc64/sysdep.h
@@ -263,24 +263,7 @@ LT_LABELSUFFIX(name,_name_end): ; \
   TRACEBACK_MASK(name,mask);	\
   END_2(name)
 
-#if !IS_IN(rtld)
-# define ABORT_TRANSACTION_IMPL \
-    cmpdi    13,0;		\
-    beq      1f;		\
-    lwz      0,TM_CAPABLE(13);	\
-    cmpwi    0,0;		\
-    beq	     1f;		\
-    li       11,_ABORT_SYSCALL;	\
-    tabort.  11;		\
-    .p2align 4;			\
-1:
-#else
-# define ABORT_TRANSACTION_IMPL
-#endif
-#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL
-
 #define DO_CALL(syscall) \
-    ABORT_TRANSACTION \
     li 0,syscall; \
     sc
 
diff --git a/sysdeps/powerpc/sysdep.h b/sysdeps/powerpc/sysdep.h
index 8a6d236caa..c8bf25e870 100644
--- a/sysdeps/powerpc/sysdep.h
+++ b/sysdeps/powerpc/sysdep.h
@@ -21,8 +21,6 @@
  */
 #define _SYSDEPS_SYSDEP_H 1
 #include <bits/hwcap.h>
-#include <tls.h>
-#include <htm.h>
 
 #define PPC_FEATURE_970 (PPC_FEATURE_POWER4 + PPC_FEATURE_HAS_ALTIVEC)
 
@@ -166,22 +164,4 @@
 #define ALIGNARG(log2) log2
 #define ASM_SIZE_DIRECTIVE(name) .size name,.-name
 
-#else
-
-/* Linux kernel powerpc documentation [1] states issuing a syscall inside a
-   transaction is not recommended and may lead to undefined behavior.  It
-   also states syscalls do not abort transactions.  To avoid such traps,
-   we abort transaction just before syscalls.
-
-   [1] Documentation/powerpc/transactional_memory.txt [Syscalls]  */
-#if !IS_IN(rtld) && !defined(__SPE__)
-# define ABORT_TRANSACTION \
-  ({ 						\
-    if (THREAD_GET_TM_CAPABLE ())		\
-      __libc_tabort (_ABORT_SYSCALL);	\
-  })
-#else
-# define ABORT_TRANSACTION
-#endif
-
 #endif	/* __ASSEMBLER__ */
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
index 906882a65e..60b19482bb 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
@@ -70,7 +70,7 @@ do_set_elision_enable (int32_t elision_enable)
 			       & PPC_FEATURE2_HAS_HTM) ? 1 : 0;
 }
 
-/* The pthread->elision_enable tunable is 0 or 1 indicating that elision
+/* The pthread->elision_enable tunable is 0 or 1 indicating that elisionGLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
    should be disabled or enabled respectively.  The feature will only be used
    if it's supported by the hardware.  */
 
@@ -127,6 +127,26 @@ elision_init (int argc __attribute__ ((unused)),
 	       TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort));
 #endif
 
+  /* Linux from 3.9 through 4.2 do not abort HTM transaction on syscalls,
+     instead it suspend the transaction and resumes it when returning to
+     usercode.  The side-effects of the syscall will always remain visible,
+     even if the transaction is aborted.  This is an issue when transaction
+     is used along with futex syscall, on pthread_cond_wait for instance,
+     where futex might succeed but the transaction is rolled back leading
+     the pthread_cond object in an inconsistent state.
+
+     GLIBC used to prevent it by always aborting a transaction before issuing
+     a syscall.  Linux 4.2 also decided to abort active transaction in
+     syscalls which makes the GLIBC workaround superflouts.  Worse, GLIBC
+     transaction abortion leads to a performance issue on recent kernels.
+
+     So Lock Elision is just enabled when it has been explict set (either
+     by tunables of by a configure switch) and if kernel aborts HTM
+     transactions on syscalls (PPC_FEATURE2_HTM_NOSC)  */
+
+  __pthread_force_elision = __pthread_force_elision &&
+			    GLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
+
   if (!__pthread_force_elision)
     __elision_aconf.try_tbegin = 0; /* Disable elision on rwlocks.  */
 }
diff --git a/sysdeps/unix/sysv/linux/powerpc/not-errno.h b/sysdeps/unix/sysv/linux/powerpc/not-errno.h
deleted file mode 100644
index 27da21bdf1..0000000000
--- a/sysdeps/unix/sysv/linux/powerpc/not-errno.h
+++ /dev/null
@@ -1,30 +0,0 @@
-/* Syscall wrapper that do not set errno.  Linux powerpc version.
-   Copyright (C) 2018 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-/* __access_noerrno is used during process initialization in elf/dl-tunables.c
-   before the TCB is initialized, prohibiting the usage of
-   ABORT_TRANSACTION.  */
-#undef ABORT_TRANSACTION
-#define ABORT_TRANSACTION
-
-#include "sysdeps/unix/sysv/linux/not-errno.h"
-
-/* Recover ABORT_TRANSACTION's previous value, in order to not affect
-   other syscalls.  */
-#undef ABORT_TRANSACTION
-#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h b/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
index f7277d59e1..ec5c5250f8 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
@@ -109,7 +109,6 @@
     register long int r11 __asm__ ("r11");				\
     register long int r12 __asm__ ("r12");				\
     LOADARGS_##nr(name, args);						\
-    ABORT_TRANSACTION;							\
     __asm__ __volatile__						\
       ("sc   \n\t"							\
        "mfcr %0"							\
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h b/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
index 0956cf04a7..1f17f7bd5f 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
@@ -131,7 +131,6 @@
     register long int r7  __asm__ ("r7");				\
     register long int r8  __asm__ ("r8");				\
     LOADARGS_##nr (name, ##args);					\
-    ABORT_TRANSACTION;							\
     __asm__ __volatile__						\
       ("sc\n\t"								\
        "mfcr  %0\n\t"							\
diff --git a/sysdeps/unix/sysv/linux/powerpc/syscall.S b/sysdeps/unix/sysv/linux/powerpc/syscall.S
index 2da91721be..bbab613aca 100644
--- a/sysdeps/unix/sysv/linux/powerpc/syscall.S
+++ b/sysdeps/unix/sysv/linux/powerpc/syscall.S
@@ -18,7 +18,6 @@
 #include <sysdep.h>
 
 ENTRY (syscall)
-	ABORT_TRANSACTION
 	mr   r0,r3
 	mr   r3,r4
 	mr   r4,r5
-- 
2.17.1


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]