This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC


Hi Zanella,

On 08/27/2018 05:54 PM, Adhemerval Zanella wrote:
Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
instead it suspend and resume it when leaving the kernel.  The
side-effects of the syscall will always remain visible, even if the
transaction is aborted.  This is an issue when transaction is used along
with futex syscall, on pthread_cond_wait for instance, where the futex
call might succeed but the transaction is rolled back leading the
pthread_cond object in an inconsistent state.

GLIBC used to prevent it by always aborting a transaction before issuing
a syscall.  Linux 4.2 also decided to abort active transaction in
syscalls which makes the GLIBC workaround superflours.  Worse, GLIBC
transaction abortion leads to a performance issue on recent kernels
where the HTM state is saved/restore lazilyi (v4.9).  By aborting a
transaction on every syscalls, regardless whether a transaction has being
initiated before, GLIBS makes the kernel always save/restore HTM state
(it can not even lazily disable it after a certain number of syscall
iterations).

Because of this shortcoming, Lock Elision is just enabled when it has
been explictly set (either by tunables of by a configure switch) and if
kernel aborts HTM transactions on syscalls (PPC_FEATURE2_HTM_NOSC).  It
is reported that using simple benchmark [1], the context-switch is about
5% faster by not issuing a tabort in every syscall in newer kernels.

Checked on powerpc64le-linux-gnu with 4.4.0 kernel (Ubuntu 16.04).

Thanks a lot for addressing this issue.

Since the lazy mechanism was introduced in 4.9:

commit 5d176f751ee3c6eededd984ad409bff201f436a7
Author: Cyril Bur <cyrilbur@gmail.com>
Date:   Wed Sep 14 18:02:16 2016 +1000

    powerpc: tm: Enable transactional memory (TM) lazily for userspace

I think it's not a bad idea to mention that it was also tested on upstream
kernel (4.18).

LGTM.

Only some nits:

s/Lock Elision/Transactional Lock Elision/
s/superflours/superfluous/
s/lazilyi/lazily/
s/explictly/explicitly/

Acked-by: Gustavo Romero <gromero@linux.ibm.com>


Best regards,
Gustavo

	* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
	* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
	__ununsed1.
	(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
	(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
	* sysdeps/powerpc/powerpc32/sysdep.h,
	sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
	ABORT_TRANSACTION): Remove macros.
	* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
	__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
	sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
	sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
	usage.

Reported-by: Breno Leitão <leitao@debian.org>

[1] https://github.com/gromero/tabort
---
  ChangeLog                                          | 18 ++++++++++++++++
  sysdeps/powerpc/nptl/tcb-offsets.sym               |  1 -
  sysdeps/powerpc/nptl/tls.h                         | 13 +-----------
  sysdeps/powerpc/powerpc32/sysdep.h                 | 17 ---------------
  sysdeps/powerpc/powerpc64/sysdep.h                 | 17 ---------------
  sysdeps/powerpc/sysdep.h                           | 20 ------------------
  sysdeps/unix/sysv/linux/powerpc/elision-conf.c     | 24 ++++++++++++++++++++++
  sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h |  1 -
  sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h |  1 -
  sysdeps/unix/sysv/linux/powerpc/syscall.S          |  1 -
  10 files changed, 43 insertions(+), 70 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 157689f..8bf4029 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,21 @@
+2018-08-27  Adhemerval Zanella  <adhemerval.zanella@linaro.org>
+
+	* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
+	* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
+	__ununsed1.
+	(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
+	(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
+	* sysdeps/powerpc/powerpc32/sysdep.h,
+	sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
+	ABORT_TRANSACTION): Remove macros.
+	* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
+	* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
+	__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
+	* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
+	sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
+	sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
+	usage.
+
  2018-08-27  Joseph Myers  <joseph@codesourcery.com>

  	* sysdeps/generic/math-tests-trap-force.h: New file.
diff --git a/sysdeps/powerpc/nptl/tcb-offsets.sym b/sysdeps/powerpc/nptl/tcb-offsets.sym
index e5bb2b3..4c01615 100644
--- a/sysdeps/powerpc/nptl/tcb-offsets.sym
+++ b/sysdeps/powerpc/nptl/tcb-offsets.sym
@@ -21,7 +21,6 @@ DSO_SLOT2			(offsetof (tcbhead_t, dso_slot2) - TLS_TCB_OFFSET - sizeof (tcbhead_
  #ifdef __powerpc64__
  TCB_AT_PLATFORM			(offsetof (tcbhead_t, at_platform) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
  #endif
-TM_CAPABLE			(offsetof (tcbhead_t, tm_capable) - TLS_TCB_OFFSET - sizeof (tcbhead_t))
  #ifndef __powerpc64__
  TCB_AT_PLATFORM			(offsetof (tcbhead_t, at_platform) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
  PADDING				(offsetof (tcbhead_t, padding) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
diff --git a/sysdeps/powerpc/nptl/tls.h b/sysdeps/powerpc/nptl/tls.h
index f88fed5..8db970d 100644
--- a/sysdeps/powerpc/nptl/tls.h
+++ b/sysdeps/powerpc/nptl/tls.h
@@ -67,8 +67,7 @@ typedef struct
    uint32_t padding;
    uint32_t at_platform;
  #endif
-  /* Indicate if HTM capable (ISA 2.07).  */
-  uint32_t tm_capable;
+  uint32_t __unused1;
    /* Reservation for AT_PLATFORM data - powerpc64.  */
  #ifdef __powerpc64__
    uint32_t at_platform;
@@ -142,7 +141,6 @@ register void *__thread_register __asm__ ("r13");
  # define TLS_INIT_TP(tcbp) \
    ({ 									      \
      __thread_register = (void *) (tcbp) + TLS_TCB_OFFSET;		      \
-    THREAD_SET_TM_CAPABLE (__tcb_hwcap & PPC_FEATURE2_HAS_HTM ? 1 : 0);	      \
      THREAD_SET_HWCAP (__tcb_hwcap);					      \
      THREAD_SET_AT_PLATFORM (__tcb_platform);				      \
      NULL;								      \
@@ -151,8 +149,6 @@ register void *__thread_register __asm__ ("r13");
  /* Value passed to 'clone' for initialization of the thread register.  */
  # define TLS_DEFINE_INIT_TP(tp, pd) \
      void *tp = (void *) (pd) + TLS_TCB_OFFSET + TLS_PRE_TCB_SIZE;	      \
-    (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].tm_capable) =	      \
-      THREAD_GET_TM_CAPABLE ();						      \
      (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].hwcap) =	      \
        THREAD_GET_HWCAP ();						      \
      (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].at_platform) =	      \
@@ -210,13 +206,6 @@ register void *__thread_register __asm__ ("r13");
  		     + TLS_PRE_TCB_SIZE))[-1].pointer_guard		      \
       = THREAD_GET_POINTER_GUARD())

-/* tm_capable field in TCB head.  */
-# define THREAD_GET_TM_CAPABLE() \
-    (((tcbhead_t *) ((char *) __thread_register				      \
-		     - TLS_TCB_OFFSET))[-1].tm_capable)
-# define THREAD_SET_TM_CAPABLE(value) \
-    (THREAD_GET_TM_CAPABLE () = (value))
-
  /* hwcap field in TCB head.  */
  # define THREAD_GET_HWCAP() \
      (((tcbhead_t *) ((char *) __thread_register				      \
diff --git a/sysdeps/powerpc/powerpc32/sysdep.h b/sysdeps/powerpc/powerpc32/sysdep.h
index 5f1294e..93097c5 100644
--- a/sysdeps/powerpc/powerpc32/sysdep.h
+++ b/sysdeps/powerpc/powerpc32/sysdep.h
@@ -90,24 +90,7 @@ GOT_LABEL:			;					      \
    cfi_endproc;								      \
    ASM_SIZE_DIRECTIVE(name)

-#if !IS_IN(rtld) && !defined(__SPE__)
-# define ABORT_TRANSACTION_IMPL \
-    cmpwi    2,0;		\
-    beq      1f;		\
-    lwz      0,TM_CAPABLE(2);	\
-    cmpwi    0,0;		\
-    beq	     1f;		\
-    li       11,_ABORT_SYSCALL;	\
-    tabort.  11;		\
-    .align 4;			\
-1:
-#else
-# define ABORT_TRANSACTION_IMPL
-#endif
-#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL
-
  #define DO_CALL(syscall)						      \
-    ABORT_TRANSACTION							      \
      li 0,syscall;							      \
      sc

diff --git a/sysdeps/powerpc/powerpc64/sysdep.h b/sysdeps/powerpc/powerpc64/sysdep.h
index 2df1d9b..50e64f9 100644
--- a/sysdeps/powerpc/powerpc64/sysdep.h
+++ b/sysdeps/powerpc/powerpc64/sysdep.h
@@ -263,24 +263,7 @@ LT_LABELSUFFIX(name,_name_end): ; \
    TRACEBACK_MASK(name,mask);	\
    END_2(name)

-#if !IS_IN(rtld)
-# define ABORT_TRANSACTION_IMPL \
-    cmpdi    13,0;		\
-    beq      1f;		\
-    lwz      0,TM_CAPABLE(13);	\
-    cmpwi    0,0;		\
-    beq	     1f;		\
-    li       11,_ABORT_SYSCALL;	\
-    tabort.  11;		\
-    .p2align 4;			\
-1:
-#else
-# define ABORT_TRANSACTION_IMPL
-#endif
-#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL
-
  #define DO_CALL(syscall) \
-    ABORT_TRANSACTION \
      li 0,syscall; \
      sc

diff --git a/sysdeps/powerpc/sysdep.h b/sysdeps/powerpc/sysdep.h
index 8a6d236..c8bf25e 100644
--- a/sysdeps/powerpc/sysdep.h
+++ b/sysdeps/powerpc/sysdep.h
@@ -21,8 +21,6 @@
   */
  #define _SYSDEPS_SYSDEP_H 1
  #include <bits/hwcap.h>
-#include <tls.h>
-#include <htm.h>

  #define PPC_FEATURE_970 (PPC_FEATURE_POWER4 + PPC_FEATURE_HAS_ALTIVEC)

@@ -166,22 +164,4 @@
  #define ALIGNARG(log2) log2
  #define ASM_SIZE_DIRECTIVE(name) .size name,.-name

-#else
-
-/* Linux kernel powerpc documentation [1] states issuing a syscall inside a
-   transaction is not recommended and may lead to undefined behavior.  It
-   also states syscalls do not abort transactions.  To avoid such traps,
-   we abort transaction just before syscalls.
-
-   [1] Documentation/powerpc/transactional_memory.txt [Syscalls]  */
-#if !IS_IN(rtld) && !defined(__SPE__)
-# define ABORT_TRANSACTION \
-  ({ 						\
-    if (THREAD_GET_TM_CAPABLE ())		\
-      __libc_tabort (_ABORT_SYSCALL);	\
-  })
-#else
-# define ABORT_TRANSACTION
-#endif
-
  #endif	/* __ASSEMBLER__ */
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
index 906882a..508b917 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
@@ -127,6 +127,30 @@ elision_init (int argc __attribute__ ((unused)),
  	       TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort));
  #endif

+  /* Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
+     instead it suspend and resume it when leaving the kernel.  The
+     side-effects of the syscall will always remain visible, even if the
+     transaction is aborted.  This is an issue when transaction is used along
+     with futex syscall, on pthread_cond_wait for instance, where the futex
+     call might succeed but the transaction is rolled back leading the
+     pthread_cond object in an inconsistent state.
+
+     GLIBC used to prevent it by always aborting a transaction before issuing
+     a syscall.  Linux 4.2 also decided to abort active transaction in
+     syscalls which makes the GLIBC workaround superflours.  Worse, GLIBC
+     transaction abortion leads to a performance issue on recent kernels
+     where the HTM state is saved/restore lazily.  By aborting a transaction
+     on every syscalls, regardless whether a transaction has being initiated
+     before, glibc make the kernel always save/restore HTM state (it can not
+     even lazily disable it after a certain number of syscall iterations).
+
+     Because of this shortcoming, Lock Elision is just enabled when it has
+     been explictly set (either by tunables of by a configure switch) and if
+     kernel aborts HTM transactions on syscalls (PPC_FEATURE2_HTM_NOSC)  */
+
+  __pthread_force_elision = __pthread_force_elision &&
+			    GLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
+
    if (!__pthread_force_elision)
      __elision_aconf.try_tbegin = 0; /* Disable elision on rwlocks.  */
  }
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h b/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
index f7277d5..ec5c525 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
@@ -109,7 +109,6 @@
      register long int r11 __asm__ ("r11");				\
      register long int r12 __asm__ ("r12");				\
      LOADARGS_##nr(name, args);						\
-    ABORT_TRANSACTION;							\
      __asm__ __volatile__						\
        ("sc   \n\t"							\
         "mfcr %0"							\
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h b/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
index 0956cf0..1f17f7b 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
@@ -131,7 +131,6 @@
      register long int r7  __asm__ ("r7");				\
      register long int r8  __asm__ ("r8");				\
      LOADARGS_##nr (name, ##args);					\
-    ABORT_TRANSACTION;							\
      __asm__ __volatile__						\
        ("sc\n\t"								\
         "mfcr  %0\n\t"							\
diff --git a/sysdeps/unix/sysv/linux/powerpc/syscall.S b/sysdeps/unix/sysv/linux/powerpc/syscall.S
index 2da9172..bbab613 100644
--- a/sysdeps/unix/sysv/linux/powerpc/syscall.S
+++ b/sysdeps/unix/sysv/linux/powerpc/syscall.S
@@ -18,7 +18,6 @@
  #include <sysdep.h>

  ENTRY (syscall)
-	ABORT_TRANSACTION
  	mr   r0,r3
  	mr   r3,r4
  	mr   r4,r5



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]