This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 1/8] Add the low level infrastructure for pthreads lockelision with TSX


Andi,

This is looking much better. I have reviewed 1/8 through 6/8,
and I'm still looking at 7/8 and 8/8. I will be posting my
reviews of the other patches one by one.

Please take the time to answer my questions that follow and
explicitly acknowledge my comments in the patch below.

Torvald,

You and I had discussed that it might be problematic to have
knobs that change semantics, but at the end of the day I've
found at least one other example: fast libm, which will need
the same kind of semantics-changing knob in order to be enabled.

Therefore I can't draw a hard line saying that the library
semantics can't be changed at runtime. However, the hard line
I will draw is this: default library runtime behaviour must
not change existing semantics.

On 01/31/2013 05:29 PM, Andi Kleen wrote:
> Lock elision can be enabled/disabled using environment variables.
> It can be also enabled or disabled using new lock types for
> mutex and rwlocks. The adaptation parameters are also tunable.

Is it no longer true that you can enable or disable elision using
new lock types? Is this simply a cut-and-paste error from the previous
version of the text, where this was true?
 
> Changes with the RTM mutexes:
> -----------------------------
> Lock elision in pthreads is generally compatible with existing programs.
> There are some obscure exceptions, which are expected to be uncommon.
> See the manual for more details.

The more I review this, the more I think that for the *first*
implementation we are going to need to be conservative.
This includes no semantic changes to the existing POSIX API in the
default configuration, e.g. no forced elision.

However, I will lean very close to the letter of POSIX and allow
the elision implementation to lean on undefined behaviour in order
to achieve better performance.

> - A broken program that unlocks a free lock will crash.
>   There are ways around this with some tradeoffs (more code in hot paths)
>   This will also happen on systems without RTM with the patchkit.
>   I'm still undecided on what approach to take here; have to wait for testing reports.

This is OK, this is undefined behaviour.

> - pthread_mutex_destroy of a lock mutex will not return EBUSY but 0.

This is OK. Detecting a locked mutex and returning EBUSY is a
quality-of-implementation (QoI) issue, not a requirement.

> - mutex appears free when elided.
>   pthread_mutex_lock(mutex);
>   if (pthread_mutex_trylock(mutex) != 0) do_something
>   will not do something when the lock elided.
>   However note that if the check is an assert it works as expected because the
>   assert failure aborts and the region is re-executed non transactionally,
>   with the old behaviour.
>   The same change applies to write locks for rwlocks.
>   [This is now only done for mutexes that have elision explicitely enabled,
>    standard mutexes abort in this situation]

If I understand correctly then:

(a) If elision is forced on then trylock does not abort the transaction
    and simply returns that the lock is unlocked.
(b) Elision forced on is not the default and must be set via the environment
    variables.
(c) The default will continue to support trylock with the existing 
    POSIX semantics.

Is that correct?
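
To make sure we are talking about the same case, here is a minimal
sketch of the nested trylock pattern as I understand it. The helper
name `nested_trylock_result' is mine and is not part of the patch.
With the existing semantics a default mutex returns EBUSY here; my
reading of (a) is that with elision forced on it could return 0 instead.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper, not part of the patch: lock a default mutex and
   then call trylock on it from the same thread.  Returns the trylock
   result so the caller can see which semantics are in effect.  */
int
nested_trylock_result (void)
{
  pthread_mutex_t m;
  int rc;

  pthread_mutex_init (&m, NULL);
  pthread_mutex_lock (&m);

  /* Existing POSIX behaviour: the mutex is already locked, so EBUSY.  */
  rc = pthread_mutex_trylock (&m);
  if (rc == 0)
    /* Only reachable if the outer lock was elided and looked free.  */
    pthread_mutex_unlock (&m);

  pthread_mutex_unlock (&m);
  pthread_mutex_destroy (&m);
  return rc;
}
```

A program that branches on this result is what (c) has to keep working
in the default configuration.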

> - There's also a similar situation with trylock outside the mutex,
>   "knowing" that the mutex must be held due to some other condition.
>   In this case an assert failure cannot be recovered. This situation is
>   usually an existing bug in the program.

Have you provided an example of this for review? I'm having a hard
time coming up with this example.

> - Same applies to the rwlocks. Some of the return values changes
>   (for example there is no EDEADLK for an elided lock, unless it aborts.
>    However when elided it will also never deadlock of course)

As far as I can tell this is OK because it is not a semantic change
in the API?

> - Timing changes, so broken programs that make assumptions about specific timing
>   may expose already existing latent problems.  Note that these broken programs will
>   break in other situations too (loaded system, new faster hardware, compiler
>   optimizations etc.)

This is OK.

> Currently elision is enabled by default on systems that support RTM,
> unless explicitely disabled either in the program or by the user.

Is "enabled by default" not considered "forced on"?

Once elision is forced on, has the user requested a change in
semantics for the POSIX API?

> This patch implements the basic infrastructure for elision.
> 
> Open issues:
> - XTEST or not XTEST in unlock, see above.
> - Adaptation for rwlocks
> - Condition variables don't use elision so far
> - Adaptation tuning
> 
> 2013-01-30  Andi Kleen  <ak@linux.intel.com>
>             Hongjiu Lu  <hongjiu.lu@intel.com>
> 
> 	* nptl-init.c (__pthread_force_elision): Add.
> 	* pthreadP.h (__pthread_force_elision): Add.
> 	* sysdeps/unix/sysv/linux/i386/lowlevellock.h (__lll_timedwait_tid,
>           lll_timedlock_elision, __lll_lock_elision, __lll_unlock_elision,
>           __lll_trylock_elision, lll_lock_elision, lll_unlock_elision,
> 	  lll_trylock_elision): Add.
> 	* sysdeps/unix/sysv/linux/x86/Makefile: Imply x86
> 	* sysdeps/unix/sysv/linux/x86/elision-conf.c: New file.
> 	* sysdeps/unix/sysv/linux/x86/elision-conf.h: New file.
> 	* sysdeps/unix/sysv/linux/x86/elision-lock.c: New file.
> 	* sysdeps/unix/sysv/linux/x86/elision-timed.c: New file.
> 	* sysdeps/unix/sysv/linux/x86/elision-trylock.c: New file.
> 	* sysdeps/unix/sysv/linux/x86/elision-unlock.c: New file
> 	* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (__lll_timedwait_tid,
>           lll_timedlock_elision, __lll_lock_elision, __lll_unlock_elision,
>           __lll_trylock_elision, lll_lock_elision, lll_unlock_elision,
> 	  lll_trylock_elision): Add.
> 	* nptl/sysdeps/unix/sysv/linux/x86/hle.h: New file.
> 	* elision-conf.h: New file.

>  nptl/elision-conf.h                                |    1 +
>  nptl/nptl-init.c                                   |    1 +
>  nptl/pthreadP.h                                    |    2 +
>  nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h   |   22 ++
>  nptl/sysdeps/unix/sysv/linux/x86/Makefile          |    3 +
>  nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c    |  225 ++++++++++++++++++++
>  nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h    |   52 +++++
>  nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c    |   91 ++++++++
>  nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c   |    8 +
>  nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c |   70 ++++++
>  nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c  |   32 +++
>  nptl/sysdeps/unix/sysv/linux/x86/hle.h             |   75 +++++++
>  nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h |   23 ++
>  13 files changed, 605 insertions(+), 0 deletions(-)
>  create mode 100644 nptl/elision-conf.h
>  create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/Makefile
>  create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
>  create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
>  create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
>  create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c
>  create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c
>  create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c
>  create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/hle.h
> 
> diff --git a/nptl/elision-conf.h b/nptl/elision-conf.h
> new file mode 100644
> index 0000000..40a8c17
> --- /dev/null
> +++ b/nptl/elision-conf.h
> @@ -0,0 +1 @@
> +/* empty */
> diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
> index 19e6616..cc80549 100644
> --- a/nptl/nptl-init.c
> +++ b/nptl/nptl-init.c
> @@ -36,6 +36,7 @@
>  #include <lowlevellock.h>
>  #include <kernel-features.h>
>  
> +int __pthread_force_elision attribute_hidden;
>  
>  /* Size and alignment of static TLS block.  */
>  size_t __static_tls_size;
> diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
> index 993a79e..17973b2 100644
> --- a/nptl/pthreadP.h
> +++ b/nptl/pthreadP.h
> @@ -571,6 +571,8 @@ extern void __free_stacks (size_t limit) attribute_hidden;
>  
>  extern void __wait_lookup_done (void) attribute_hidden;
>  
> +extern int __pthread_force_elision attribute_hidden;
> +
>  #ifdef SHARED
>  # define PTHREAD_STATIC_FN_REQUIRE(name)
>  #else
> diff --git a/nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h b/nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h
> index f51f650..d2ef7de 100644
> --- a/nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h
> +++ b/nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h
> @@ -429,6 +429,12 @@ LLL_STUB_UNWIND_INFO_END
>  		       : "memory");					      \
>       result; })
>  
> +extern int __lll_timedlock_elision (int *futex, short *try_lock,
> +					 const struct timespec *timeout,
> +					 int private) attribute_hidden;
> +
> +#define lll_timedlock_elision(futex, try_lock, timeout, private)	\
> +  __lll_timedlock_elision(&(futex), &(try_lock), timeout, private)
>  
>  #define lll_robust_timedlock(futex, timeout, id, private) \
>    ({ int result, ignore1, ignore2, ignore3;				      \
> @@ -582,6 +588,22 @@ extern int __lll_timedwait_tid (int *tid, const struct timespec *abstime)
>        }									      \
>      __result; })
>  
> +extern int __lll_lock_elision (int *futex, short *try_lock, int private)
> +  attribute_hidden;
> +
> +extern int __lll_unlock_elision(int *lock, int private)
> +  attribute_hidden;
> +
> +extern int __lll_trylock_elision(int *lock, short *try_lock, int upgrade)
> +  attribute_hidden;
> +
> +#define lll_lock_elision(futex, try_lock, private) \
> +  __lll_lock_elision (&(futex), &(try_lock), private)
> +#define lll_unlock_elision(futex, private) \
> +  __lll_unlock_elision (&(futex), private)
> +#define lll_trylock_elision(futex, try_lock, upgrade) \
> +  __lll_trylock_elision(&(futex), &(try_lock), upgrade)
> +
>  #endif  /* !__ASSEMBLER__ */
>  
>  #endif	/* lowlevellock.h */
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/Makefile b/nptl/sysdeps/unix/sysv/linux/x86/Makefile
> new file mode 100644
> index 0000000..61b7552
> --- /dev/null
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/Makefile
> @@ -0,0 +1,3 @@
> +libpthread-sysdep_routines += init-arch
> +libpthread-sysdep_routines += elision-lock elision-unlock elision-timed \
> +			      elision-trylock
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
> new file mode 100644
> index 0000000..4f9949a
> --- /dev/null
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
> @@ -0,0 +1,225 @@
> +/* elision-conf.c: Lock elision tunable parameters.
> +   Copyright (C) 2013 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>. */
> +#include <pthreadP.h>
> +#include <sys/fcntl.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <init-arch.h>
> +#include "elision-conf.h"

Is there any reason you don't use <elision-conf.h> and allow the sysdeps
selection mechanism to choose elision-conf.h as appropriate? This would
allow later developers to perhaps override this with their own copies
using a ports style addon.

> +
> +struct elision_config __elision_aconf =
> +  {
> +    .retry_lock_busy = 3,
> +    .retry_lock_internal_abort = 3,
> +    .retry_try_xbegin = 3,
> +    .retry_trylock_internal_abort = 3,
> +  };
> +
> +struct tune
> +{
> +  const char *name;
> +  unsigned offset;
> +  int len;
> +};
> +
> +#define FIELD(x) { #x, offsetof(struct elision_config, x), sizeof(#x)-1 }
> +
> +static const struct tune tunings[] =
> +  {
> +    FIELD(retry_lock_busy),
> +    FIELD(retry_lock_internal_abort),
> +    FIELD(retry_try_xbegin),
> +    FIELD(retry_trylock_internal_abort),
> +    {}
> +  };
> +
> +#define PAIR(x) x, sizeof (x)-1
> +
> +static void
> +complain (const char *msg, int len)
> +{
> +  INTERNAL_SYSCALL_DECL (err);
> +  INTERNAL_SYSCALL (write, err, 3, 2, (char *)msg, len);
> +}
> +
> +static void
> +elision_aconf_setup(const char *s)
> +{
> +  int i;
> +
> +  while (*s)
> +    {
> +      for (i = 0; tunings[i].name; i++)
> +	{
> +	  int nlen = tunings[i].len;
> +
> +	  if (!strncmp (tunings[i].name, s, nlen))
> +	    {
> +	      char *end;
> +	      int val;
> +
> +	      if (s[nlen] != '=')
> +		{
> +  		  complain (PAIR("pthreads: invalid PTHREAD_MUTEX syntax: missing =\n"));
> +	 	  return;
> +		}
> +	      s += nlen + 1;
> +	      val = strtoul (s, &end, 0);
> +	      if (end == s)
> +		{
> +  		  complain (PAIR("pthreads: invalid PTHREAD_MUTEX syntax: missing number\n"));
> +	 	  return;
> +		}
> +	      *(int *)(((char *)&__elision_aconf) + tunings[i].offset) = val;
> +	      s = end;
> +	      if (*s == ',' || *s == ':')
> +		s++;
> +	      else if (*s)
> +		{
> +  		  complain (PAIR("pthreads: invalid PTHREAD_MUTEX syntax: garbage after number\n"));
> +	 	  return;
> +		}
> +	      break;
> +	    }
> +	}
> +      if (!tunings[i].name)
> +      	{
> +  	  complain (PAIR("pthreads: invalid PTHREAD_MUTEX syntax: unknown tunable\n"));
> + 	  return;
> +	}
> +    }
> +}
> +
> +int __rwlock_rtm_enabled attribute_hidden;
> +int __rwlock_rtm_read_retries attribute_hidden = 3;
> +int __elision_available attribute_hidden;
> +
> +#define PAIR(x) x, sizeof (x)-1
> +
> +static char *
> +next_env_entry (char first, char ***position)
> +{
> +  char **current = *position;
> +  char *result = NULL;
> +
> +  while (*current != NULL)
> +    {
> +      if ((*current)[0] == first)
> +	{
> +	  result = *current;
> +	  *position = ++current;
> +	  break;
> +	}
> +
> +      ++current;
> +    }
> +
> +  return result;
> +}
> +
> +static inline void
> +match (const char *line, const char *var, int len, const char **res)
> +{
> +  if (!strncmp (line, var, len))
> +    *res = line + len;
> +}
> +
> +static void
> +elision_mutex_init (const char *s)
> +{
> +  if (!s)
> +    return;
> +  if (!strncmp (s, "adaptive", 8) && (s[8] == 0 || s[8] == ':'))
> +    {
> +      __pthread_force_elision = __elision_available;
> +      if (s[8] == ':')
> +	elision_aconf_setup (s + 9);
> +    }
> +  else if (!strncmp (s, "elision", 7) && (s[7] == 0 || s[7] == ':'))
> +    {
> +      __pthread_force_elision = __elision_available;
> +      if (s[7] == ':')
> +        elision_aconf_setup (s + 8);
> +    }
> +  else if (!strncmp (s, "none", 4) && s[4] == 0)
> +    __pthread_force_elision = 0;
> +  else
> +    complain (PAIR("pthreads: Unknown setting for PTHREAD_MUTEX\n"));
> +}
> +
> +static void
> +elision_rwlock_init (const char *s)
> +{
> +  if (!s)
> +    {
> +      __rwlock_rtm_enabled = __elision_available;
> +      return;
> +    }
> +  if (!strncmp (s, "elision", 7))
> +    {
> +      __rwlock_rtm_enabled = __elision_available;
> +      if (s[7] == ':')
> +        {
> +          char *end;
> +	  int n;
> +
> +          n = strtoul (s + 8, &end, 0);
> +	  if (end == s + 8)
> +	    complain (PAIR("pthreads: Bad retry number for PTHREAD_RWLOCK\n"));
> +          else
> +	    __rwlock_rtm_read_retries = n;
> +	}
> +    }
> +  else if (!strncmp(s, "none", 4) && s[4] == 0)
> +    __rwlock_rtm_enabled = 0;
> +  else
> +    complain (PAIR("pthreads: Unknown setting for PTHREAD_RWLOCK\n"));
> +}
> +
> +static void
> +elision_init (int argc __attribute__ ((unused)),
> +	      char **argv  __attribute__ ((unused)),
> +	      char **environ)
> +{
> +  char *envline;
> +  const char *mutex = NULL, *rwlock = NULL;
> +
> +  __pthread_force_elision = 1;
> +  __elision_available = 1;
> +
> +  while ((envline = next_env_entry ('P', &environ)) != NULL)
> +    {
> +      match (envline, PAIR("PTHREAD_MUTEX="), &mutex);
> +      match (envline, PAIR("PTHREAD_RWLOCK="), &rwlock);

Please keep in mind that these are going to change.

We are currently working through another round of community
consensus over the use of environment variables and their
use in tuning runtime behaviour.

I'm writing up a proposal to pass around and get consensus.

This is another warning that `PTHREAD_MUTEX' as an env var
is not likely to be accepted as the final value of the env
var.
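
For anyone following along, the value syntax the patch currently accepts
after the mode name is a list of `name=number' pairs separated by ','
or ':'. Below is a standalone sketch of that parse, assuming only the
four tunables in this patch. The names `parse_tunables' and `demo_parse'
are hypothetical, and unlike the patch the sketch reports errors through
its return value instead of writing to stderr. Whatever env var name we
settle on, this is the part of the syntax that needs consensus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct elision_config
{
  int retry_lock_busy;
  int retry_lock_internal_abort;
  int retry_try_xbegin;
  int retry_trylock_internal_abort;
};

struct tune
{
  const char *name;
  size_t offset;
  size_t len;
};

#define FIELD(x) { #x, offsetof (struct elision_config, x), sizeof (#x) - 1 }

static const struct tune tunings[] =
  {
    FIELD (retry_lock_busy),
    FIELD (retry_lock_internal_abort),
    FIELD (retry_try_xbegin),
    FIELD (retry_trylock_internal_abort),
  };

/* Hypothetical helper: parse "name=num[,name=num...]" into CONF.
   Returns 0 on success and -1 on any syntax error.  */
int
parse_tunables (const char *s, struct elision_config *conf)
{
  const size_t n = sizeof tunings / sizeof tunings[0];

  while (*s != '\0')
    {
      size_t i;
      char *end;
      unsigned long val;

      /* Find the tunable whose name is followed by '='.  */
      for (i = 0; i < n; i++)
        if (strncmp (tunings[i].name, s, tunings[i].len) == 0
            && s[tunings[i].len] == '=')
          break;
      if (i == n)
        return -1;		/* Unknown tunable or missing '='.  */

      s += tunings[i].len + 1;
      val = strtoul (s, &end, 0);
      if (end == s)
        return -1;		/* Missing number.  */
      *(int *) ((char *) conf + tunings[i].offset) = (int) val;

      s = end;
      if (*s == ',' || *s == ':')
        s++;
      else if (*s != '\0')
        return -1;		/* Garbage after number.  */
    }
  return 0;
}

/* Usage: start from the patch defaults (all 3) and override two fields.
   Encodes the two overridden values as one number for easy checking.  */
int
demo_parse (void)
{
  struct elision_config conf = { 3, 3, 3, 3 };

  if (parse_tunables ("retry_lock_busy=5,retry_try_xbegin=2", &conf) != 0)
    return -1;
  return conf.retry_lock_busy * 10 + conf.retry_try_xbegin;
}
```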

> +    }
> +
> +  elision_mutex_init (mutex);
> +  elision_rwlock_init (rwlock);
> +}
> +
> +#ifdef SHARED
> +# define INIT_SECTION ".init_array"
> +#else
> +# define INIT_SECTION ".preinit_array"
> +#endif
> +
> +void (*const init_array []) (int, char **, char **)
> +  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) =
> +{
> +  &elision_init
> +};
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
> new file mode 100644
> index 0000000..b9a9402
> --- /dev/null
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
> @@ -0,0 +1,52 @@
> +/* elision-conf.h: Lock elision tunable parameters.
> +   Copyright (C) 2013 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>. */
> +#ifndef _ELISION_CONF_H
> +#define _ELISION_CONF_H 1
> +
> +#include <pthread.h>
> +#include <cpuid.h>
> +#include <time.h>
> +
> +/* Should make sure there is no false sharing on this */
> +
> +struct elision_config
> +{
> +  int retry_lock_busy;
> +  int retry_lock_internal_abort;
> +  int retry_try_xbegin;
> +  int retry_trylock_internal_abort;
> +};
> +
> +extern struct elision_config __elision_aconf attribute_hidden;
> +
> +extern int __rwlock_rtm_enabled;
> +extern int __elision_available;
> +
> +extern int __pthread_mutex_timedlock_nortm (pthread_mutex_t *mutex, const struct timespec *);
> +extern int __pthread_mutex_timedlock_rtm (pthread_mutex_t *mutex, const struct timespec *);
> +extern int __pthread_mutex_timedlock (pthread_mutex_t *mutex, const struct timespec *);
> +extern int __pthread_mutex_lock_nortm (pthread_mutex_t *mutex);
> +extern int __pthread_mutex_lock_rtm (pthread_mutex_t *mutex);
> +extern int __pthread_mutex_lock (pthread_mutex_t *mutex);
> +extern int __pthread_mutex_trylock_nortm (pthread_mutex_t *);
> +extern int __pthread_mutex_trylock_rtm (pthread_mutex_t *);
> +extern int __pthread_mutex_trylock (pthread_mutex_t *);
> +
> +#define SUPPORTS_ELISION 1

Please use `USE_ELISION'.

We use HAVE_*, NEED_*, and USE_*.

Either HAVE_FOO to indicate a configure detected value e.g. HAVE_IFUNC.

Or NEED_FOO to indicate that the code being included needs FOO to work e.g. NEED_DL_SYSINFO.

Or lastly USE_FOO to indicate that FOO is available and should be used e.g. USE_TLS.
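
As an illustration of the USE_* convention, the rename would look
roughly like this. The function `elision_compiled_in' is a hypothetical
stand-in for whatever generic code ends up testing the macro; it is not
in the patch.

```c
#include <assert.h>

/* Defined by the sysdeps elision-conf.h on targets that provide
   elision; left undefined by the generic (empty) header.  */
#define USE_ELISION 1

/* Hypothetical consumer: generic code checks USE_ELISION rather than
   SUPPORTS_ELISION to decide whether to call the elision paths.  */
int
elision_compiled_in (void)
{
#ifdef USE_ELISION
  return 1;
#else
  return 0;
#endif
}
```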

> +
> +#endif
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
> new file mode 100644
> index 0000000..a1e78b9
> --- /dev/null
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
> @@ -0,0 +1,91 @@
> +/* elision-lock.c: Elided pthread mutex lock.
> +   Copyright (C) 2011, 2012, 2013 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>. */
> +#include <pthread.h>
> +#include "pthreadP.h"
> +#include "lowlevellock.h"
> +#include "hle.h"
> +#include "elision-conf.h"
> +
> +#if !defined(LLL_LOCK) && !defined(EXTRAARG)
> +/* Make sure the configuration code is always linked in for static
> +   libraries. */
> +#include "elision-conf.c"
> +#endif
> +
> +#ifndef EXTRAARG
> +#define EXTRAARG
> +#endif
> +#ifndef LLL_LOCK
> +#define LLL_LOCK(a,b) lll_lock(a,b), 0
> +#endif
> +
> +#define aconf __elision_aconf
> +
> +/* Adaptive lock using transactions.
> +   By default the lock region is run as a transaction, and when it
> +   aborts or the lock is busy the lock adapts itself. */
> +
> +int
> +__lll_lock_elision (int *futex, short *try_lock, EXTRAARG int private)
> +{
> +  if (*try_lock <= 0)
> +    {
> +      unsigned status;
> +      int try_xbegin;
> +
> +      for (try_xbegin = aconf.retry_try_xbegin;
> +	   try_xbegin > 0;
> +	   try_xbegin--)
> +	{
> +	  if ((status = _xbegin()) == _XBEGIN_STARTED)
> +	    {
> +	      if (*futex == 0)
> +		return 0;
> +
> +	      /* Lock was busy. Fall back to normal locking.
> +		 Could also _xend here but xabort with 0xff code
> +		 is more visible in the profiler. */
> +	      _xabort (_ABORT_LOCK_BUSY);
> +	    }
> +
> +	  if (!(status & _XABORT_RETRY))
> +	    {
> +	      if ((status & _XABORT_EXPLICIT) && _XABORT_CODE (status) == 0xff)
> +	        {
> +		  if (*try_lock != aconf.retry_lock_busy)
> +		    *try_lock = aconf.retry_lock_busy;
> +		}
> +	      /* Internal abort. There is no chance for retry.
> +		 Use the normal locking and next time use lock.
> +		 Be careful to avoid writing to the lock. */
> +	      else if (*try_lock != aconf.retry_lock_internal_abort)
> +		*try_lock = aconf.retry_lock_internal_abort;
> +	      break;
> +	    }
> +	}
> +    }
> +  else
> +    {
> +      /* Use a normal lock until the threshold counter runs out.
> +	 Lost updates possible. */
> +      (*try_lock)--;
> +    }
> +
> +  /* Use a normal lock here */
> +  return LLL_LOCK ((*futex), private);
> +}
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c
> new file mode 100644
> index 0000000..1cad4779
> --- /dev/null
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c
> @@ -0,0 +1,8 @@
> +#include <time.h>
> +#include "elision-conf.h"
> +#include "lowlevellock.h"
> +#define __lll_lock_elision __lll_timedlock_elision
> +#define EXTRAARG const struct timespec *t,
> +#undef LLL_LOCK
> +#define LLL_LOCK(a, b) lll_timedlock(a, t, b)
> +#include "elision-lock.c"
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c
> new file mode 100644
> index 0000000..85de6e1
> --- /dev/null
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c
> @@ -0,0 +1,70 @@
> +/* elision-trylock.c: Lock eliding trylock for pthreads.
> +   Copyright (C) 2013 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>. */
> +#include <pthread.h>
> +#include <pthreadP.h>
> +#include <lowlevellock.h>
> +#include "hle.h"
> +#include "elision-conf.h"
> +
> +#define aconf __elision_aconf
> +
> +/* Try to elide a futex trylock. FUTEX is the futex variable. TRY_LOCK is the
> +   adaptation counter in the mutex. UPGRADED is != 0 when this is for an
> +   automatically upgraded lock.  */
> +
> +int
> +__lll_trylock_elision (int *futex, short *try_lock, int upgraded)
> +{
> +  /* Only try a transaction if it's worth it */
> +  if (*try_lock <= 0)
> +    {
> +      unsigned status;
> +
> +      /* When this could be a nested trylock that is not explicitely
> +	 declared an elided lock abort. This makes us follow POSIX
> +	 paper semantics. */
> +      if (upgraded)
> +        _xabort (_ABORT_NESTED_TRYLOCK);
> +
> +      if ((status = _xbegin()) == _XBEGIN_STARTED)
> +	{
> +	  if (*futex == 0)
> +	    return 0;
> +
> +	  /* Lock was busy. Fall back to normal locking.
> +	     Could also _xend here but xabort with 0xff code
> +	     is more visible in the profiler. */
> +	  _xabort (_ABORT_LOCK_BUSY);
> +	}
> +
> +      if (!(status & _XABORT_RETRY))
> +        {
> +          /* Internal abort. No chance for retry. For future
> +             locks don't try speculation for some time. */
> +          if (*try_lock != aconf.retry_trylock_internal_abort)
> +            *try_lock = aconf.retry_trylock_internal_abort;
> +        }
> +    }
> +  else
> +    {
> +      /* Lost updates are possible, but harmless. */
> +      (*try_lock)--;
> +    }
> +
> +  return lll_trylock (*futex);
> +}
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c
> new file mode 100644
> index 0000000..0e74c8e
> --- /dev/null
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c
> @@ -0,0 +1,32 @@
> +/* elision-unlock.c: Commit an elided pthread lock.
> +   Copyright (C) 2013 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +#include "pthreadP.h"
> +#include "lowlevellock.h"
> +#include "hle.h"
> +
> +int
> +__lll_unlock_elision(int *lock, int private)
> +{
> +  /* When the lock was free we're in a transaction.
> +     When you crash here you unlocked a free lock. */
> +  if (*lock == 0)
> +    _xend();
> +  else
> +    lll_unlock ((*lock), private);
> +  return 0;
> +}
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/hle.h b/nptl/sysdeps/unix/sysv/linux/x86/hle.h
> new file mode 100644
> index 0000000..a08f0fa
> --- /dev/null
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/hle.h
> @@ -0,0 +1,75 @@
> +/* Shared RTM header. Emulate TSX intrinsics for compilers and assemblers
> +   that do not support the intrinsics and instructions yet. */
> +#ifndef _HLE_H
> +#define _HLE_H 1
> +
> +#ifdef __ASSEMBLER__
> +
> +.macro XBEGIN target
> +	.byte 0xc7,0xf8
> +	.long \target-1f
> +1:
> +.endm
> +
> +.macro XEND
> +	.byte 0x0f,0x01,0xd5
> +.endm
> +
> +.macro XABORT code
> +	.byte 0xc6,0xf8,\code
> +.endm
> +
> +.macro XTEST
> +	 .byte 0x0f,0x01,0xd6
> +.endm
> +
> +#endif
> +
> +/* Official RTM intrinsics interface matching gcc/icc, but works
> +   on older gcc compatible compilers and binutils.
> +   We should somehow detect if the compiler supports it, because
> +   it may be able to generate slightly better code. */
> +
> +#define _XBEGIN_STARTED		(~0u)
> +#define _XABORT_EXPLICIT	(1 << 0)
> +#define _XABORT_RETRY		(1 << 1)
> +#define _XABORT_CONFLICT	(1 << 2)
> +#define _XABORT_CAPACITY	(1 << 3)
> +#define _XABORT_DEBUG		(1 << 4)
> +#define _XABORT_NESTED		(1 << 5)
> +#define _XABORT_CODE(x)		(((x) >> 24) & 0xff)
> +
> +#define _ABORT_LOCK_BUSY 	0xff
> +#define _ABORT_LOCK_IS_LOCKED	0xfe
> +#define _ABORT_NESTED_TRYLOCK	0xfd
> +
> +#ifndef __ASSEMBLER__
> +
> +#define __force_inline __attribute__((__always_inline__)) inline
> +
> +static __force_inline int _xbegin(void)
> +{
> +  int ret = _XBEGIN_STARTED;
> +  asm volatile (".byte 0xc7,0xf8 ; .long 0" : "+a" (ret) :: "memory");
> +  return ret;
> +}
> +
> +static __force_inline void _xend(void)
> +{
> +  asm volatile (".byte 0x0f,0x01,0xd5" ::: "memory");
> +}
> +
> +static __force_inline void _xabort(const unsigned int status)
> +{
> +  asm volatile (".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory");
> +}
> +
> +static __force_inline int _xtest(void)
> +{
> +  unsigned char out;
> +  asm volatile (".byte 0x0f,0x01,0xd6 ; setnz %0" : "=r" (out) :: "memory");
> +  return out;
> +}
> +
> +#endif
> +#endif
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h b/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
> index 6722294..98e7358 100644
> --- a/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
> +++ b/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
> @@ -426,6 +426,13 @@ LLL_STUB_UNWIND_INFO_END
>  		       : "memory", "cx", "cc", "r10", "r11");		      \
>       result; })
>  
> +extern int __lll_timedlock_elision (int *futex, short *try_lock,
> +					 const struct timespec *timeout,
> +					 int private) attribute_hidden;
> +
> +#define lll_timedlock_elision(futex, try_lock, timeout, private)	\
> +  __lll_timedlock_elision(&(futex), &(try_lock), timeout, private)
> +
>  #define lll_robust_timedlock(futex, timeout, id, private) \
>    ({ int result, ignore1, ignore2, ignore3;				      \
>       __asm __volatile (LOCK_INSTR "cmpxchgl %1, %4\n\t"			      \
> @@ -596,6 +603,22 @@ extern int __lll_timedwait_tid (int *tid, const struct timespec *abstime)
>        }									      \
>      __result; })
>  
> +extern int __lll_lock_elision (int *futex, short *try_lock, int private)
> +  attribute_hidden;
> +
> +extern int __lll_unlock_elision(int *lock, int private)
> +  attribute_hidden;
> +
> +extern int __lll_trylock_elision(int *lock, short *try_lock, int upgraded)
> +  attribute_hidden;
> +
> +#define lll_lock_elision(futex, try_lock, private) \
> +  __lll_lock_elision (&(futex), &(try_lock), private)
> +#define lll_unlock_elision(futex, private) \
> +  __lll_unlock_elision (&(futex), private)
> +#define lll_trylock_elision(futex, try_lock, upgraded) \
> +  __lll_trylock_elision(&(futex), &(try_lock), upgraded)
> +
>  #endif  /* !__ASSEMBLER__ */
>  
>  #endif	/* lowlevellock.h */
> 
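
As a reading aid for the adaptation logic in elision-lock.c, here is a
small model of the skip counter that runs without TSX. It is only a
sketch: `model_lock' and `calls_until_retry' are hypothetical names, the
enum constants stand in for the `aconf' fields at their default of 3,
and the result of every _xbegin attempt is supplied by the caller
instead of coming from hardware.

```c
#include <assert.h>

/* Status bits copied from the patch's hle.h.  */
#define _XBEGIN_STARTED		(~0u)
#define _XABORT_EXPLICIT	(1 << 0)
#define _XABORT_RETRY		(1 << 1)
#define _XABORT_CAPACITY	(1 << 3)
#define _XABORT_CODE(x)		(((x) >> 24) & 0xff)
#define _ABORT_LOCK_BUSY	0xff

/* Stand-ins for aconf.retry_* at the patch defaults.  */
enum
{
  RETRY_TRY_XBEGIN = 3,
  RETRY_LOCK_BUSY = 3,
  RETRY_LOCK_INTERNAL_ABORT = 3
};

/* Model one lock call.  STATUS is what every _xbegin attempt is assumed
   to return.  Returns 1 if the lock would be elided (lock assumed
   free), 0 if the call would fall back to the normal lock.  */
int
model_lock (short *try_lock, unsigned status)
{
  if (*try_lock <= 0)
    {
      int try_xbegin;

      for (try_xbegin = RETRY_TRY_XBEGIN; try_xbegin > 0; try_xbegin--)
        {
          if (status == _XBEGIN_STARTED)
            return 1;

          if (!(status & _XABORT_RETRY))
            {
              /* No point retrying: skip elision for a while.  */
              if ((status & _XABORT_EXPLICIT)
                  && _XABORT_CODE (status) == _ABORT_LOCK_BUSY)
                *try_lock = RETRY_LOCK_BUSY;
              else
                *try_lock = RETRY_LOCK_INTERNAL_ABORT;
              break;
            }
        }
    }
  else
    /* Still skipping: count down towards the next elision attempt.  */
    (*try_lock)--;

  return 0;
}

/* Number of plain-lock calls after one aborted attempt before the
   model is willing to try a transaction again.  */
int
calls_until_retry (unsigned status)
{
  short try_lock = 0;
  int calls = 0;

  model_lock (&try_lock, status);
  while (try_lock > 0)
    {
      model_lock (&try_lock, status);
      calls++;
    }
  return calls;
}
```

In this model a non-retryable abort, for example a capacity abort, costs
three subsequent calls on the normal lock before elision is attempted
again, while an abort with only _XABORT_RETRY set leaves the counter
untouched.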

Cheers,
Carlos.

