


Re: [PATCH] New condvar implementation that provides stronger ordering guarantees.


On Fri, 2015-07-03 at 13:28 +0200, Ondřej Bílka wrote:
> On Fri, Jul 03, 2015 at 11:03:25AM +0200, Torvald Riegel wrote:
> > On Thu, 2015-07-02 at 23:48 +0200, Ondřej Bílka wrote:
> > > > > > > * If condvars use short critical sections (ie, hold the mutex just to
> > > > > > > check a binary flag or such), which they should do ideally, then forcing
> > > > > > > all those waiters to proceed serially with kernel-based hand-off (ie,
> > > > > > > futex ops in the mutex's contended state, via the futex wait queues) will
> > > > > > > be less efficient than just letting a scalable mutex implementation take
> > > > > > > care of it.  Our current mutex impl doesn't employ spinning at all, but
> > > > > > > if critical sections are short, spinning can be much better.
> > > > > 
> > > > > That looks like it would complicate the code too much; how do
> > > > > you want to pass the information to differentiate between
> > > > > signal and broadcast?
> > > > 
> > > > I don't understand your question.  How does it relate to my paragraph
> > > > you replied to?
> > > >
> > > A problem is that a mutex doesn't just protect short critical
> > > sections; for example, a C++ monitor has other methods that hold
> > > the lock for a long time.
> > 
> > Yes, we may not just be dealing with short critical sections.  However,
> > if we have long critical sections on the waiter side, and broadcast or
> > many signals wake many waiters, then some potential slowdown due to
> > contention is less important because the program isn't scalable in the
> > first place (ie, has many long critical sections that need to be
> > serialized).  Specifically, one of the long critical sections will get
> > the lock; while it has it, the others will sort things out, in parallel
> > with the critical section that is running.  That means we may do some
> > extra work, but it won't slow down the critical section that has the
> > lock.
> >
> My argument was that the other method runs in a different thread, but
> the waiters could be fast.  So it could be hard to distinguish between
> these cases.
> 
> The extra work matters if a program uses multiple threads effectively
> and could run computation instead of rescheduling threads that will
> only find the lock locked.

If a program relies on threads waiting to acquire a lock to give up the
compute resource so that it can run other stuff instead, the program
isn't using threads efficiently.  It's oversubscribing the machine, and
the context switches will matter more than a bit of contention.
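
To make the "short critical sections" case from the quoted paragraph
concrete, here is a minimal sketch of the pattern where the mutex
protects nothing but a binary flag; the names (ready, waiter,
signaller) are illustrative, not glibc internals:

  #include <pthread.h>
  #include <stdbool.h>

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
  static bool ready = false;

  static void *
  waiter (void *arg)
  {
    pthread_mutex_lock (&lock);
    /* The critical section is just this flag check; after each wakeup
       the waiter re-acquires the mutex only long enough to re-test.  */
    while (!ready)
      pthread_cond_wait (&cond, &lock);
    pthread_mutex_unlock (&lock);
    /* Any long-running work happens outside the critical section.  */
    return arg;
  }

  static void
  signaller (void)
  {
    pthread_mutex_lock (&lock);
    ready = true;
    pthread_mutex_unlock (&lock);
    /* With critical sections this short, what dominates is how
       efficiently the woken waiters re-acquire the mutex.  */
    pthread_cond_broadcast (&cond);
  }

  int
  main (void)
  {
    pthread_t t;
    pthread_create (&t, NULL, waiter, NULL);
    signaller ();
    pthread_join (t, NULL);
    return 0;
  }

When a broadcast wakes many such waiters at once, serializing them
through kernel-based hand-off costs a futex transition per waiter,
whereas a mutex that spins briefly can often hand the lock over
entirely in user space.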
 
> > [...] we don't really do, even ADAPTIVE_MUTEX has only very
> > simplistic spinning)
> > to not actually block via futexes (ie, the slow path).  So, this is more
> > about the lock implementation and the workload than about what the
> > condvar is doing; the existing condvar code tries to be nicer to the
> > existing lock implementation but, as I wrote, I don't think this is
> > really effective, prevents lock elision usage, and is likely better
> > addressed by improving the scalability of our lock implementation.
> > 
> > Does that answer your question?
> > 
> Not completely.  It looks like no requeue will help, but I wanted some
> data about workloads instead of relying on just intuition to see how
> that should be optimized.  I mainly don't believe that spinning would
> be of much help, as the conditions for it are narrow.

The literature says otherwise, and many other real-world lock
implementations do spin for a while before blocking.  Just look at Java
locks, for example. 
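
For reference, here is a hedged sketch of the spin-then-block technique
those implementations use.  This is not glibc's mutex code; SPIN_COUNT
is an arbitrary illustrative knob, and the futex calls are raw
Linux-specific syscalls:

  #include <stdatomic.h>
  #include <linux/futex.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  /* 0 = unlocked, 1 = locked, 2 = locked and possibly contended.  */
  static atomic_int lock_state;

  /* Illustrative bound; real implementations tune or adapt this.  */
  #define SPIN_COUNT 100

  static void
  futex_wait (atomic_int *addr, int val)
  {
    syscall (SYS_futex, addr, FUTEX_WAIT, val, NULL, NULL, 0);
  }

  static void
  futex_wake_one (atomic_int *addr)
  {
    syscall (SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
  }

  void
  spin_then_block_lock (void)
  {
    /* Fast path: uncontended acquire.  */
    int expected = 0;
    if (atomic_compare_exchange_strong (&lock_state, &expected, 1))
      return;

    /* Spin for a bounded time: with short critical sections the owner
       will likely release the lock before a futex call could finish.  */
    for (int i = 0; i < SPIN_COUNT; i++)
      {
        /* Read-only spin keeps the cache line in shared state.  */
        if (atomic_load_explicit (&lock_state, memory_order_relaxed) == 0)
          {
            expected = 0;
            if (atomic_compare_exchange_weak (&lock_state, &expected, 1))
              return;
          }
      }

    /* Slow path: advertise contention, then sleep in the kernel,
       re-checking the state on every wakeup.  */
    while (atomic_exchange (&lock_state, 2) != 0)
      futex_wait (&lock_state, 2);
  }

  void
  spin_then_block_unlock (void)
  {
    /* Enter the kernel only if someone may be sleeping.  */
    if (atomic_exchange (&lock_state, 0) == 2)
      futex_wake_one (&lock_state);
  }

The conditions are indeed narrow: spinning only pays off when the lock
holder is running on another CPU and the critical section is shorter
than roughly a context switch.  But a bounded spin costs little when it
loses, which is why spin-then-block is such a common default.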


