This is the mail archive of the mailing list for the glibc project.


Re: [PATCH] Remove atomic operations from malloc.c

On Wed, 2015-02-11 at 11:47 -0200, Adhemerval Zanella wrote:
> On 11-02-2015 11:29, Leonhard Holz wrote:
> >> I did not get into the changes themselves, but at least for powerpc
> >> (POWER8/16c/128T) I am not seeing improvements with the patch.  In fact
> >> it seems to increase contention:
> >>
> >>         time per iteration
> >> nths     master      patch
> >>    1     51.422     75.046
> >>    8     53.077     78.507
> >>   16     57.430     89.385
> >>   32     71.206    108.359
> >>   64    114.370    172.115
> >>  128    251.731    330.924
> >>
> >
> > Thank you for testing! Maybe the cost of a mutex_lock is higher on PowerPC than on i686? Anyway, it looks like I have to take a different approach...

I don't think it's just that, but it could be part of it.  When you use a
futex-based lock such as our lowlevellock, lock release needs an atomic
RMW operation as well (to find out whether there is any waiter).  That's
something that the (broken) list removal code doesn't need.
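
To illustrate the release-side cost, here is a minimal sketch of a
three-state futex-style lock.  It is not the glibc lowlevellock code:
the sketch_* names and the wakeups counter (which stands in for a real
FUTEX_WAKE call) are hypothetical, and the blocking path is left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Lock states of a common three-state futex lock design:
   0 = unlocked, 1 = locked with no waiters, 2 = locked, maybe waiters.  */
typedef struct
{
  atomic_int state;
  int wakeups;	/* Hypothetical counter standing in for FUTEX_WAKE calls.  */
} sketch_lock;

/* Hypothetical stand-in for the futex wake system call.  */
static void
sketch_futex_wake (sketch_lock *l)
{
  l->wakeups++;
}

/* Uncontended acquire: one acquire CAS from 0 to 1.
   Returns 1 on success, 0 if the lock was already held.  */
static int
sketch_trylock (sketch_lock *l)
{
  int expected = 0;
  return atomic_compare_exchange_strong_explicit (&l->state, &expected, 1,
						  memory_order_acquire,
						  memory_order_relaxed);
}

/* What a thread does before it would block: mark the lock as contended.
   Returns the previous state.  */
static int
sketch_mark_contended (sketch_lock *l)
{
  return atomic_exchange_explicit (&l->state, 2, memory_order_acquire);
}

/* Release: this has to be an atomic RMW (an exchange), not a plain store,
   because the old value tells the releasing thread whether to wake anyone.  */
static void
sketch_unlock (sketch_lock *l)
{
  int old = atomic_exchange_explicit (&l->state, 0, memory_order_release);
  assert (old != 0);	/* Unlocking an unlocked lock is a caller bug.  */
  if (old == 2)
    sketch_futex_wake (l);
}
```

Even when no thread is waiting, sketch_unlock still pays for the
exchange; only the wake call is skipped.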

> PowerPC now uses the default implementation in sysdeps/nptl/lowlevellock.h, which
> basically translates to an acquire CAS followed by a futex operation in the
> contended case.  So I think that for powerpc (especially with high SMT),
> busy-waiting like a spinlock yields better performance than issuing a
> futex operation.
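
For reference, the busy-wait idea mentioned above could be sketched as a
bounded spin phase in front of the blocking path.  This is only an
illustration under assumptions: sketch_spin_acquire and SKETCH_MAX_SPINS
are hypothetical names, the spin budget is an arbitrary value rather
than a tuned one, and the futex fallback itself is not shown.

```c
#include <stdatomic.h>

/* Hypothetical spin budget; a real value would need per-machine tuning.  */
#define SKETCH_MAX_SPINS 100

/* Try to take LOCK (0 = free, 1 = held) with an acquire CAS, retrying a
   bounded number of times.  Returns the number of failed attempts before
   success, or -1 if the budget ran out and the caller should fall back
   to blocking (for example with a futex wait).  */
static int
sketch_spin_acquire (atomic_int *lock)
{
  for (int spins = 0; spins < SKETCH_MAX_SPINS; spins++)
    {
      int expected = 0;
      if (atomic_compare_exchange_strong_explicit (lock, &expected, 1,
						   memory_order_acquire,
						   memory_order_relaxed))
	return spins;
    }
  return -1;
}
```

A caller would only issue the futex operation when this returns -1, so
short critical sections can be handed over without a system call.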

Using spin-waiting should help, but I would be cautious in just using
