This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.


Re: gcc 4.1 implements compiler builtins for atomic ops

From: "David S. Miller" <davem at davemloft dot net>
To: drepper at redhat dot com
Cc: benh at kernel dot crashing dot org, libc-alpha at sources dot redhat dot com
Date: Sun, 26 Jun 2005 18:05:16 -0700 (PDT)
Subject: Re: gcc 4.1 implements compiler builtins for atomic ops
References: <42BF3C75.3040607@redhat.com><20050626.172706.59467428.davem@davemloft.net><42BF4D48.6070407@redhat.com>

From: Ulrich Drepper <drepper@redhat.com>
Date: Sun, 26 Jun 2005 17:50:16 -0700

> But the real problem with your argumentation is that there is no reason
> why the locking code should have a higher probability of defects in the
> processor than all the other parts combined.

From working and talking with folks who have to deal with such
processor bugs, I come away with an opinion which differs from yours.

I've seen atomic operation bugs that resulted from any number of
problems.  For example, I know of one case where atomic operations
failed unless done within a single instruction cache line due to
a problem with a NUMA gateway implementation.  If an instruction
cache miss was generated during the atomic operation, you'd get
corruption in the memory the atomic operation was on.

No amount of microcode is going to fix bugs like that, yet a vDSO
page or library based implementation could handle that properly.
Especially since you don't want GCC aligning every inline atomic
operation to an I-cache line, calling out to a function or similar
is much more efficient in this case.
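
As a rough sketch of the "call out to a function" idea, the atomic
operation can live in one out-of-line helper, so any workaround is
applied in a single place instead of at every inlined call site.
Everything named here is an assumption for illustration: the helper
name lib_atomic_add is hypothetical, the 64-byte line size is a guess
that varies by processor, and aligning the function start only keeps
the operation within one line if the compiled body is short enough.

```c
/* Assumed I-cache line size; the real value is processor-specific. */
#define ICACHE_LINE 64

/*
 * Hypothetical out-of-line helper (the name is illustrative only).
 * noinline keeps the atomic sequence out of the callers, and aligned()
 * asks GCC to start the function on an ICACHE_LINE boundary.
 * Returns the value of *p after the addition.
 */
__attribute__((noinline, aligned(ICACHE_LINE)))
int lib_atomic_add(int *p, int v)
{
    /* GCC builtin: atomically adds v to *p, acting as a full barrier. */
    return __sync_add_and_fetch(p, v);
}
```

Callers would then write lib_atomic_add(&counter, 1) rather than
using the builtin inline; a vDSO variant would follow the same shape
but is not sketched here.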

At the urging of another posting here, I read the GCC documentation on
the builtins.  And sadly, the memory ordering semantics of the GCC
atomic builtins are suboptimal.  You don't need hard ordering if you
just want a counter to update atomically and you don't care in what
order other memory operations occur with respect to that atomic
operation.

Some processors eat a huge cost from the memory barriers, so avoiding
them for simple things such as an atomic counter used to collect
statistics or for reference counting is really needed.
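
To make the distinction concrete, here is a minimal sketch of a
statistics counter written both ways.  The first variant uses the
__sync builtin under discussion, which GCC documents as a full
barrier in most cases.  The second uses the C11 <stdatomic.h>
interface, which postdates this thread and is shown only as one way
to express "atomic, but no ordering required".  The counter and
function names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical statistics counters; the names are illustrative only. */
static unsigned long packets_full;   /* updated with the __sync builtin */
static atomic_ulong packets_relaxed; /* updated with C11 relaxed atomics */

/* Atomic increment that also orders surrounding memory operations. */
static void stats_inc_full_barrier(void)
{
    __sync_fetch_and_add(&packets_full, 1UL);
}

/*
 * Atomic increment with no ordering guarantee for other memory
 * operations: the update itself cannot be torn or lost, but the
 * compiler and processor may reorder unrelated loads and stores
 * around it.
 */
static void stats_inc_relaxed(void)
{
    atomic_fetch_add_explicit(&packets_relaxed, 1UL, memory_order_relaxed);
}

/* Relaxed read of the counter, good enough for reporting statistics. */
static unsigned long stats_read_relaxed(void)
{
    return atomic_load_explicit(&packets_relaxed, memory_order_relaxed);
}
```

Both increments may be called concurrently from any thread.  Note
that this covers the pure statistics case only; dropping a reference
count to zero usually still needs some ordering before the object is
freed.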

This also applies to atomic operations on bitmaps and stuff like that.

We actually have a document in the Linux kernel which tries to
document precisely all of these cases and issues.  It's called
linux/Documentation/atomic_ops.txt

Follow-Ups:
- Re: gcc 4.1 implements compiler builtins for atomic ops
  - From: Benjamin Herrenschmidt

References:
- Re: gcc 4.1 implements compiler builtins for atomic ops
  - From: Ulrich Drepper
- Re: gcc 4.1 implements compiler builtins for atomic ops
  - From: David S. Miller
- Re: gcc 4.1 implements compiler builtins for atomic ops
  - From: Ulrich Drepper
