This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix incorrect double-checked locking in __nss_database_lookup. [BZ #20483]

On Tue, 2016-09-13 at 11:17 +0200, Florian Weimer wrote:
> On 09/13/2016 10:43 AM, Torvald Riegel wrote:
> > On Mon, 2016-09-12 at 20:55 +0200, Florian Weimer wrote:
> >> On 08/18/2016 10:49 PM, Torvald Riegel wrote:
> >>>  DB_LOOKUP_FCT (service_user **ni, const char *fct_name, const char *fct2_name,
> >>>  	       void **fctp)
> >>>  {
> >>> -      && __nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
> >>> +  /* Relaxed MO is fine because this is just about whether we have to perform
> >>> +     the lookup; we will do another acquire-MO load next before assuming that
> >>> +     the lookup has happened.  */
> >>> +  if ((atomic_load_relaxed (&(DATABASE_NAME_SYMBOL)) == NULL)
> >>> +      && (__nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
> >>>      return -1;
> >>>
> >>> -  *ni = DATABASE_NAME_SYMBOL;
> >>> +  /* Acquire MO as required by __nss_database_lookup.  */
> >>> +  *ni = atomic_load_acquire (&(DATABASE_NAME_SYMBOL));
> >>>
> >>>    return __nss_lookup (ni, fct_name, fct2_name, fctp);
> >>>  }
> >>
> >> I'm not sure if this is the double-checked locking pattern we want to
> >> use for new code.  GCC currently cannot merge the two loads (and maybe
> >> it never will do so because it is beneficial to treat the separate loads
> >> as an optimization hint).
> >
> > I don't think the (re)load matters performance-wise, given that
> > DATABASE_NAME_SYMBOL is initialized once, and so the access will most
> > likely hit in the cache.  What may matter were if we'd often returned -1
> > and would have an unnecessary acquire MO in this path, though OTOH
> > __nss_database_lookup will acquire the lock anyway.
> __nss_database_lookup fails only if there is a low-level failure (like a 
> failed memory allocation).  As far as I can see, syntax errors in the 
> configuration file are ignored.  You always get a positive result out of 
> it, so it's only called once.
> > Loading it into a temporary first, then doing the check, and then
> > issuing an acquire-MO fence would be another way to avoid the extra
> > load.
> Couldn't we do an acquire load unconditionally?  The == NULL case is 
> really not worth optimizing for, it only happens during startup.

We could, but because __nss_database_lookup doesn't return the pointer,
we'd still have to do an atomic_load_relaxed (...) at the caller.  So,
not much of a gain it seems.

> >> I still think we would be better off if we centralize this particular
> >> code in a single function with an interface similar to
> >> __nss_database_lookup.  The manual inlining seems unnecessary.
> >
> > Doing all of that in __nss_database_lookup might be cleaner, I agree.
> > But my focus for this patch was just a bug fix, not refactoring; I'd
> > like to leave it to others to decide how to clean-up this part of glibc.
> > I also wasn't quite sure whether __nss_database_lookup is part of the
> > ABI or not.
> I looked at the current implementation and I already see a *ni != NULL 
> in __nss_database_lookup, so I think we can just move the double-checked 
> locking into __nss_database_lookup and simplify the callers.

Same name, but that's a different ni :)  The ni in __nss_database_lookup
is the global variable;  the one in the callers is the private data the
callers use.

> This would 
> not result in an observable behavior change, so it is okay although 
> __nss_database_lookup is part of the ABI.

You can make this cleaner if __nss_database_lookup would return the
value from the exactly-once-initialization, and this is what I initially
did when preparing this patch.  But then I noticed that it looked to be
part of the ABI, and had to revert it.  (This was before PTO, so I'm
just paging this back in right now... ;) )

> And the refactoring would mostly be about deleting a bunch of code 
> around calls of __nss_database_lookup.  The core changes in 
> __nss_database_lookup (to move the release-MO store to the end, after 
> all initialization) are needed either way.

If we can assume that the callers are always in sync with the
__nss_database_lookup, then I'd agree -- but then this wouldn't be
really an ABI issue.  If the __nss_database_lookup implementation can be
older than the callers' changes, we need to be more careful.  I think I
looked into this and concluded that we couldn't improve the code much if
having to consider compatibility, but I don't remember the details of
why I thought that, sorry.

My suggestion would be to first apply this patch (if it's OK otherwise),
and then try to refactor.  We could put a comment into the first patch
highlighting that refactoring might be beneficial.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]