[PATCH] Fix incorrect double-checked locking in __nss_database_lookup. [BZ #20483]

Florian Weimer fweimer@redhat.com
Tue Sep 13 09:17:00 GMT 2016


On 09/13/2016 10:43 AM, Torvald Riegel wrote:
> On Mon, 2016-09-12 at 20:55 +0200, Florian Weimer wrote:
>> On 08/18/2016 10:49 PM, Torvald Riegel wrote:
>>>  DB_LOOKUP_FCT (service_user **ni, const char *fct_name, const char *fct2_name,
>>>  	       void **fctp)
>>>  {
>>> -  if (DATABASE_NAME_SYMBOL == NULL
>>> -      && __nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
>>> -				DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0)
>>> +  /* Relaxed MO is fine because this is just about whether we have to perform
>>> +     the lookup; we will do another acquire-MO load next before assuming that
>>> +     the lookup has happened.  */
>>> +  if ((atomic_load_relaxed (&(DATABASE_NAME_SYMBOL)) == NULL)
>>> +      && (__nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
>>> +				 DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0))
>>>      return -1;
>>>
>>> -  *ni = DATABASE_NAME_SYMBOL;
>>> +  /* Acquire MO as required by __nss_database_lookup.  */
>>> +  *ni = atomic_load_acquire (&(DATABASE_NAME_SYMBOL));
>>>
>>>    return __nss_lookup (ni, fct_name, fct2_name, fctp);
>>>  }
>>
>> I'm not sure if this is the double-checked locking pattern we want to
>> use for new code.  GCC currently cannot merge the two loads (and maybe
>> it never will do so because it is beneficial to treat the separate loads
>> as an optimization hint).
>
> I don't think the (re)load matters performance-wise, given that
> DATABASE_NAME_SYMBOL is initialized once, and so the access will most
> likely hit in the cache.  What may matter were if we'd often returned -1
> and would have an unnecessary acquire MO in this path, though OTOH
> __nss_database_lookup will acquire the lock anyway.

__nss_database_lookup fails only if there is a low-level failure (like a 
failed memory allocation).  As far as I can see, syntax errors in the 
configuration file are ignored.  You always get a positive result out of 
it, so it's only called once.

> Loading it into a temporary first, then doing the check, and then
> issuing an acquire-MO fence would be another way to avoid the extra
> load.

Couldn't we do an acquire load unconditionally?  The == NULL case is 
really not worth optimizing for, it only happens during startup.

>> I still think we would be better off if we centralize this particular
>> code in a single function with an interface similar to
>> __nss_database_lookup.  The manual inlining seems unnecessary.
>
> Doing all of that in __nss_database_lookup might be cleaner, I agree.
> But my focus for this patch was just a bug fix, not refactoring; I'd
> like to leave it to others to decide how to clean-up this part of glibc.
> I also wasn't quite sure whether __nss_database_lookup is part of the
> ABI or not.

I looked at the current implementation and I already see a *ni != NULL 
in __nss_database_lookup, so I think we can just move the double-checked 
locking into __nss_database_lookup and simplify the callers.  This would 
not result in an observable behavior change, so it is okay although 
__nss_database_lookup is part of the ABI.

And the refactoring would mostly be about deleting a bunch of code 
around calls of __nss_database_lookup.  The core changes in 
__nss_database_lookup (to move the release-MO store to the end, after 
all initialization) are needed either way.

Thanks,
Florian



More information about the Libc-alpha mailing list