In struct datahead, there are two fields that control validity of the cache entry, timeout and usable. These are not updated atomically at the same time, yet in prune_cache, we have this code: /* Check whether the entry timed out. */ if (dh->timeout < now) { … } else { assert (dh->usable); next_timeout = MIN (next_timeout, dh->timeout); } This assumes that if an entry has not timed out yet, it is always usable. The precise conditions under which entries become marked as not usable is still a bit mysterious to me. One such source of invalidation is the pass through the cache after start, to re-validate entries which have expired. If the date associated with a key has changed, the old entry is marked as not usable. I terminated nscd immediately after an entry was marked as unusable and set the clock backwards, so that the timeout on the entry had no longer lapsed (but not by an hour, so that nscd would still consider the cache file valid). This was sufficient to trigger the assertion failure during the next time nscd started. That's why I think we have a real bug here.
Patch posted: [PATCH] nscd: Skip unusable entries in first pass in prune_cache (bug 30800) <https://inbox.sourceware.org/libc-alpha/87o7iry6k6.fsf@oldenburg.str.redhat.com/T/#u>
Fixed for 2.39 via: commit c00b984fcd53f679ca2dafcd1aee2c89836e6e73 Author: Florian Weimer <fweimer@redhat.com> Date: Tue Aug 29 08:28:31 2023 +0200 nscd: Skip unusable entries in first pass in prune_cache (bug 30800) Previously, if an entry was marked unusable for any reason, but had not timed out yet, the assert would trigger. One way to get into such state is if a data change is detected during re-validation of an entry. This causes the entry to be marked as not usable. If exits nscd soon after that, then the clock jumps backwards, and nscd restarted, the cache re-validation run after startup triggers the removed assert. The change is more complicated than just the removal of the assert because entries marked as not usable should be garbage-collected in the second pass. To make this happen, it is necessary to update some book-keeping data. Reviewed-by: DJ Delorie <dj@redhat.com>