This is the mail archive of the
elfutils-devel@sourceware.org
mailing list for the elfutils project.
Re: PR25394 patch: debuginfod groom vs. scan race condition
Hi -
> Nice work. And I think I almost understand it. I am just a little
> confused about thread_main_fts_source_paths also being an "idler":
Yup, it's a bit subtle.
> > while (! interrupted)
> > {
> > - set_metric("thread_timer", "role","traverse", rescan_timer);
> > - // set_metric("thread_forced_total", "role","traverse", forced_rescan_count);
> > - if (rescan_s && rescan_timer > rescan_s)
> > - rescan_timer = 0;
> > + sleep (1);
> > + scanq.wait_idle(); // don't start a new traversal while scanners haven't finished the job
> > + scanq.done_idle(); // release the hounds
> > + if (interrupted) break;
> I guess the comment "release the hounds" gets me confused. I don't
> know precisely what that means.
https://www.youtube.com/watch?time_continue=28&v=ORvkElqw0QM
> Is this needed for correctness, or is it simply to have an hint that
> the scanners have ran because you register as an idler and so at
> this point you could only run if no scanner is running?
It's because we don't want to start a new fts traversal pass while the
scanners haven't even finished the last one. I've seen a case where
the rescan interval was so short that the scanner backlog just kept
growing and growing and growing. (Changing the work queue to a set
from a dequeue also helps with duplicate elimination.) So in this
way, it just returns to the timing behaviour of the pre-workqueue
implementation: we wait a certain amount of time AFTER the scanning is
ALL DONE, before starting a new scan.
> But given you immediately tell them you are done a scanner could in
> theory start their work up again while you are doing your own
> traversal?
Yes, because that's ideal for concurrency: while one thread fts(3)'s
and enumerates the results, the scanner threads can process the
already enumerated files.
- FChE