Bug 26716 - debuginfod: add yum-repo-URL as possible file source
Status: NEW
Alias: None
Product: elfutils
Classification: Unclassified
Component: debuginfod
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2020-10-07 22:49 UTC by Frank Ch. Eigler
Modified: 2020-10-07 22:53 UTC



Description Frank Ch. Eigler 2020-10-07 22:49:21 UTC
<woodard@redhat.com> offered this suggestion:

Perhaps we could support another type of source file for debuginfod traversals:
yum repo databases on the web.  It would trade network traffic for local storage,
and probably be much slower.

debuginfod -R yum:https://download1.rpmfusion.org/free/fedora/releases/32/Everything/x86_64/debug

which would:
- (maybe support yum mirror metalink indirection)
- periodically download $url/repodata/repomd.xml
- parse it to locate and download the current active FOOBAR-primary.xml.gz
- for each package/location in the xml, check the package/time against the index
- if the package/location URL is unknown or fresher than in the database,
  - download the package
  - note its URL and process its archive contents in the database as usual
  - then throw away the archive (maybe subject to caching)
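
A minimal sketch of the metadata steps above, in Python rather than debuginfod's own C++, just to make the data flow concrete. It assumes the usual createrepo layout (a `primary` entry in repomd.xml, and a `location`/`time` pair for each package in primary.xml). The function names and the dict-based index are hypothetical stand-ins for the real sqlite index, and the actual downloads are left out.

```python
import gzip
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# XML namespaces used by createrepo-style metadata.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}
COMMON_NS = {"c": "http://linux.duke.edu/metadata/common"}


def primary_href(repomd_xml):
    """Return the repo-relative path of the current primary.xml.gz."""
    root = ET.fromstring(repomd_xml)
    loc = root.find("repo:data[@type='primary']/repo:location", REPO_NS)
    if loc is None:
        raise ValueError("repomd.xml has no primary metadata entry")
    return loc.attrib["href"]


def packages(primary_gz, base_url):
    """Yield (package URL, file mtime) for each package in primary.xml.gz."""
    root = ET.fromstring(gzip.decompress(primary_gz))
    base = base_url.rstrip("/") + "/"
    for pkg in root.findall("c:package", COMMON_NS):
        loc = pkg.find("c:location", COMMON_NS)
        tim = pkg.find("c:time", COMMON_NS)
        if loc is None or tim is None:
            continue  # skip entries that lack the fields we need
        yield urljoin(base, loc.attrib["href"]), int(tim.attrib["file"])


def needs_download(index, url, mtime):
    """True if the URL is unknown or fresher than what the index recorded.

    `index` is a hypothetical {url: mtime} mapping standing in for the
    debuginfod database.
    """
    known = index.get(url)
    return known is None or mtime > known
```

A scan pass would then fetch `$url/repodata/repomd.xml`, fetch the file named by `primary_href()`, and download only those packages for which `needs_download()` is true.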

for queries:
- look up by the buildid as usual
- if the source file comes back as an http* URL
  - download the package archive
  - extract the needed file from the archive & return it to the web client
  - then throw away the archive (maybe subject to caching)
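
The query path above might look roughly like the following Python sketch. It is only an illustration: `tarfile` stands in for libarchive (the Python standard library cannot read rpm payloads), `fetch` is a caller-supplied download function standing in for libcurl, and the optional `cache` dict is a hypothetical placeholder for whatever caching policy is chosen.

```python
import io
import tarfile
from typing import Callable, Dict, Optional


def extract_member(archive: bytes, member: str) -> bytes:
    """Pull one file out of an in-memory archive (tar used as a stand-in)."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:*") as tar:
        handle = tar.extractfile(member)  # raises KeyError if not present
        if handle is None:
            raise KeyError(member)  # present, but not a regular file
        return handle.read()


def serve_file(source: str, member: str,
               fetch: Callable[[str], bytes],
               cache: Optional[Dict[str, bytes]] = None) -> bytes:
    """Return `member` from the archive at URL `source`.

    Without a cache, the downloaded archive is dropped as soon as this
    function returns, matching the "throw away the archive" step.
    """
    if not source.startswith(("http://", "https://")):
        raise ValueError("source is not an http* URL")
    archive = cache.get(source) if cache is not None else None
    if archive is None:
        archive = fetch(source)
        if cache is not None:
            cache[source] = archive
    return extract_member(archive, member)
```

Passing `cache=None` gives the pure download-and-discard behaviour; passing a dict trades memory for fewer repeated downloads of the same package.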

for grooming:
- download all the repomd.xml's known to debuginfod
- from there, download the current complete URL package list for them all
- delete from the database any http urls that are not anywhere in that set
  (this would have the effect of forgetting prior version files)
alternately:
- send an http HEAD request for each package URL we know of
- any 404* type responses -> forget all related content from the database,
  as though source file were removed
  (this would cause many more outgoing requests, but keep older versions
  around as long as the upstream repo has them)
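
Both grooming strategies reduce to computing a set of stale URLs, as in this hedged Python sketch. The function names are hypothetical. Treating 404 and 410 as the "gone" responses is an assumption about what "404* type" should cover. `head_status` uses the standard `urllib` HEAD request as a stand-in for libcurl.

```python
import urllib.error
import urllib.request
from typing import Callable, FrozenSet, Iterable, Set


def stale_by_listing(known: Iterable[str], live: Iterable[str]) -> Set[str]:
    """First strategy: forget every known URL absent from the current
    package lists of all repos (older versions are dropped)."""
    return set(known) - set(live)


def head_status(url: str, timeout: float = 30.0) -> int:
    """Send an HTTP HEAD request and return the response status code."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions


def stale_by_head(known: Iterable[str],
                  status: Callable[[str], int] = head_status,
                  gone: FrozenSet[int] = frozenset({404, 410})) -> Set[str]:
    """Second strategy: one HEAD per known URL; forget those reported gone.

    The default `gone` set is an assumption, not something the proposal fixes.
    """
    return {url for url in known if status(url) in gone}
```

The first function costs one metadata download per repo. The second costs one request per indexed package, but keeps older versions for as long as upstream still serves them.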


As a preparatory step, debuginfod could learn to treat http* URLs
as a type of archive file.  Instead of opening directly via libarchive,
we'd libcurl-fetch the thing first, and feed the result to libarchive.
(Maybe, don't even assume it's an archive; just fetch it, then let the
existing debuginfod code dispatch, based on file name extension.)
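
To illustrate the "fetch first, then dispatch on extension" idea, here is a small Python sketch. The extension table and the handler labels are hypothetical and do not reflect debuginfod's actual configuration. The point is only that the dispatch key comes from the URL's path component, so query strings and fragments do not confuse it.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical extension-to-handler table, for illustration only.
HANDLERS = {
    ".rpm": "rpm-archive",
    ".deb": "deb-archive",
    ".ddeb": "deb-archive",
}


def classify(url: str) -> str:
    """Pick a handler label from the file name extension in the URL path.

    Anything without a known archive extension falls back to "plain-file",
    i.e. the fetched bytes would be scanned as an ordinary file.
    """
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1].lower()
    return HANDLERS.get(ext, "plain-file")
```

With this shape, the fetched content would be handed to the same code path that a local file with that name would take.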


see also: https://linux.die.net/man/8/createrepo