RFCv2: debuginfod debian archive support

Frank Ch. Eigler fche@redhat.com
Mon Dec 2 22:54:00 GMT 2019


Hi -

Presenting v2 of the debuginfod .deb/.ddeb support patch.

On second thought, generalized the code & terminology.  This may be
ready for merging, except that it'd be awesome if a
debian/ubuntu-literate person could create some test .deb/.ddeb files
matching the tests/debuginfod-rpms.  (cc:'d some maintainers there in
hope they might have the time.)  Hand-testing here looks okay.

If anyone knows of a distro or ISV who is using standard .zip or .tar
or somesuch archive formats as their distribution mechanism, it would
be a tiny effort more to add another option, say "-A" (any archive),
because libarchive can recognize and unpack a large variety of
formats.  WDYT?

This patch is also on the elfutils.git fche/debuginfod-deb branch.


Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Sat Nov 30 10:46:44 2019 -0500

    debuginfod server: support .deb/.ddeb archives
    
    Add support for scanning .deb / .ddeb files, enabled with a new
    command line option "-U".
    
    Signed-off-by: Frank Ch. Eigler <fche@redhat.com>

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 8aa2944..1731dca 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,10 @@
+2019-12-02  Frank Ch. Eigler  <fche@redhat.com>
+
+	* debuginfod.cxx (*_rpm_*): Rename to *_archive_* throughout.
+	(scan_archives): New read-mostly global to identify archive
+	file extensions and corresponding extractor commands.
+	(parse_opt): Handle new -U flag.
+
 2019-11-26  Mark Wielaard  <mark@klomp.org>
 
 	* Makefile.am (BUILD_STATIC): Add needed libraries for libdw and
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index aa7ffcf..e96a4dd 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -106,6 +106,15 @@ using namespace std;
 #endif
 
 
+inline bool
+string_endswith(const string& haystack, const string& needle)
+{
+  return (haystack.size() >= needle.size() &&
+	  equal(haystack.end()-needle.size(), haystack.end(),
+                needle.begin()));
+}
+
+
 // Roll this identifier for every sqlite schema incompatiblity.
 #define BUILDIDS "buildids9"
 
@@ -231,9 +240,9 @@ static const char DEBUGINFOD_SQLITE_DDL[] =
   "create view if not exists " BUILDIDS "_stats as\n"
   "          select 'file d/e' as label,count(*) as quantity from " BUILDIDS "_f_de\n"
   "union all select 'file s',count(*) from " BUILDIDS "_f_s\n"
-  "union all select 'rpm d/e',count(*) from " BUILDIDS "_r_de\n"
-  "union all select 'rpm sref',count(*) from " BUILDIDS "_r_sref\n"
-  "union all select 'rpm sdef',count(*) from " BUILDIDS "_r_sdef\n"
+  "union all select 'archive d/e',count(*) from " BUILDIDS "_r_de\n"
+  "union all select 'archive sref',count(*) from " BUILDIDS "_r_sref\n"
+  "union all select 'archive sdef',count(*) from " BUILDIDS "_r_sdef\n"
   "union all select 'buildids',count(*) from " BUILDIDS "_buildids\n"
   "union all select 'filenames',count(*) from " BUILDIDS "_files\n"
   "union all select 'files scanned (#)',count(*) from " BUILDIDS "_file_mtime_scanned\n"
@@ -324,6 +333,7 @@ static const struct argp_option options[] =
    { NULL, 0, NULL, 0, "Scanners:", 1 },
    { "scan-file-dir", 'F', NULL, 0, "Enable ELF/DWARF file scanning threads.", 0 },
    { "scan-rpm-dir", 'R', NULL, 0, "Enable RPM scanning threads.", 0 },
+   { "scan-deb-dir", 'U', NULL, 0, "Enable DEB scanning threads.", 0 },   
    // "source-oci-imageregistry"  ... 
 
    { NULL, 0, NULL, 0, "Options:", 2 },
@@ -371,7 +381,7 @@ static unsigned maxigroom = false;
 static unsigned concurrency = std::thread::hardware_concurrency() ?: 1;
 static set<string> source_paths;
 static bool scan_files = false;
-static bool scan_rpms = false;
+static map<string,string> scan_archives;
 static vector<string> extra_ddl;
 static regex_t file_include_regex;
 static regex_t file_exclude_regex;
@@ -402,7 +412,13 @@ parse_opt (int key, char *arg,
       if (http_port > 65535) argp_failure(state, 1, EINVAL, "port number");
       break;
     case 'F': scan_files = true; break;
-    case 'R': scan_rpms = true; break;
+    case 'R':
+      scan_archives[".rpm"]="rpm2cpio";
+      break;
+    case 'U':
+      scan_archives[".deb"]="dpkg-deb --fsys-tarfile";
+      scan_archives[".ddeb"]="dpkg-deb --fsys-tarfile";
+      break;
     case 'L':
       traverse_logical = true;
       break;
@@ -851,7 +867,11 @@ handle_buildid_r_match (int64_t b_mtime,
       return 0;
     }
 
-  string popen_cmd = string("rpm2cpio " + shell_escape(b_source0));
+  string archive_decoder = "/dev/null";
+  for (auto&& arch : scan_archives)
+    if (string_endswith(b_source0, arch.first))
+      archive_decoder = arch.second;
+  string popen_cmd = archive_decoder + " " + shell_escape(b_source0);
   FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
   if (fp == NULL)
     throw libc_exception (errno, string("popen ") + popen_cmd);
@@ -863,9 +883,9 @@ handle_buildid_r_match (int64_t b_mtime,
     throw archive_exception("cannot create archive reader");
   defer_dtor<struct archive*,int> archive_closer (a, archive_read_free);
 
-  rc = archive_read_support_format_cpio(a);
+  rc = archive_read_support_format_all(a);
   if (rc != ARCHIVE_OK)
-    throw archive_exception(a, "cannot select cpio format");
+    throw archive_exception(a, "cannot select all format");
   rc = archive_read_support_filter_all(a);
   if (rc != ARCHIVE_OK)
     throw archive_exception(a, "cannot select all filters");
@@ -902,7 +922,7 @@ handle_buildid_r_match (int64_t b_mtime,
           throw archive_exception(a, "cannot extract file");
         }
 
-      inc_metric ("http_responses_total","result","rpm");
+      inc_metric ("http_responses_total","result","archive");
       struct MHD_Response* r = MHD_create_response_from_fd (archive_entry_size(e), fd);
       if (r == 0)
         {
@@ -916,7 +936,7 @@ handle_buildid_r_match (int64_t b_mtime,
           MHD_add_response_header (r, "Content-Type", "application/octet-stream");
           add_mhd_last_modified (r, archive_entry_mtime(e));
           if (verbose > 1)
-            obatched(clog) << "serving rpm " << b_source0 << " file " << b_source1 << endl;
+            obatched(clog) << "serving archive " << b_source0 << " file " << b_source1 << endl;
           /* libmicrohttpd will close it. */
           if (result_fd)
             *result_fd = fd;
@@ -1858,16 +1878,20 @@ thread_main_scan_source_file_path (void* arg)
 
 
 
-// Analyze given *.rpm file of given age; record buildids / exec/debuginfo-ness of its
+// Analyze given archive file of given age; record buildids / exec/debuginfo-ness of its
 // constituent files with given upsert statements.
 static void
-rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
+archive_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
               sqlite_ps& ps_upsert_de, sqlite_ps& ps_upsert_sref, sqlite_ps& ps_upsert_sdef,
               time_t mtime,
               unsigned& fts_executable, unsigned& fts_debuginfo, unsigned& fts_sref, unsigned& fts_sdef,
               bool& fts_sref_complete_p)
 {
-  string popen_cmd = string("rpm2cpio " + shell_escape(rps));
+  string archive_decoder = "/dev/null";
+  for (auto&& arch : scan_archives)
+    if (string_endswith(rps, arch.first))
+      archive_decoder = arch.second;
+  string popen_cmd = archive_decoder + " " + shell_escape(rps);
   FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
   if (fp == NULL)
     throw libc_exception (errno, string("popen ") + popen_cmd);
@@ -1879,9 +1903,9 @@ rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_up
     throw archive_exception("cannot create archive reader");
   defer_dtor<struct archive*,int> archive_closer (a, archive_read_free);
 
-  int rc = archive_read_support_format_cpio(a);
+  int rc = archive_read_support_format_all(a);
   if (rc != ARCHIVE_OK)
-    throw archive_exception(a, "cannot select cpio format");
+    throw archive_exception(a, "cannot select all formats");
   rc = archive_read_support_filter_all(a);
   if (rc != ARCHIVE_OK)
     throw archive_exception(a, "cannot select all filters");
@@ -2027,11 +2051,11 @@ rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_up
 
 
 
-// scan for *.rpm files
+// scan for archive files such as .rpm
 static void
-scan_source_rpm_path (const string& dir)
+scan_source_archive_path (const string& dir)
 {
-  obatched(clog) << "fts/rpm traversing " << dir << endl;
+  obatched(clog) << "fts/archive traversing " << dir << endl;
 
   sqlite_ps ps_upsert_buildids (db, "rpm-buildid-intern", "insert or ignore into " BUILDIDS "_buildids VALUES (NULL, ?);");
   sqlite_ps ps_upsert_files (db, "rpm-file-intern", "insert or ignore into " BUILDIDS "_files VALUES (NULL, ?);");
@@ -2060,7 +2084,7 @@ scan_source_rpm_path (const string& dir)
   struct timeval tv_start, tv_end;
   gettimeofday (&tv_start, NULL);
   unsigned fts_scanned=0, fts_regex=0, fts_cached=0, fts_debuginfo=0;
-  unsigned fts_executable=0, fts_rpm = 0, fts_sref=0, fts_sdef=0;
+  unsigned fts_executable=0, fts_archive = 0, fts_sref=0, fts_sdef=0;
 
   FTS *fts = fts_open (dirs,
                        (traverse_logical ? FTS_LOGICAL : FTS_PHYSICAL|FTS_XDEV)
@@ -2082,7 +2106,7 @@ scan_source_rpm_path (const string& dir)
         break;
 
       if (verbose > 2)
-        obatched(clog) << "fts/rpm traversing " << f->fts_path << endl;
+        obatched(clog) << "fts/archive traversing " << f->fts_path << endl;
 
       try
         {
@@ -2101,7 +2125,7 @@ scan_source_rpm_path (const string& dir)
           if (!ri || rx)
             {
               if (verbose > 3)
-                obatched(clog) << "fts/rpm skipped by regex " << (!ri ? "I" : "") << (rx ? "X" : "") << endl;
+                obatched(clog) << "fts/archive skipped by regex " << (!ri ? "I" : "") << (rx ? "X" : "") << endl;
               fts_regex ++;
               continue;
             }
@@ -2116,13 +2140,13 @@ scan_source_rpm_path (const string& dir)
 
             case FTS_F:
               {
-                // heuristic: reject if file name does not end with ".rpm"
-                // (alternative: try opening with librpm etc., caching)
-                string suffix = ".rpm";
-                if (rps.size() < suffix.size() ||
-                    rps.substr(rps.size()-suffix.size()) != suffix)
+		bool any = false;
+                for (auto&& arch : scan_archives)
+                  if (string_endswith(rps, arch.first))
+		    any = true;
+		if (! any)
                   continue;
-                fts_rpm ++;
+                fts_archive ++;
 
                 /* See if we know of it already. */
                 int rc = ps_query
@@ -2133,7 +2157,7 @@ scan_source_rpm_path (const string& dir)
                 ps_query.reset();
                 if (rc == SQLITE_ROW) // i.e., a result, as opposed to DONE (no results)
                   // no need to recheck a file/version we already know
-                  // specifically, no need to parse this rpm again, since we already have
+                  // specifically, no need to parse this archive again, since we already have
                   // it as a D or E or S record,
                   // (so is stored with buildid=NULL)
                   {
@@ -2141,29 +2165,29 @@ scan_source_rpm_path (const string& dir)
                     continue;
                   }
 
-                // intern the rpm file name
+                // intern the archive file name
                 ps_upsert_files
                   .reset()
                   .bind(1, rps)
                   .step_ok_done();
 
-                // extract the rpm contents via popen("rpm2cpio") | libarchive | loop-of-elf_classify()
+                // extract the archive contents
                 unsigned my_fts_executable = 0, my_fts_debuginfo = 0, my_fts_sref = 0, my_fts_sdef = 0;
                 bool my_fts_sref_complete_p = true;
                 try
                   {
-                    rpm_classify (rps,
+                    archive_classify (rps,
                                   ps_upsert_buildids, ps_upsert_files,
                                   ps_upsert_de, ps_upsert_sref, ps_upsert_sdef, // dalt
                                   f->fts_statp->st_mtime,
                                   my_fts_executable, my_fts_debuginfo, my_fts_sref, my_fts_sdef,
                                   my_fts_sref_complete_p);
-                    inc_metric ("scanned_total","source","rpm");
-                    add_metric("found_debuginfo_total","source","rpm",
+                    inc_metric ("scanned_total","source","archive");
+                    add_metric("found_debuginfo_total","source","archive",
                                my_fts_debuginfo);
-                    add_metric("found_executable_total","source","rpm",
+                    add_metric("found_executable_total","source","archive",
                                my_fts_executable);
-                    add_metric("found_sourcerefs_total","source","rpm",
+                    add_metric("found_sourcerefs_total","source","archive",
                                my_fts_sref);
                   }
                 catch (const reportable_exception& e)
@@ -2172,7 +2196,7 @@ scan_source_rpm_path (const string& dir)
                   }
 
                 if (verbose > 2)
-                  obatched(clog) << "scanned rpm=" << rps
+                  obatched(clog) << "scanned archive=" << rps
                                  << " mtime=" << f->fts_statp->st_mtime
                                  << " executables=" << my_fts_executable
                                  << " debuginfos=" << my_fts_debuginfo
@@ -2197,7 +2221,7 @@ scan_source_rpm_path (const string& dir)
 
             case FTS_ERR:
             case FTS_NS:
-              throw libc_exception(f->fts_errno, string("fts/rpm traversal ") + string(f->fts_path));
+              throw libc_exception(f->fts_errno, string("fts/archive traversal ") + string(f->fts_path));
 
             default:
             case FTS_SL: /* ignore symlinks; seen in non-L mode only */
@@ -2206,9 +2230,9 @@ scan_source_rpm_path (const string& dir)
 
           if ((verbose && f->fts_info == FTS_DP) ||
               (verbose > 1 && f->fts_info == FTS_F))
-            obatched(clog) << "fts/rpm traversing " << rps << ", scanned=" << fts_scanned
+            obatched(clog) << "fts/archive traversing " << rps << ", scanned=" << fts_scanned
                            << ", regex-skipped=" << fts_regex
-                           << ", rpm=" << fts_rpm << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
+                           << ", archive=" << fts_archive << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
                            << ", executable=" << fts_executable
                            << ", sourcerefs=" << fts_sref << ", sourcedefs=" << fts_sdef << endl;
         }
@@ -2222,9 +2246,9 @@ scan_source_rpm_path (const string& dir)
   gettimeofday (&tv_end, NULL);
   double deltas = (tv_end.tv_sec - tv_start.tv_sec) + (tv_end.tv_usec - tv_start.tv_usec)*0.000001;
 
-  obatched(clog) << "fts/rpm traversed " << dir << " in " << deltas << "s, scanned=" << fts_scanned
+  obatched(clog) << "fts/archive traversed " << dir << " in " << deltas << "s, scanned=" << fts_scanned
                  << ", regex-skipped=" << fts_regex
-                 << ", rpm=" << fts_rpm << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
+                 << ", archive=" << fts_archive << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
                  << ", executable=" << fts_executable
                  << ", sourcerefs=" << fts_sref << ", sourcedefs=" << fts_sdef << endl;
 }
@@ -2232,18 +2256,18 @@ scan_source_rpm_path (const string& dir)
 
 
 static void*
-thread_main_scan_source_rpm_path (void* arg)
+thread_main_scan_source_archive_path (void* arg)
 {
   string dir = string((const char*) arg);
 
   unsigned rescan_timer = 0;
   sig_atomic_t forced_rescan_count = 0;
-  set_metric("thread_timer_max", "rpm", dir, rescan_s);
-  set_metric("thread_tid", "rpm", dir, tid());
+  set_metric("thread_timer_max", "archive", dir, rescan_s);
+  set_metric("thread_tid", "archive", dir, tid());
   while (! interrupted)
     {
-      set_metric("thread_timer", "rpm", dir, rescan_timer);
-      set_metric("thread_forced_total", "rpm", dir, forced_rescan_count);
+      set_metric("thread_timer", "archive", dir, rescan_timer);
+      set_metric("thread_forced_total", "archive", dir, forced_rescan_count);
       if (rescan_s && rescan_timer > rescan_s)
         rescan_timer = 0;
       if (sigusr1 != forced_rescan_count)
@@ -2254,10 +2278,10 @@ thread_main_scan_source_rpm_path (void* arg)
       if (rescan_timer == 0)
         try
           {
-            set_metric("thread_working", "rpm", dir, time(NULL));
-            inc_metric("thread_work_total", "rpm", dir);
-            scan_source_rpm_path (dir);
-            set_metric("thread_working", "rpm", dir, 0);
+            set_metric("thread_working", "archive", dir, time(NULL));
+            inc_metric("thread_work_total", "archive", dir);
+            scan_source_archive_path (dir);
+            set_metric("thread_working", "archive", dir, 0);
           }
         catch (const sqlite_exception& e)
           {
@@ -2490,8 +2514,8 @@ main (int argc, char *argv[])
       error (EXIT_FAILURE, 0,
              "unexpected argument: %s", argv[remaining]);
 
-  if (!scan_rpms && !scan_files && source_paths.size()>0)
-    obatched(clog) << "warning: without -F and/or -R, ignoring PATHs" << endl;
+  if (scan_archives.size()==0 && !scan_files && source_paths.size()>0)
+    obatched(clog) << "warning: without -F -R -U, ignoring PATHs" << endl;
 
   (void) signal (SIGPIPE, SIG_IGN); // microhttpd can generate it incidentally, ignore
   (void) signal (SIGINT, signal_handler); // ^C
@@ -2611,16 +2635,22 @@ main (int argc, char *argv[])
   if (maxigroom)
     obatched(clog) << "maxigroomed database" << endl;
 
-
   obatched(clog) << "search concurrency " << concurrency << endl;
   obatched(clog) << "rescan time " << rescan_s << endl;
   obatched(clog) << "groom time " << groom_s << endl;
+  if (scan_archives.size()>0)
+    {
+      obatched ob(clog);
+      auto& o = ob << "scanning archive types ";
+      for (auto&& arch : scan_archives)
+	o << arch.first << " ";
+      o << endl;
+    }
   const char* du = getenv(DEBUGINFOD_URLS_ENV_VAR);
   if (du && du[0] != '\0') // set to non-empty string?
     obatched(clog) << "upstream debuginfod servers: " << du << endl;
 
-  vector<pthread_t> source_file_scanner_threads;
-  vector<pthread_t> source_rpm_scanner_threads;
+  vector<pthread_t> scanner_threads;
   pthread_t groom_thread;
 
   rc = pthread_create (& groom_thread, NULL, thread_main_groom, NULL);
@@ -2632,20 +2662,21 @@ main (int argc, char *argv[])
       pthread_t pt;
       rc = pthread_create (& pt, NULL, thread_main_scan_source_file_path, (void*) it.c_str());
       if (rc < 0)
-        error (0, 0, "warning: cannot spawn thread (%d) to scan source files %s\n", rc, it.c_str());
+        error (0, 0, "warning: cannot spawn thread (%d) to scan files %s\n", rc, it.c_str());
       else
-        source_file_scanner_threads.push_back(pt);
+        scanner_threads.push_back(pt);
     }
 
-  if (scan_rpms) for (auto&& it : source_paths)
-    {
-      pthread_t pt;
-      rc = pthread_create (& pt, NULL, thread_main_scan_source_rpm_path, (void*) it.c_str());
-      if (rc < 0)
-        error (0, 0, "warning: cannot spawn thread (%d) to scan source rpms %s\n", rc, it.c_str());
-      else
-        source_rpm_scanner_threads.push_back(pt);
-    }
+  if (scan_archives.size() > 0)
+    for (auto&& it : source_paths)
+      {
+        pthread_t pt;
+        rc = pthread_create (& pt, NULL, thread_main_scan_source_archive_path, (void*) it.c_str());
+        if (rc < 0)
+          error (0, 0, "warning: cannot spawn thread (%d) to scan archives %s\n", rc, it.c_str());
+        else
+          scanner_threads.push_back(pt);
+      }
 
   /* Trivial main loop! */
   set_metric("ready", 1);
@@ -2656,10 +2687,8 @@ main (int argc, char *argv[])
   if (verbose)
     obatched(clog) << "stopping" << endl;
 
-  /* Join any source scanning threads. */
-  for (auto&& it : source_file_scanner_threads)
-    pthread_join (it, NULL);
-  for (auto&& it : source_rpm_scanner_threads)
+  /* Join all our threads. */
+  for (auto&& it : scanner_threads)
     pthread_join (it, NULL);
   pthread_join (groom_thread, NULL);
   
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 00a61ac..398048f 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,12 @@
+2019-12-02  Frank Ch. Eigler  <fche@redhat.com
+
+	* debuginfod.8: Add -U (DEB) flag, generalize RPM to "archive".
+
+2019-11-26  Frank Ch. Eigler  <fche@redhat.com>
+	    Aaron Merey  <amerey@redhat.com>
+
+	* debuginfod.8, find-debuginfo.1, debuginfod_*.3: New files.
+
 2019-09-02  Mark Wielaard  <mark@klomp.org>
 
 	* readelf.1 (symbols): Add optional section name.
diff --git a/doc/debuginfod.8 b/doc/debuginfod.8
index 210550e..6e078f6 100644
--- a/doc/debuginfod.8
+++ b/doc/debuginfod.8
@@ -24,7 +24,7 @@ debuginfod \- debuginfo-related http file-server daemon
 .SH DESCRIPTION
 \fBdebuginfod\fP serves debuginfo-related artifacts over HTTP.  It
 periodically scans a set of directories for ELF/DWARF files and their
-associated source code, as well as RPM files containing the above, to
+associated source code, as well as archive files containing the above, to
 build an index by their buildid.  This index is used when remote
 clients use the HTTP webapi, to fetch these files by the same buildid.
 
@@ -55,17 +55,22 @@ or even use debuginfod itself:
 ^C
 .ESAMPLE
 
-If the \fB\-R\fP option is given each listed PATH creates a thread to
-scan for ELF/DWARF/source files contained in matching RPMs under the
-given physical directory.  Duplicate directories are ignored.  You may
-use a file name for a PATH, but source code indexing may be
-incomplete; prefer using a directory that contains normal RPMs
-alongside debuginfo/debugsource RPMs.  Because of complications such
-as DWZ-compressed debuginfo, may require \fItwo\fP scan passes to
-identify all source code.  Source files for RPMs are only served
-from other RPMs, so the caution for \-F does not apply.
-
-If no PATH is listed, or neither \-F nor \-R option is given, then
+If the \fB\-R\fP and/or \fB-U\fP option is given, each listed PATH
+creates a thread to scan for ELF/DWARF/source files contained in
+archive files.  If \-R is given, the will scan RPMs; and/or if \-U is
+given, they will scan DEB / DDEB files.  (The terms RPM and DEB and
+DDEB are used synonymously as "archives" in diagnostic messages.)
+Duplicate directories are ignored.  You may use a file name for a
+PATH, but source code indexing may be incomplete.  Instead, use a
+directory that contains normal RPMs alongside debuginfo/debugsource
+RPMs.  Because of complications such as DWZ-compressed debuginfo, may
+require \fItwo\fP scan passes to identify all source code.  Source
+files for RPMs are only served from other RPMs, so the caution for \-F
+does not apply.  Note that due to Debian/Ubuntu packaging policies &
+mechanisms, debuginfod cannot resolve source files for DEB/DDEB at
+all.
+
+If no PATH is listed, or neither \-F nor \-R nor \-U option is given, then
 \fBdebuginfod\fP will simply serve content that it scanned into its
 index in previous runs: the data is cumulative.
 
@@ -81,7 +86,11 @@ Activate ELF/DWARF file scanning threads.  The default is off.
 
 .TP
 .B "\-R"
-Activate RPM file scanning threads.  The default is off.
+Activate RPM patterns in archive scanning threads.  The default is off.
+
+.TP
+.B "\-U"
+Activate DEB/DDEB patterns in archive scanning threads.  The default is off.
 
 .TP
 .B "\-d FILE" "\-\-database=FILE"
@@ -114,12 +123,12 @@ extended REs, thus may include alternation.  They are evaluated
 against the full path of each file, based on its \fBrealpath(3)\fP
 canonicalization.  By default, all files are included and none are
 excluded.  A file that matches both include and exclude REGEX is
-excluded.  (The \fIcontents\fP of RPM files are not subject to
+excluded.  (The \fIcontents\fP of archive files are not subject to
 inclusion or exclusion filtering: they are all processed.)
 
 .TP
 .B "\-t SECONDS"  "\-\-rescan\-time=SECONDS"
-Set the rescan time for the file and RPM directories.  This is the
+Set the rescan time for the file and archive directories.  This is the
 amount of time the scanning threads will wait after finishing a scan,
 before doing it again.  A rescan for unchanged files is fast (because
 the index also stores the file mtimes).  A time of zero is acceptable,
@@ -143,8 +152,8 @@ independent of the groom time (including if it was zero).
 .B "\-G"
 Run an extraordinary maximal-grooming pass at debuginfod startup.
 This pass can take considerable time, because it tries to remove any
-debuginfo-unrelated content from the RPM-related parts of the index.
-It should not be run if any recent RPM-related indexing operations
+debuginfo-unrelated content from the archive-related parts of the index.
+It should not be run if any recent archive-related indexing operations
 were aborted early.  It can take considerable space, because it
 finishes up with an sqlite "vacuum" operation, which repacks the
 database file by triplicating it temporarily.  The default is not to
@@ -155,7 +164,7 @@ do maximal-grooming.  See also the \fIDATA MANAGEMENT\fP section.
 Set the concurrency limit for all the scanning threads.  While many
 threads may be spawned to cover all the given PATHs, only NUM may
 concurrently do CPU-intensive operations like parsing an ELF file
-or an RPM.  The default is the number of processors on the system;
+or an archive.  The default is the number of processors on the system;
 the minimum is 1.
 
 .TP
@@ -257,10 +266,10 @@ many files.  This section offers some advice about the implications.
 
 As a general explanation for size, consider that debuginfod indexes
 ELF/DWARF files, it stores their names and referenced source file
-names, and buildids will be stored.  When indexing RPMs, it stores
-every file name \fIof or in\fP an RPM, every buildid, plus every
-source file name referenced from a DWARF file.  (Indexing RPMs takes
-more space because the source files often reside in separate
+names, and buildids will be stored.  When indexing archives, it stores
+every file name \fIof or in\fP an archive, every buildid, plus every
+source file name referenced from a DWARF file.  (Indexing archives
+takes more space because the source files often reside in separate
 subpackages that may not be indexed at the same pass, so extra
 metadata has to be kept.)
 
@@ -283,14 +292,14 @@ This means that the sqlite files grow fast during initial indexing,
 slowly during index rescans, and periodically shrink during grooming.
 There is also an optional one-shot \fImaximal grooming\fP pass is
 available.  It removes information debuginfo-unrelated data from the
-RPM content index such as file names found in RPMs ("rpm sdef"
-records) that are not referred to as source files from any binaries
-find in RPMs ("rpm sref" records).  This can save considerable disk
-space.  However, it is slow and temporarily requires up to twice the
-database size as free space.  Worse: it may result in missing
-source-code info if the RPM traversals were interrupted, so the not
-all source file references were known.  Use it rarely to polish a
-complete index.
+archive content index such as file names found in archives ("archive
+sdef" records) that are not referred to as source files from any
+binaries find in archives ("archive sref" records).  This can save
+considerable disk space.  However, it is slow and temporarily requires
+up to twice the database size as free space.  Worse: it may result in
+missing source-code info if the archive traversals were interrupted,
+so the not all source file references were known.  Use it rarely to
+polish a complete index.
 
 You should ensure that ample disk space remains available.  (The flood
 of error messages on -ENOSPC is ugly and nagging.  But, like for most
@@ -317,7 +326,7 @@ happens, new versions of debuginfod will issue SQL statements to
 \fIdrop\fP all prior schema & data, and start over.  So, disk space
 will not be wasted for retaining a no-longer-useable dataset.
 
-In summary, if your system can bear a 0.5%-3% index-to-RPM-dataset
+In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
 size ratio, and slow growth afterwards, you should not need to
 worry about disk space.  If a system crash corrupts the database,
 or you want to force debuginfod to reset and start over, simply
diff --git a/tests/ChangeLog b/tests/ChangeLog
index 6e3923f..8fcb161 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,7 @@
+2019-12-02  Frank Ch. Eigler  <fche@redhat.com>
+
+	* run-debuginfod-find.sh: Adjust to "rpm"->"archive" in metrics.
+
 2019-11-26  Mark Wielaard  <mark@klomp.org>
 
 	* Makefile.am (BUILD_STATIC): Add libraries needed for libdw.
diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
index 0ade03b..c2926a0 100755
--- a/tests/run-debuginfod-find.sh
+++ b/tests/run-debuginfod-find.sh
@@ -162,7 +162,7 @@ cp -rp ${abs_srcdir}/debuginfod-rpms R
 kill -USR1 $PID1
 # All rpms need to be in the index
 rpms=$(find R -name \*rpm | wc -l)
-wait_ready $PORT1 'scanned_total{source="rpm"}' $rpms
+wait_ready $PORT1 'scanned_total{source="archive"}' $rpms
 
 kill -USR1 $PID1  # two hits of SIGUSR1 may be needed to resolve .debug->dwz->srefs
 # Expect all source files found in the rpms (they are all called hello.c :)
@@ -187,7 +187,7 @@ sourcefiles=$(find -name \*\\.debug \
 cd ..
 rm -rf extracted
 
-wait_ready $PORT1 'found_sourcerefs_total{source="rpm"}' $sourcefiles
+wait_ready $PORT1 'found_sourcerefs_total{source="archive"}' $sourcefiles
 
 # Run a bank of queries against the debuginfod-rpms test cases
 



More information about the Elfutils-devel mailing list