patch 2/2 debuginfod server etc.

Mark Wielaard
Thu Nov 21 20:42:00 GMT 2019

Hi Frank,

On Thu, 2019-11-21 at 12:18 -0500, Frank Ch. Eigler wrote:
> > > If the perceived problem is that build tree scans (-F) may contain
> > > binaries that refer to source files that are not appropriate for
> > > later sharing, then IMO this is too much change, and unnecessarily
> > > complicates other valid usage.
> > 
> > Yes, that (and references to any other source files, whether those
> > scanned by -F or -R or simply because they are reachable on the file
> > system) is the problem that is being solved.
> Files "simply reachable on the file system" are not indexed or sought
> by debuginfod, unless (a) they are scanned due to being listed with an
> explicit PATH, OR (b) being referred to from within an -F DWARF file
> as a source.  That is all.  There is nothing else "reachable".

Yes, that is what I meant by reachable.

I think what makes the discussion somewhat difficult is that there are
basically three cases:

- Serving trees of rpms where only the contents of the rpms are shared.
- Serving of a build directory where it makes sense to share not just
  what is in the build directory but basically everything that might
  have been needed to create the artifacts in the build tree.
- Serving of specifically installed files, which could be
  exploded rpms of installed packages (e.g. the contents of
  /usr/lib/debug and /usr/src/debug) or a specially prepared
  installation of debugging artifacts to share with some developer
  group. Here it makes sense to treat it like the first case and only
  share what you specify.

I think we kind of agree on the first and second case. The first is
simple, you want to share the contents of the rpms and that is what you
do. For the second since it is a local development build it kind of
makes sense to be somewhat permissive and share everything in your
development environment, which is basically anything that can be
reached through the debug data on your local file system.

It is the third case where it isn't so clear what the correct defaults
are, because it is likely that you moved those contents from some
development environment to a dedicated server that now hosts them. In
this case you don't want to accidentally share referenced files that
you didn't explicitly copy over to the server.

To make it easier to distinguish these cases I split the path
processing so a user can easily set how much they want to share. I
prefer defaults that let you easily combine case 1 and 3 without
over-sharing, which is what we do with the default debuginfod.service
setup. Then if you combine either with case 2 you would most likely set
things up so that your whole local setup/development environment is
shared and use -A. And if you do want case 2 and you are paranoid, you
can be explicit about what to share with -N and -S.

I think it makes sense to be explicit about which PATH is used for what
to make it easier to distinguish these different use cases.
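
As a rough illustration of the source whitelist behaviour I have in
mind, here is a small sketch. It is Python for brevity, not the code
from the patch; the SourcePolicy class and its field names are made up
for this example, and only the meaning of -F, -S, -N and -A follows the
description in this thread.

```python
import os
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourcePolicy:
    """Hypothetical model of the source whitelist; not the patch's code."""
    file_scan_dirs: List[str] = field(default_factory=list)     # -F directories
    extra_source_dirs: List[str] = field(default_factory=list)  # -S directories
    allow_all: bool = False    # -A: anything goes
    no_default: bool = False   # -N: do not whitelist the -F directories

    def whitelist(self) -> List[str]:
        dirs = list(self.extra_source_dirs)
        if not self.no_default:
            dirs.extend(self.file_scan_dirs)
        # Resolve symlinks and ".." so comparisons use canonical paths.
        return [os.path.realpath(d) for d in dirs]

    def may_serve(self, source_path: str) -> bool:
        if self.allow_all:
            return True
        real = os.path.realpath(source_path)
        # A file may be served only if it lives under a whitelisted directory.
        return any(os.path.commonpath([real, d]) == d for d in self.whitelist())


# Case 3 style setup: only what was copied to the server is shared.
policy = SourcePolicy(file_scan_dirs=["/srv/debug"])
print(policy.may_serve("/srv/debug/src/foo.c"))
print(policy.may_serve("/home/me/notes.txt"))
```

With these assumptions, a source path that a debug file refers to but
that lies outside the scanned or explicitly listed directories is never
served unless -A is given.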

> > > If you are certain that source file censorship needs to be in the
> > > code, I'd do it instead by adding just one option -S PATH to the code,
> > > which would act like a whitelist for -F source file retrievals.
> > > (There is no point to filtering -R rpm source files; those are only
> > > serviced from other indexed RPMs.)
> > 
> > By default all -F directories are already whitelisted. -S is just for
> > extra places where source could be found.
> We are speaking about hypothetical work, so "are already" is incorrect.
> There is no whitelist of source files from -F type searches "already".

That isn't hypothetical, that is what my patch implements.

> Contemplating a whitelist: it may easily be the case that the sources
> are relatively far from the build tree being scanned - indeed separate
> sources is how we recommend gnu tools be built.  Constructing the
> whitelist from the -F paths only is bound to be incomplete in this
> common usage scenario.

Right, this is case 2 above. And for that I expect people will simply
use -A to share their whole development environment/all local files
used in the build.

> > > So:
> > >     debuginfod -S /usr/src/debug -S /usr/include -F PATH1 PATH2 ... PATHn
> > > would restrict -F source service to the given paths, and
> > >     debuginfod -F PATH1 PATH2 
> > > would not, because normal people have trustworthy build systems etc.
> > 
> > I guess we differ on how trustworthy generated debug files are.
> All this work depends on debug files being trustworthy!  The man pages
> spell this out already.  Imagine a doctored debug file deliberately
> conflicting with a well-known buildid.  Or deliberately containing
> masses of garbage or harmful data.

Those are also cases that could use some more analysis, but they are
different from sharing a local file that you didn't intend to because
you only wanted to share specific files (because those are the only
files you installed), case 3 above.

> > What I would like is:
> > 
> > - By default only restrict the files served to those under the
> >   directory that the file-scanner uses (that is why I split the
> >   -R and -F cases).
> Why?  This is a tighter constraint than the problem statement at the
> top.  The only additional risk here would be from a
> file-scanner-found dwarf file that makes a source reference to a file
> in a directory that was already explicitly identified for RPM
> scanning, i.e., not a sensitive location.
> > - Have a more restrictive mode that simply doesn't add anything
> >   to the sources white list (that is -N in my patch).
> > - Have an anything goes mode (that is -A in my patch).
> > - Be able to whitelist more selectively (that is -S).
> IMHO, this is unnecessary complication.  Maybe you'll see this if you
> write out documentation and sample usage for all these cases.

OK, I can update the documentation and describe how the 3 usage
scenarios (or a combination of some of them) can be expressed with
these options.

> > If I understand you correctly (given your other email in reply to
> > why adding globbing support isn't enough), you also want a mode
> > where all extra arguments on the command line are interpreted as
> > "scannable" (either file based or rpm based).
> This is the normal behavior for unix tools.

But normal unix tools do just one thing. In this case the issue is that
debuginfod can do multiple things and might even combine them. So
treating all arguments as generic paths that are handed to both the
rpm-scanner and the file-scanner, which can be used for different
scenarios, seems to be what is confusing.

That said, I think it wouldn't be hard to add something so you can
simply say that all arguments are to be treated as either rpm dirs or
files dirs (--rpms-args/--files-args maybe).
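
A minimal sketch of how such a switch could classify the extra
arguments. Again this is Python for brevity; --rpms-args/--files-args
are only the tentative names floated above, and parse_paths is a
hypothetical helper, not the debuginfod option parser.

```python
import argparse
from typing import List, Tuple


def parse_paths(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Return (rpm_dirs, file_dirs) for a hypothetical command line."""
    parser = argparse.ArgumentParser(prog="debuginfod-sketch")
    # Exactly one of the two switches must be given, so every extra
    # argument has a single, explicit interpretation.
    kind = parser.add_mutually_exclusive_group(required=True)
    kind.add_argument("--rpms-args", dest="kind",
                      action="store_const", const="rpms")
    kind.add_argument("--files-args", dest="kind",
                      action="store_const", const="files")
    parser.add_argument("paths", nargs="*")
    ns = parser.parse_args(argv)
    rpm_dirs = ns.paths if ns.kind == "rpms" else []
    file_dirs = ns.paths if ns.kind == "files" else []
    return rpm_dirs, file_dirs


print(parse_paths(["--files-args", "/build/tree1", "/build/tree2"]))
```

The point of the sketch is only that the user picks one scanner for the
plain arguments instead of having them go to both.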

> > So I think the real issue is the splitting of -R and -F argument
> > parsing. If that is the case, maybe just picking a default for how to
> > interpret the extra arguments, as dirs for the file scanner or dirs for
> > the rpm scanner or both, might make us both happy?
> The branch code does "both", because it is simple.

So if we make things simple again, but you do have to choose whether
the arguments/dirs are for the rpms or files scanner, would that make
you happy?


