Should strip discard the .ctf section?

Nick Alcock nick.alcock@oracle.com
Mon Oct 7 18:53:00 GMT 2019


On 6 Oct 2019, Fangrui Song said:
> On 2019-10-05, Nick Alcock wrote:
>>[sorry about the length of this :( ]
>>> Add a generic --keep-section option and let user specify
>>> --keep-section=.ctf if .ctf is to be retained. See
>>> https://llvm.org/docs/CommandGuide/llvm-objcopy.html#cmdoption-llvm-objcopy-keep-section
>>
>>(I don't really see the relevance of LLVM documentation here: this is
>>not LLVM, this is binutils. Also you're citing objcopy documentation,
>>not strip, though llvm-strip has a similar option. Binutils strip *does
>>not*. Its defaults are a *minimum* set, and as far as I know you cannot
>>say "oh actually I didn't want to remove section X after all", except
>>for one special-purpose option for source filename sections, which
>>doesn't apply to .ctf.)
>
> llvm-objcopy/llvm-strip have the option but objcopy/strip don't.
> I propose that binutils implements --keep-section= .
> It will solve not only the current .ctf problem, but also future
> problems with other sections.

Maybe, but I'm... not aware that there is actually a problem that needs
a new flag to solve it here. There is a default that you dislike, but I
don't actually know why you dislike it: if it leads to worse behaviour
for end-users I'm not aware of it. That users get to keep .ctf sections
once they explicitly ask for them to be generated seems like a *benefit*
to me.

>>The argument from utility is that the whole point of CTF is that it's
>>compact enough that it can be left in binaries, so that programs can
>>rely on its being present: if leaving it in is not as uncontroversial as
>>leaving in .eh_frame, it is too large. Right now it is about 2% of text
>>size, i.e. about the same size as .eh_frame: very soon I expect it to be
>>much smaller because that 2% is the size before we're doing any type
>>deduplication. Is this *really* too high a price to pay for type
>>introspection into all C programs on the system? I am biased here
>>(obviously) but I'd vote no, given modern disk sizes (RAM sizes are
>>irrelevant, since the section is non-loaded).
>
> .eh_frame has the SHF_ALLOC flag. objcopy --strip* and strip should not strip SHF_ALLOC
> sections by default.
>
> Whether .ctf should be stripped by objcopy --strip-debug/objcopy
> --strip-all/strip --strip-all/strip --strip-debug depends on whether
> .ctf has the SHF_ALLOC flag.

Well... as you yourself have noted, there are other constraints: it's
not just a matter of whether a section is allocated, or strip would
render binaries it was run over useless (since there are a bunch of
non-loaded sections without which you can't load a binary at all). IIRC,
it actually depends on whether the section is marked as debuginfo
(SEC_DEBUGGING). CTF is not debuginfo, so it is not stripped out.
If you specify strip -s, it is removed (I think: I haven't tested this
case much, but strip -s is supposed to strip out everything and I didn't
touch that code so I imagine it still does).
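
To make the difference between the two rules concrete, here is a toy
model in Python. It is only a sketch of the argument in this thread, not
binutils code: the Section type, the flag fields and both helper
functions are hypothetical, and the flags are the usual ones for these
sections rather than values read from a real binary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    name: str
    alloc: bool      # stands in for SHF_ALLOC
    debugging: bool  # stands in for BFD's SEC_DEBUGGING


# Illustrative section list; flags are typical, not read from a file.
SECTIONS = [
    Section(".text", alloc=True, debugging=False),
    Section(".eh_frame", alloc=True, debugging=False),
    Section(".shstrtab", alloc=False, debugging=False),
    Section(".debug_info", alloc=False, debugging=True),
    Section(".ctf", alloc=False, debugging=False),
]


def kept_by_alloc_rule(sections):
    """Hypothetical rule: keep a section only if it is allocated."""
    return [s.name for s in sections if s.alloc]


def kept_by_debugging_rule(sections):
    """Hypothetical rule: drop a section only if it is debuginfo."""
    return [s.name for s in sections if not s.debugging]


alloc_only = kept_by_alloc_rule(SECTIONS)
debugging_only = kept_by_debugging_rule(SECTIONS)
```

Under the allocation-only rule .shstrtab goes away along with .ctf,
which is why allocation alone cannot be the whole story. Under the
debugging-flag rule only .debug_info is dropped and .ctf survives.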

> A noteworthy omission from the CTF patch series is that there is no test
> of the readelf -S/objdump -h output.

There are no tests of CTF stuff at all yet, and I completely agree that
there should be some. I was going to add a bunch of CTF tests after
doing the deduplicator. Before that, the test output is ludicrously
voluminous and has a habit of varying its order for no good reason; the
deduplicator will fix all of that and make testing the thing practical.
(I'm not sure why -S/-h in particular are considered so much more
important than the actual *content* of the section :) )

>>If we *do* strip out CTF by default, no-one will ever use it, because
>>packaging systems will routinely strip it out into the debuginfo
>>packages so it is never there. I've seen how hard it is to avoid
>>stripping this sort of thing out when packaging if strip does it by
>>default: when I was putting .ctf sections into kernel modules, I had to
>>*patch parts of RPM at runtime* to do it, and I had to do this on a
>>package-by-package basis. This is not something any sane person is ever
>>going to do, so in practice if strip(1) strips out .ctf, nobody who
>>wants their program to work when packaged by any major distro is ever
>>going to be able to use it, unless they're writing a debugger.
>
> If .ctf does not have the SHF_ALLOC flag, like DWARF .debug_*, I think
> the right behavior here is to add --keep-section=.ctf to the packaging
> systems where strip/strip -s is used. This does not require fixing more
> packages, because you already have to add -gt to CFLAGS.

Most packaging systems do not make it easy to pass arbitrary flags to
strip(1).

The most practical way I've found to do this in RPM involves patching
RPM at runtime (you have to dig up parts of the shell scripts RPM
invokes from /usr/lib/rpm, copy them into the build tree, patch them,
and hack the internal macros that invoke the script out of /usr/lib/rpm
so that they use your patched copy instead -- and the necessary patch is
RPM-version-dependent, as are the names of the macros you have to
replace).

You'll pardon me if I don't want to inflict *that* on end-users.

> Then, I don't think stripping .ctf by default will increase the
> deployment cost.

I have practical experience of this, and the scars. It does. :(

>>People who really *need* to save that last 2%-or-less of space can
>>always remove the .ctf section explicitly with objcopy if they need to
>>do so. (Though in practice all such people will just *not pass -gt* at
>>compile time. So I suspect nobody will ever do this.)
>>
>>> What sections are stripped and what are not are pretty complicated now.
>>
>>So... surely one extra rule is not a devastating cognitive load, I'd
>>have thought. Particularly if the increase in binary size is so small
>>that you can ignore it (and if it's not, I'm doing something wrong).
>
> Cognitive load and conceptual integrity is a big one.  The rules of

I'm trying to steer a middle road here.

On one side you can treat this just like debuginfo, and then it ends up
useless in the debuginfo packages and you can't actually use it unless
you jump through fairly horrifying hoops to get the right flags passed
to strip(1), even though you *already said* you wanted CTF when you
passed -gt to the compiler.

On the other side you can treat this just like other stuff that is part
of the program, in which case you mark it as loaded. Suddenly it
requires a massive linker rewrite if it is to retain anything like the
current format (compressed, dependent on the contents of the ELF string
and symbol tables so we can eliminate dups and reshuffle some of our
stuff into symtab order). You also can't refer to it the same way when
you open it as you can now (i.e. by filename), so all users have to
bifurcate their code paths to add a 'maybe I am looking at my own types'
loading pathway -- or just keep using the filename and totally ignore
the fact that the section is loaded, in which case why did we do all
this?

Actually that's another strike against making CTF into a loaded section.
To use loaded-section CTF, you'd either have to keep it uncompressed
always (when part of the design intent of this is that it's almost
always compressed when not in use), or uncompress it before use. The
latter would render having it in a loaded section actually *less
efficient* than having it in a non-loaded section: the decompressor is
necessarily going to uncompress the whole thing at once, but it would be
reading the thing in via a run of adjacent page faults rather than one
big read(). In fact, even if it's uncompressed, libctf starts out by
traversing the whole thing, incurring the same nontrivial cost.
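
A rough Python illustration of that point, assuming a zlib-style
whole-buffer decompressor. The payload is a made-up stand-in for a
section body, not real CTF data, and the file name is hypothetical. The
sketch shows only that both routes end in the same full decompression;
it does not measure the page-fault cost.

```python
import mmap
import os
import tempfile
import zlib

# Stand-in for a compressed section body; real CTF data is not used here.
payload = b"struct foo { int a; long b; };\n" * 4096
compressed = zlib.compress(payload)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "section.bin")
    with open(path, "wb") as f:
        f.write(compressed)

    # Non-loaded style: one big read(), then decompress everything.
    with open(path, "rb") as f:
        via_read = zlib.decompress(f.read())

    # Loaded style: the bytes are paged in as the decompressor walks the
    # mapping, but the whole thing is still decompressed in one go.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            via_mmap = zlib.decompress(mm)

same_result = via_read == via_mmap == payload
```

Either way the consumer ends up holding a fully decompressed copy, so
mapping the compressed bytes saves nothing over reading them.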

So having CTF be non-loaded and stripped out makes it useless (and it is
almost impossible *not* to use the default strip options in most popular
packaging systems), while having it be loaded reduces efficiency,
increases complexity for all users that don't just *ignore* the fact
that it's loaded, and requires a massive rewrite of the linker.

So I decided not to take either path, since I don't like pointless pain
that much. Non-loaded, non-stripped is the only practical approach,
unless you can find some way to make the problems above less enormous.

>                           Just follow intuition/general principles and
> we will need fewer options to cover every use case.

The problem with following intuition is that everybody's intuition
differs. I didn't think this design was non-intuitive, merely the only
option that didn't hit intolerable practical problems.

-- 
NULL && (void)