This is the mail archive of the mailing list for the GDB project.

Re: [PATCH] Support gzip compressed exec and core files in gdb

On 03/12/2015 12:40 AM, Michael Eager wrote:
> On 03/11/15 15:13, Jan Kratochvil wrote:
>> On Wed, 11 Mar 2015 00:01:42 +0100, Michael Eager wrote:
>>> Add support to automatically unzip compressed executable and core files.
>>> Files will be uncompressed into temporary directory (/tmp or $TMPDIR)
>>> and are deleted when GDB exits.
>> Such feature has been requested to support xz-compressed core files as
>> currently being stored by systemd.  But to make it more convenient one should
>> decompress on-demand only the blocks of file that are really accessed by GDB
>> - expecting by bfd_iovec.  Obviously GDB usually needs to access only small
>> part of the whole core file.
>> I did not check how it is supported by gzip but for xz one needs to use
>> --block-size, otherwise the file blocks cannot be decompressed independently
>> in random access way.
> gzip is not compressed block-by-block.  As far as I can tell, you need to
> decompress starting from the beginning of the file.
>> ISTM libz-gzip and liblzma-xz compatibility is mutually exclusive.
> I don't know why they would be incompatible, but support for an on-demand
> block-compression scheme would be significantly different.  Decompressing
> an xz file by making a copy (as is done for gzip) would be a simple extension
> to the current patch.

I won't strongly object if others want to approve it, but IMO, having GDB
decompress the whole file to temp doesn't add that much convenience over
decompressing it outside GDB.  Let me explore and expand (below).

We need to weigh the convenience of having gdb do this against the cost
of maintaining this inside BFD+GDB going forward.

If you're loading the core just once to extract a backtrace, in an
automated fashion, you can simply decompress before loading the core
with a trivial script.  So the convenience added for this use case
is not significant.

If OTOH you're doing interactive debugging of the core dump, then it's
convenient to be able to skip manual/laborious steps.

However, if people are compressing cores, it's because they are big.

As a quick experiment, I ran 'gcore `pidof firefox`' and generated
a core dump of the firefox process that I have running.  That resulted
in a 4.5GB core dump.  I gzipped it, which shrank it to 4.5MB,
a factor of ~1000:1.  Then I timed gunzipping it.  It took almost
2 (two) minutes.

If you're doing interactive debugging (either CLI, or GUI) you'll
likely end up starting gdb multiple times, and thus load the core multiple
times into gdb, and each of those invocations will result in a
slow decompression of the whole file.

Waiting for GDB to decompress that once is already painful.  Waiting for it
multiple times likely results in cursing and swearing at gdb's slow start
up.  Smart users will realize that and end up decompressing the file manually
outside gdb, just once, thus saving time anyway.

We could "fix" the "multiple times" issue by adding even more smarts,
based on an already-decompressed-files cache or some such.  Though of
course, more smarts, more code to maintain.
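To make the "already-decompressed-files cache" idea concrete, here is a
rough sketch in Python.  It is not what the patch does; the function name
cached_decompress, the cache directory name, and the choice of keying on
path + size + mtime are all assumptions made for illustration.

```python
import gzip
import hashlib
import os
import shutil
import tempfile


def cached_decompress(gz_path, cache_dir=None):
    """Return the path of a decompressed copy of gz_path.

    Hypothetical helper: reuses an earlier copy when the compressed
    file's path, size and mtime are unchanged.
    """
    if cache_dir is None:
        # Assumed location; gettempdir() honours $TMPDIR.
        cache_dir = os.path.join(tempfile.gettempdir(), "gdb-core-cache")
    os.makedirs(cache_dir, exist_ok=True)

    st = os.stat(gz_path)
    key_src = "%s:%d:%d" % (os.path.abspath(gz_path), st.st_size,
                            st.st_mtime_ns)
    out_path = os.path.join(cache_dir,
                            hashlib.sha256(key_src.encode()).hexdigest())

    if not os.path.exists(out_path):
        # Decompress into a temporary name and rename afterwards, so an
        # interrupted run never leaves a truncated file under the key.
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "wb") as dst, gzip.open(gz_path, "rb") as src:
            shutil.copyfileobj(src, dst)
        os.replace(tmp_path, out_path)
    return out_path
```

Even this small version raises the maintenance questions mentioned above:
nothing here evicts old entries, and a 4.5GB file per cached core adds up
quickly.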

I agree with Jan -- The real convenience would be being able to skip the
long whole-file decompression step altogether, with an on-demand
block-decompress scheme, because gdb in reality doesn't need to touch
the vast majority of the core dump's contents.  That would
be a solution that I'd be happy to see implemented.
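To illustrate what "on-demand block-decompress" means, here is a toy
sketch in Python.  It does not read gzip or xz files: it assumes a made-up
layout in which each fixed-size block is compressed independently with
zlib, which is the property Jan says xz needs --block-size for.  The names
compress_blocks and BlockReader and the 64KB block size are assumptions.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # assumed uncompressed block size


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each block on its own; return the list of compressed blocks."""
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


class BlockReader:
    """Serve (offset, size) reads by decompressing only the blocks touched."""

    def __init__(self, blocks, block_size=BLOCK_SIZE):
        self.blocks = blocks
        self.block_size = block_size
        self.decompressed = 0  # number of block decompressions performed

    def read(self, offset, size):
        if size <= 0:
            return b""
        first = offset // self.block_size
        last = (offset + size - 1) // self.block_size
        last = min(last, len(self.blocks) - 1)
        chunks = []
        for idx in range(first, last + 1):
            chunks.append(zlib.decompress(self.blocks[idx]))
            self.decompressed += 1
        joined = b"".join(chunks)
        start = offset - first * self.block_size
        return joined[start:start + size]
```

A reader like this is the sort of thing that could sit behind a BFD
iovec-style read callback, so that a small read costs one or two block
decompressions instead of a pass over the whole file.  A real
implementation would also want to cache recently used blocks.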

If we're just decompressing to /tmp, then we also need to
compare the benefits of a built-in solution against having users
do the same with a user-provided gdb command implemented in one
of gdb's extension languages (python, scheme), e.g., a script
that adds a "decompress-core" command that does the same:
decompresses whatever compression format, and loads the result
with "core /tmp/FILE".
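For reference, such a user-provided command could look roughly like the
Python sketch below.  It picks the decompressor from the file suffix
(.gz, .xz, .bz2), which is an assumption; the helper name
decompress_to_temp is also made up.  The gdb part is only registered when
the script runs inside gdb.

```python
import bz2
import gzip
import lzma
import os
import shutil
import tempfile

# Suffix -> opener; selecting by suffix is an assumption of this sketch.
OPENERS = {".gz": gzip.open, ".xz": lzma.open, ".bz2": bz2.open}


def decompress_to_temp(path):
    """Decompress PATH into the temp directory ($TMPDIR or /tmp)."""
    base, ext = os.path.splitext(os.path.basename(path))
    opener = OPENERS.get(ext)
    if opener is None:
        raise ValueError("unsupported compression suffix: %r" % ext)
    fd, out_path = tempfile.mkstemp(prefix=base + ".")
    with os.fdopen(fd, "wb") as dst, opener(path, "rb") as src:
        shutil.copyfileobj(src, dst)
    return out_path


try:
    import gdb  # only available when run inside gdb
except ImportError:
    gdb = None

if gdb is not None:
    class DecompressCore(gdb.Command):
        """decompress-core FILE: decompress FILE and load it as a core."""

        def __init__(self):
            super().__init__("decompress-core", gdb.COMMAND_FILES,
                             gdb.COMPLETE_FILENAME)

        def invoke(self, arg, from_tty):
            out_path = decompress_to_temp(arg.strip())
            gdb.execute("core-file " + out_path, from_tty)

    DecompressCore()
```

Usage would be "source decompress-core.py" followed by
"decompress-core /path/to/core.xz".  Unlike the patch, this sketch does not
delete the temporary file when gdb exits.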

IMO, whatever the solution, if built in, this is best implemented
in BFD, so that objdump can dump the same files gdb can.

Pedro Alves
