Bug 16092 - The 'gdb' 'gcore' command ignores coredump_filter, and 'madvise(,,MADV_DONTDUMP)'.
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: corefiles
Version: 7.6
Importance: P2 normal
Target Milestone: ---
Assignee: Sergio Durigan Junior
URL:
Keywords:
Depends on: 11608
Blocks:
 
Reported: 2013-10-26 12:28 UTC by Jeff Byers
Modified: 2015-03-31 23:40 UTC (History)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Jeff Byers 2013-10-26 12:28:44 UTC
The 'gdb' 'gcore' command ignores coredump_filter, and 'madvise(,,MADV_DONTDUMP)'.

On Linux x86_64, using 'gdb' version:

  GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)

and also the latest version of 'gdb' '7.6.1' built from
source, the 'gcore' command used to take a "live" core of a
process ignores the Linux '/proc/PID/coredump_filter' bit
settings, and also the 'madvise(,,MADV_DONTDUMP)' settings on
'mmap'ed memory, and always dumps the entire address space
regardless of attempts to limit it.

Note that crash cores do not do this, and obey the filter
and madvise settings.

This is a real problem when the memory allocations are
large, and especially when there are large files mapped, but
sparsely accessed. Using 'gdb' and 'gcore' causes the
complete file to be read in and included in the core.

There may be cases where overriding the coredump_filter and
'madvise(,,MADV_DONTDUMP)' settings is useful, but there should
also be a way to obey these settings and avoid monstrously
huge 'gcore'-generated core files.
Comment 1 Sergio Durigan Junior 2015-03-05 20:43:38 UTC
Patch posted: https://sourceware.org/ml/gdb-patches/2015-03/msg00144.html
Comment 2 cvs-commit@gcc.gnu.org 2015-03-31 23:36:05 UTC
The master branch has been updated by Sergio Durigan Junior <sergiodj@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=df8411da087dc05481926f4c4a82deabc5bc3859

commit df8411da087dc05481926f4c4a82deabc5bc3859
Author: Sergio Durigan Junior <sergiodj@redhat.com>
Date:   Tue Mar 31 19:32:34 2015 -0400

    Implement support for checking /proc/PID/coredump_filter
    
    This patch, as the subject says, extends GDB so that it is able to use
    the contents of the file /proc/PID/coredump_filter when generating a
    corefile.  This file contains a bit mask that is a representation of
    the different types of memory mappings in the Linux kernel; the user
    can choose to dump or not dump a certain type of memory mapping by
    enabling/disabling the respective bit in the bit mask.  Currently,
    here is what is supported:
    
      bit 0  Dump anonymous private mappings.
      bit 1  Dump anonymous shared mappings.
      bit 2  Dump file-backed private mappings.
      bit 3  Dump file-backed shared mappings.
      bit 4 (since Linux 2.6.24)
             Dump ELF headers.
      bit 5 (since Linux 2.6.28)
             Dump private huge pages.
      bit 6 (since Linux 2.6.28)
             Dump shared huge pages.
    
    (This table has been taken from core(5), but you can also read about it
    on Documentation/filesystems/proc.txt inside the Linux kernel source
    tree).
    
    The default value for this file, used by the Linux kernel, is 0x33,
    which means that bits 0, 1, 4 and 5 are enabled.  This is also the
    default for GDB implemented in this patch, FWIW.
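
As a rough illustration of the bit mask described above, the following Python sketch decodes a coredump_filter value into the mapping types it enables. It is not GDB's code: the names COREDUMP_FILTER_BITS, parse_coredump_filter and decode_coredump_filter are made up for this example, and the parser assumes the file holds a hexadecimal number.

```python
# Bit positions as listed in the table above (taken from core(5)).
COREDUMP_FILTER_BITS = {
    0: "anonymous private mappings",
    1: "anonymous shared mappings",
    2: "file-backed private mappings",
    3: "file-backed shared mappings",
    4: "ELF headers",
    5: "private huge pages",
    6: "shared huge pages",
}


def parse_coredump_filter(text):
    """Parse the file contents, assumed to be a hexadecimal number."""
    return int(text.strip(), 16)


def decode_coredump_filter(value):
    """Return the descriptions of the bits that are set in 'value'."""
    return [name for bit, name in COREDUMP_FILTER_BITS.items()
            if value & (1 << bit)]


if __name__ == "__main__":
    for description in decode_coredump_filter(0x33):
        print(description)
```

With the default value 0x33, only the entries for bits 0, 1, 4 and 5 are returned, which matches the description above.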
    
    Well, reading the file is obviously trivial.  The hard part, mind you,
    is how to determine the types of the memory mappings.  For that, I
    extended the code of gdb/linux-tdep.c:linux_find_memory_regions_full and
    made it rely *much more* on the information gathered from
    /proc/<PID>/smaps.  This file contains a "verbose dump" of the
    inferior's memory mappings, and we were not using as much information as
    we could from it.  If you want to read more about this file, take a look
    at the proc(5) manpage (I will also write a blog post soon about
    everything I had to learn to get this patch done, and when it is ready
    I will post it here).
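
To give an idea of what gathering information from /proc/<PID>/smaps involves, here is a small Python sketch that splits smaps text into one record per mapping. It is only an approximation of the file format described in proc(5), not a transcription of linux_find_memory_regions_full; the function name parse_smaps, the record layout and the SAMPLE text are all invented for illustration.

```python
import re

# Header line of each entry: address range, permissions, offset,
# device, inode and an optional file name.
HEADER_RE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S{4})\s+([0-9a-f]+)\s+(\S+)\s+(\d+)\s*(.*)$"
)


def parse_smaps(text):
    """Return a list of dicts, one per memory mapping found in 'text'."""
    mappings = []
    current = None
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            current = {
                "start": int(match.group(1), 16),
                "end": int(match.group(2), 16),
                "perms": match.group(3),
                "filename": match.group(7).strip(),
                "fields": {},      # e.g. {"Anonymous": 12} (values in kB)
                "vmflags": None,   # stays None if the kernel lacks VmFlags
            }
            mappings.append(current)
        elif current is not None and ":" in line:
            key, _, rest = line.partition(":")
            if key == "VmFlags":
                current["vmflags"] = rest.split()
            else:
                parts = rest.split()
                if parts and parts[0].isdigit():
                    current["fields"][key] = int(parts[0])
    return mappings


# Made-up sample input, shaped like two smaps entries.
SAMPLE = """\
00400000-0048a000 r-xp 00000000 fd:03 960637        /bin/bash
Size:                552 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
VmFlags: rd ex mr mw me dw
7f0000000000-7f0000021000 rw-p 00000000 00:00 0
Size:                132 kB
Anonymous:            12 kB
AnonHugePages:         0 kB
VmFlags: rd wr mr mw me ac dd
"""

if __name__ == "__main__":
    for mapping in parse_smaps(SAMPLE):
        print(mapping["perms"], repr(mapping["filename"]), mapping["vmflags"])
```

The "vmflags" entry is kept as None when no VmFlags: line is seen, so that a caller can tell an older kernel apart from a mapping with no flags.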
    
    With Oleg Nesterov's help, we could improve the current algorithm for
    determining whether a memory mapping is anonymous/file-backed,
    private/shared.  GDB now also respects the MADV_DONTDUMP flag and does
    not dump the memory mapping marked as so, and will always dump
    "[vsyscall]" or "[vdso]" mappings (just like the Linux kernel).
    
    In a nutshell, what the new code is doing is:
    
    - If the mapping is associated to a file whose name ends with
      " (deleted)", or if the file is "/dev/zero", or if it is "/SYSV%08x"
      (shared memory), or if there is no file associated with it, or if
      the AnonHugePages: or the Anonymous: fields in the /proc/PID/smaps
      have contents, then GDB considers this mapping to be anonymous.
      There is a special case in this, though: if the memory mapping is a
      file-backed one, but *also* contains "Anonymous:" or
      "AnonHugePages:" pages, then GDB considers this mapping to be *both*
      anonymous and file-backed, just like the Linux kernel does.  What
      that means is simple: this mapping will be dumped if the user
      requested anonymous mappings *or* if the user requested file-backed
      mappings to be present in the corefile.
    
      It is worth mentioning that, from all those checks described above,
      the most fragile is the one to see if the file name ends with
      " (deleted)".  This does not necessarily mean that the mapping is
      anonymous, because the deleted file associated with the mapping may
      have been a hard link to another file, for example.  The Linux
      kernel checks to see if "i_nlink == 0", but GDB cannot easily do
      this check (as it has been discussed, GDB would need to run as root,
      and would need to check the contents of the /proc/PID/map_files/
      directory in order to determine whether the deleted file was a hard link
      or not).  Therefore, we made a compromise here, and we assume that
      if the file name ends with " (deleted)", then the mapping is indeed
      anonymous.  FWIW, this is something the Linux kernel could do
      better: expose this information in a more direct way.
    
    - If we see the flag "sh" in the VmFlags: field (in /proc/PID/smaps),
      then certainly the memory mapping is shared (VM_SHARED).  If we have
      access to the VmFlags, and we don't see the "sh" there, then
      certainly the mapping is private.  However, older Linux kernels (see
      the code for more details) do not have the VmFlags field; in that
      case, we use another heuristic: if we see 'p' in the permission
      flags, then we assume that the mapping is private, even though the
      presence of the 's' flag there would mean VM_MAYSHARE, which means
      the mapping could still be private.  This should work OK enough,
      however.
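
The two heuristics above, together with the MADV_DONTDUMP and "[vdso]"/"[vsyscall]" rules, can be sketched in Python roughly as follows. This is a simplified illustration and not the code added to linux-tdep.c: the function names, the regular expression for "/SYSV%08x" names, and the use of the "dd" VmFlags entry to detect MADV_DONTDUMP are assumptions of this example, and the ELF header and huge page bits are ignored.

```python
import re

# Assumed shape of SysV shared memory names ("/SYSV%08x").
SYSV_RE = re.compile(r"^/SYSV[0-9a-fA-F]{8}$")


def classify_backing(filename, anonymous_kb=0, anon_huge_kb=0):
    """Return (is_anonymous, is_file_backed) for one mapping."""
    name_is_anon = (
        filename == ""
        or filename.endswith(" (deleted)")
        or filename == "/dev/zero"
        or SYSV_RE.match(filename) is not None
    )
    has_anon_pages = anonymous_kb > 0 or anon_huge_kb > 0
    # A real file that also has anonymous pages counts as both.
    return (name_is_anon or has_anon_pages), (not name_is_anon)


def mapping_is_private(perms, vmflags=None):
    """Use VmFlags when available, else fall back to the permissions."""
    if vmflags is not None:
        return "sh" not in vmflags
    return "p" in perms


def should_dump(filename, perms, filter_bits, vmflags=None,
                anonymous_kb=0, anon_huge_kb=0):
    """Decide whether a mapping goes into the corefile (bits 0-3 only)."""
    if filename in ("[vdso]", "[vsyscall]"):
        return True                      # always dumped
    if vmflags is not None and "dd" in vmflags:
        return False                     # marked with MADV_DONTDUMP
    is_anon, is_file = classify_backing(filename, anonymous_kb, anon_huge_kb)
    private = mapping_is_private(perms, vmflags)
    anon_bit = 0 if private else 1       # anonymous private / shared
    file_bit = 2 if private else 3       # file-backed private / shared
    dump = False
    if is_anon and filter_bits & (1 << anon_bit):
        dump = True
    if is_file and filter_bits & (1 << file_bit):
        dump = True
    return dump


if __name__ == "__main__":
    print(should_dump("", "rw-p", 0x33, vmflags=["rd", "wr"]))
    print(should_dump("/usr/lib/libc.so.6", "r-xp", 0x33, vmflags=["rd", "ex"]))
```

Returning two flags from classify_backing keeps the "both anonymous and file-backed" special case explicit: such a mapping is dumped if either the anonymous or the file-backed bit for its sharing mode is set.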
    
    Finally, it is worth mentioning that I added a new command, 'set
    use-coredump-filter on/off'.  When it is 'on', it will read the
    'coredump_filter' file (if it exists) and use its value; otherwise, it
    will use the default value mentioned above (0x33) to decide which memory
    mappings to dump.
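
The selection between the file's value and the default could look roughly like the Python sketch below. The function effective_filter and its proc_root parameter are hypothetical (the latter exists only so the example can be pointed at a test directory), and treating an unreadable or malformed file like a missing one is an assumption of this sketch rather than documented GDB behaviour.

```python
import os

DEFAULT_FILTER = 0x33  # default value mentioned above


def effective_filter(pid, use_coredump_filter=True, proc_root="/proc"):
    """Return the bit mask used to decide which mappings to dump."""
    if not use_coredump_filter:
        return DEFAULT_FILTER
    path = os.path.join(proc_root, str(pid), "coredump_filter")
    try:
        with open(path) as handle:
            # The file is assumed to contain a hexadecimal number.
            return int(handle.read().strip(), 16)
    except (OSError, ValueError):
        # Missing, unreadable or malformed file: use the default.
        return DEFAULT_FILTER


if __name__ == "__main__":
    print(hex(effective_filter(os.getpid())))
```

In a GDB session, the corresponding step is to issue 'set use-coredump-filter on' or 'set use-coredump-filter off' before running 'gcore'.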
    
    gdb/ChangeLog:
    2015-03-31  Sergio Durigan Junior  <sergiodj@redhat.com>
    	    Jan Kratochvil  <jan.kratochvil@redhat.com>
    	    Oleg Nesterov  <oleg@redhat.com>
    
    	PR corefiles/16092
    	* linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
    	New enum identifying the various options of the coredump_filter
    	file.
    	(struct smaps_vmflags): New struct.
    	(use_coredump_filter): New variable.
    	(decode_vmflags): New function.
    	(mapping_is_anonymous_p): Likewise.
    	(dump_mapping_p): Likewise.
    	(linux_find_memory_regions_full): New variables
    	'coredumpfilter_name', 'coredumpfilterdata', 'pid', 'filterflags'.
    	Removed variable 'modified'.  Read /proc/<PID>/smaps file; improve
    	parsing of its information.  Implement memory mapping filtering
    	based on its contents.
    	(show_use_coredump_filter): New function.
    	(_initialize_linux_tdep): New command 'set use-coredump-filter'.
    	* NEWS: Mention the possibility of using the
    	'/proc/PID/coredump_filter' file when generating a corefile.
    	Mention new command 'set use-coredump-filter'.
    
    gdb/doc/ChangeLog:
    2015-03-31  Sergio Durigan Junior  <sergiodj@redhat.com>
    
    	PR corefiles/16092
    	* gdb.texinfo (gcore): Mention new command 'set
    	use-coredump-filter'.
    	(set use-coredump-filter): Document new command.
    
    gdb/testsuite/ChangeLog:
    2015-03-31  Sergio Durigan Junior  <sergiodj@redhat.com>
    
    	PR corefiles/16092
    	* gdb.base/coredump-filter.c: New file.
    	* gdb.base/coredump-filter.exp: Likewise.
Comment 3 Sergio Durigan Junior 2015-03-31 23:40:32 UTC
Fixed.