[PATCH v2 4/5] Provide access to non SEC_HAS_CONTENTS core file sections

Kevin Buettner <kevinb@redhat.com>
Thu May 21 15:09:37 GMT 2020


On Thu, 21 May 2020 15:23:00 +0100
Pedro Alves <palves@redhat.com> wrote:

> On 5/21/20 1:40 PM, Pedro Alves via Gdb-patches wrote:
> > On 5/21/20 8:50 AM, Kevin Buettner wrote:  
> >> On Wed, 20 May 2020 17:45:28 +0100
> >> Pedro Alves <palves@redhat.com> wrote:
> >>  
> >>> So, as discussed at <https://sourceware.org/pipermail/gdb-patches/2020-May/168826.html>,
> I still think that this is a heuristic, and that we're trading one
> >>> bug for another.  It does seem like the bug that we're fixing
> >>> is more important than the one we're introducing, so if there's really
> >>> nothing else in the core file that would distinguish the coremaker_ro
> >>> case from an anonymous mapping that should be read from the core (i.e.,
> >>> read as zeroes), then I suppose this is the best we can do, and the
> >>> patch looks good to me.  I would like to see the commit log and the
> >>> comments slightly adjusted to mention this, though.  
> >>
> >> On Linux, we're able to do "info proc mappings" when looking at a
> >> corefile.  Linux core files include a .note.linuxcore.file/pid section
> >> which contain the map data that you see when running "info proc
> >> mappings".
> >>
> >> It may be possible to use these mappings to provide a more accurate
> >> view of memory contents when working with a core file.  In addition
> >> to file backed data that we already know about from the executable and
> >> shared libraries, there may be additional file backed data that can
> >> be found via files listed in the .note section mentioned earlier.  Or,
> >> as I found while investigating one of those weird F27 mappings,
> >> there may be data from a shared library which can't be found in
> >> what GDB knows about when loading the shared library.  (In that
> >> particular case, the memory region in question contained some
> >> vtable data prior to being remapped - though I'm guessing about
> >> the remapping bit.)
> >>
> >> If you think it worth pursuing, I can look at using the core-provided
> >> proc mapping information to provide a more accurate view of memory
> >> when looking at a core dump.
> >>
> >> If I do this, is it still worth pushing this v2 series first?  Or should
> >> it wait until I attempt to incorporate info from the .note section?
> >>  
> >>> Could you remind me again why is it that the core dump includes a
> >>> load segment for coremaker_ro in the first place?  If this is
> >>> read-only data, why did the kernel decide to output a bss-like
> >>> load segment for it?  
> >>
> >> On F27 and F28, coremaker_ro was placed on a page that also had
> >> read/write data; it got included in with that data when the core file
> >> was written.
> >>  
> > 
> > Ah, right.  In this case, reading from the core should always work.
> >   
> >> On F29 and beyond (due to use of particular binutils versions/options
> >> which participated in making the executable), coremaker_ro was placed
> >> on a page with only read-only data.  In this case, the core file
> >> didn't include data for that page; it lets the debugger find that data
> >> from the executable.  
> > 
> > OK.  
> > 
> > I think I asked the wrong question.  Let's go back to what you described
> > in the commit log:
> >   
> >>    FAIL: gdb.base/corefile.exp: print coremaker_ro
> >>   
> >>    The reason for this is that all sections which have the BFD flag
> >>    SEC_ALLOC set, but for which SEC_HAS_CONTENTS is not set no longer
> >>    have zero size.  Some of these sections have data that can (and should)
> >>    be read from the executable.  (Sections for which SEC_HAS_CONTENTS
> >>    is set should be read from the core file; sections which do not have
> >>    this flag set need to either be read from the executable or, failing
> >>    that, from the core file using whatever BFD decides is the best value
> >>    to present to the user - it uses zeros.)
> >>    
> >>    At present, due to the way that the target strata are traversed when
> >>    attempting to access memory, the non-SEC_HAS_CONTENTS sections will be
> >>    read as zeroes from the process_stratum (which in this case is the
> >>    core file stratum) without first checking the file stratum, which is
> >>    where the data might actually be found.
> >>    
> >>    What we should be doing is this:
> >>    
> >>    - Attempt to access core file data for SEC_HAS_CONTENTS sections.
> >>    - Attempt to access executable file data if the above fails.
> >>    - Attempt to access core file data for non SEC_HAS_CONTENTS sections, if
> >>      both of the above fail.  
> > 
> > The question should have been -- why is there a !SEC_HAS_CONTENTS load segment
> > for the page that contains coremaker_ro in the core file, at all?  
> > Or rather, why did the kernel decide to output a .bss load segment with
> > no contents for that page?  If there wasn't one, then we wouldn't need this
> > heuristic.  This is looking like a kernel bug to me.  Like, if the mapping
> > was file backed and wasn't touched, then it should have been skipped.
> > If it was touched (or for some other reason that could justify dumping the
> > mapping), then the kernel should have included its current contents
> > in the dump, instead of indicating "no contents".  No?  
> 
> Blah, sorry, got distracted here while writing the above, and I didn't realize
> I still asked the wrong question.  I don't know why I talked
> about coremaker_ro above.  I meant to talk about those "Some of these sections
> have data that can (and should) be read from the executable.".  Why do we have
> segments for those in the core file?  Hopefully the rest of the
> comments make more sense now.
> 
> Thus, take 3:
> 
> The question should have been -- why are there SEC_ALLOC && !SEC_HAS_CONTENTS
> load segments for that data that should be read from the executable?
> Or rather, why did the kernel decide to output a .bss-like load segment with
> no contents for that mapping?  If there wasn't such a load segment
> in the core, then we wouldn't need this heuristic.  This is looking like a kernel
> bug to me.  Like, if the mapping was file backed and wasn't touched, then it
> should have been skipped.  If it was touched (or for some other reason
> that could justify dumping the mapping), then the kernel should have
> included its current contents in the dump, instead of
> indicating "no contents".  No?

The kernel simply dumps all of the memory that it knows about; it
doesn't try to filter anything out.  Should it determine that an area
has a file-based backing, or that it was never written to (or several
other conditions: MADV_DONTDUMP, I/O areas, etc.), it'll skip dumping
the contents, but it will still dump a program header for the area.

If you want to look for yourself, see elf_core_dump() in
fs/binfmt_elf.c in the kernel sources.  This is one of the loops that
iterate through the linked list of vm_area_struct structs for the
process:

	for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
			vma = next_vma(vma, gate_vma)) {
		unsigned long dump_size;

		dump_size = vma_dump_size(vma, cprm->mm_flags);
		vma_filesz[i++] = dump_size;
		vma_data_size += dump_size;
	}

vma_dump_size() contains the logic that decides whether an area's
contents will be dumped; beyond that, there's no filtering going on.
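To make the decision concrete, here's a simplified sketch of the kind
of checks vma_dump_size() performs.  This is my paraphrase, not the
actual kernel code: the struct, flag values, and helper fields below
are made up for illustration (the real code consults vma->vm_flags and
the per-process coredump_filter bits in mm_flags).

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's VM_* flag bits. */
#define VM_IO       0x1UL
#define VM_DONTDUMP 0x2UL

/* Hypothetical, simplified model of a vm_area_struct. */
struct vma_model {
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long vm_flags;
    bool file_backed;     /* vma->vm_file != NULL in the kernel */
    bool has_anon_pages;  /* the mapping was written to (COW'd) */
};

/* Returns how many bytes of contents to dump for this area:
   either the whole mapping, or 0 (header only). */
unsigned long model_dump_size(const struct vma_model *vma)
{
    /* I/O areas and MADV_DONTDUMP regions: header only. */
    if (vma->vm_flags & (VM_IO | VM_DONTDUMP))
        return 0;
    /* A file-backed mapping that was never written to can be
       re-read from the file, so its contents are skipped too. */
    if (vma->file_backed && !vma->has_anon_pages)
        return 0;
    /* Everything else gets its contents written out. */
    return vma->vm_end - vma->vm_start;
}
```

Note that even when this returns 0, the header-writing loop still
emits a PT_LOAD entry for the area; only the contents are omitted.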

Actually, it's better to look at the second (of three) loops; this one
writes the headers...

	/* Write program headers for segments dump */
	for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
			vma = next_vma(vma, gate_vma)) {
		struct elf_phdr phdr;

		phdr.p_type = PT_LOAD;
		phdr.p_offset = offset;
		phdr.p_vaddr = vma->vm_start;
		phdr.p_paddr = 0;
		phdr.p_filesz = vma_filesz[i++];
		phdr.p_memsz = vma->vm_end - vma->vm_start;
		offset += phdr.p_filesz;
		phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0;
		if (vma->vm_flags & VM_WRITE)
			phdr.p_flags |= PF_W;
		if (vma->vm_flags & VM_EXEC)
			phdr.p_flags |= PF_X;
		phdr.p_align = ELF_EXEC_PAGESIZE;

		if (!dump_emit(cprm, &phdr, sizeof(phdr)))
			goto end_coredump;
	}
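So the upshot for the debugger is that a PT_LOAD header with
p_filesz < p_memsz describes memory whose contents are partly or
wholly absent from the core file.  Here's a rough sketch of how a core
reader along the lines of BFD might carve such a header into sections;
the function, struct, and the SEC_* bit values are illustrative
stand-ins, not BFD's actual code:

```c
#include <assert.h>

/* Stand-in section flag bits, not BFD's real values. */
#define SEC_ALLOC        0x1u
#define SEC_HAS_CONTENTS 0x2u

struct core_section { unsigned long size; unsigned flags; };

/* Carve one PT_LOAD program header from a core file into up to two
   section descriptions.  Fills out[] and returns the count. */
int split_core_segment(unsigned long p_filesz, unsigned long p_memsz,
                       struct core_section out[2])
{
    int n = 0;
    if (p_filesz > 0) {
        /* Bytes actually present in the core file. */
        out[n].size = p_filesz;
        out[n].flags = SEC_ALLOC | SEC_HAS_CONTENTS;
        n++;
    }
    if (p_memsz > p_filesz) {
        /* The kernel wrote a header but skipped the contents;
           this part is SEC_ALLOC without SEC_HAS_CONTENTS. */
        out[n].size = p_memsz - p_filesz;
        out[n].flags = SEC_ALLOC;
        n++;
    }
    return n;
}
```

The second case is exactly where the v2 series' fallback order
matters: such sections should be satisfied from the executable first,
and only read (as zeroes) from the core if that fails.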

Kevin


