This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [PATCH 2/2] Documentation and testcase


On Monday, March 23 2015, Pedro Alves wrote:

> On 03/23/2015 09:08 PM, Sergio Durigan Junior wrote:
>> On Monday, March 23 2015, Pedro Alves wrote:
>> 
>>> On 03/22/2015 08:45 PM, Sergio Durigan Junior wrote:
>>>
>>>> +# We do not do file-backed mappings in the test program, but it is
>>>> +# important to test this anyway.  One way of performing the test is to
>>>> +# load GDB with a corefile but without a binary, and then ask for the
>>>> +# disassemble of a function (i.e., the binary's .text section).  GDB
>>>> +# should fail in this case.  However, it must succeed if the binary is
>>>> +# provided along with the corefile.  This is what we test here.
>>>
>>> It seems like we now just miss the case of corefilter that _does_ request
>>> that the file backed regions are dumped.  In that case, disassembly
>>> should work without the binary.  Could you add that too, please?  We
>>> can e.g., pass a boolean parameter to test_disasm to specify whether
>>> to expect that disassembly works without a program file.
>> 
>> Hm, I'm afraid there's a bit of confusion here, at least from my part.
>> 
>> I am already testing the case when we use a value that requests that
>> file-backed regions are dumped.  If you take a look at the
>> "all_anon_corefiles" list, you will see that each corefile generated
>> there includes everything *except* for the specific type of mapping we
>> want to ignore (thus the "non_*" names).  And the result of this test
>> is that GDB cannot disassemble a function without a binary, even if all
>> the file-backed pages have been dumped.
>
> Now I'm confused.  If all the file-backed pages have been dumped,
> then isn't the .text present in the core dump?  If that doesn't work,
> we've just caught a bug somewhere.

The bug was my understanding of the disassemble command :-).

>> 
>> Having said that, I made a test with git HEAD without my patch.  I
>> generated a corefile for the same test program, and then loaded only the
>> corefile:
>> 
>>   $ ./gdb -q -ex 'core ./core.31118' -ex 'disas 0x4007cb'
>>   ...
>>   Program terminated with signal SIGTRAP, Trace/breakpoint trap.
>>   #0  0x0000000000400905 in ?? ()
>>   No function contains specified address.
>>   (gdb) 
>> 
>> Which means that, even without my patch, GDB still cannot disassemble a
>> function without the binary.
>
> Oh, without symbols, you need to tell "disassemble" an address range
> to disassemble, not just an address.  Like, "disassemble 0x4007cb, +10".
> Otherwise that fails even before a memory read was ever attempted, while
> gdb was looking for the function's boundaries.

Thanks, this makes sense.  Therefore, my testcase is wrong, because it
is actually testing whether the disassemble command identifies this case
and reacts accordingly, instead of disassembling main...

> I tried poking at coredump_filter now, and I'm actually seeing
> the opposite.  I can always disassemble `main'.
>
> $ gdb segv -c core.22587
> ...
> Core was generated by `./segv'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00000000004004a5 in main () at segv.c:5
> 5         *(volatile int *)0;
> (gdb) x /i $pc
> => 0x4004a5 <main+9>:   mov    (%rax),%eax
>
> $ gdb -c core.22587
> ...
> (gdb) x /i $pc
> => 0x4004a5:    mov    (%rax),%eax
>
> The reason that works is that `main' happens to end up in the
> first page of the text segment, and that one ends up always
> dumped, as the dynamic loader touches it...

This is strange.  It should not really matter whether the dynamic loader
touches it or not, because GDB with my patch (or the Linux kernel,
AFAIK) only cares about whether the page should be dumped according to
coredump_filter.

> I do see that kernel generated cores do get bigger if I set
> file-backed bits in coredump_filter:
>
> $ ls -s --hu  core.*
> 2.3M core.22528  112K core.22587
>
> Bah, can't immediately think of a portable way to test this now.

So, I did more investigation on this.  Here is what I found.

The Linux kernel uses bit 4 of the coredump_filter file to determine
whether it should dump ELF headers or not.  According to vma_dump_size:

  /*
   * If this looks like the beginning of a DSO or executable mapping,
   * check for an ELF header.  If we find one, dump the first page to
   * aid in determining what was mapped here.
   */
  if (FILTER(ELF_HEADERS) &&
      vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) {
          u32 __user *header = (u32 __user *) vma->vm_start;
          u32 word;
          mm_segment_t fs = get_fs();
          /*
           * Doing it this way gets the constant folded by GCC.
           */
          union {
                  u32 cmp;
                  char elfmag[SELFMAG];
          } magic;
          BUILD_BUG_ON(SELFMAG != sizeof word);
          magic.elfmag[EI_MAG0] = ELFMAG0;
          magic.elfmag[EI_MAG1] = ELFMAG1;
          magic.elfmag[EI_MAG2] = ELFMAG2;
          magic.elfmag[EI_MAG3] = ELFMAG3;
          /*
           * Switch to the user "segment" for get_user(),
           * then put back what elf_core_dump() had in place.
           */
          set_fs(USER_DS);
          if (unlikely(get_user(word, header)))
                  word = 0;
          set_fs(fs);
          if (word == magic.cmp)
                  return PAGE_SIZE;
  }

So maybe this is what you meant above by "that one ends up always
dumped...", when referring to the first page of the text segment?  Well,
that is partially true: if you unset bit 4, you will see that this page
does not get dumped at all (and therefore we see the "Cannot access
memory..." error; I did some experiments here and confirmed that).
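To make the bit numbering concrete, here is a minimal sketch that decodes a
coredump_filter value into the mapping types it selects.  The bit
assignments follow the core(5) man page; the names COREDUMP_FILTER_BITS,
decode_coredump_filter and dumps_elf_headers are made up for this
illustration and are not part of GDB or the kernel.

```python
# Bit assignments as documented in core(5).  The dictionary and function
# names below are hypothetical, chosen only for this illustration.
COREDUMP_FILTER_BITS = {
    0: "anonymous private",
    1: "anonymous shared",
    2: "file-backed private",
    3: "file-backed shared",
    4: "ELF headers",
    5: "private huge pages",
    6: "shared huge pages",
}


def decode_coredump_filter(value):
    """Return the descriptions of the bits set in a coredump_filter value."""
    return [name for bit, name in sorted(COREDUMP_FILTER_BITS.items())
            if value & (1 << bit)]


def dumps_elf_headers(value):
    """True if bit 4 is set, i.e. the kernel may dump the ELF header page."""
    return bool(value & (1 << 4))
```

For example, decode_coredump_filter(0x33) lists the anonymous private,
anonymous shared, ELF headers and private huge pages entries, while
dumps_elf_headers(0x23) is false because bit 4 is cleared.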

GDB does not honor bit 4, so it depends solely on the file-backed pages
being dumped in order to be able to disassemble things.  And while doing
the tests with my patch, I noticed that it is not always doing the right
thing with anonymous and file-backed mappings (argh).  Sometimes, it is
dumping file-backed private mappings even when I tell it not to do that,
and the reason is this:

  00400000-00401000 r-xp 00000000 fd:03 10914398 /path/to/file
  Size:                  4 kB
  Rss:                   4 kB
  Pss:                   4 kB
  Shared_Clean:          0 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:         4 kB
  Referenced:            4 kB
  Anonymous:             4 kB
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  AnonHugePages:         0 kB
  Swap:                  0 kB
  KernelPageSize:        4 kB
  MMUPageSize:           4 kB
  Locked:                0 kB
  VmFlags: rd ex mr mw me dw sd 

This is the .text segment of the test program.  It is a file-backed
private mapping, *but* it also contains anonymous contents.  For this
reason, my patch is considering this mapping as anonymous, because the
Linux kernel kind of does the same thing (again, from vma_dump_size):

  /* Dump segments that have been written to.  */
  if (vma->anon_vma && FILTER(ANON_PRIVATE))
          goto whole;

However, if we look below in the code:

  if (vma->vm_file == NULL)
          return 0;

  if (FILTER(MAPPED_PRIVATE))
          goto whole;

Therefore, it *also* considers the case when the mapping is file-backed
private (which my patch doesn't do).

All this boils down to: my patch is incorrectly dumping the .text
segment when I ask it not to do that (i.e., when I ask it to ignore
file-backed private mappings and to dump anonymous private mappings),
and it is *not* dumping the .text segment when I ask it to dump it
(i.e., when I ask it to dump file-backed private mappings and to ignore
anonymous private mappings).
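The order of the two kernel checks quoted above can be modelled with a
small sketch.  It covers only those two fragments for a private mapping,
not the rest of vma_dump_size (shared mappings, huge pages, the ELF
header check, and so on), and the function and parameter names are
hypothetical.

```python
ANON_PRIVATE = 1 << 0    # coredump_filter bit 0
MAPPED_PRIVATE = 1 << 2  # coredump_filter bit 2


def dumps_whole_mapping(filter_bits, has_anon_pages, is_file_backed):
    """Simplified model of the two quoted checks for a private mapping.

    has_anon_pages stands in for "vma->anon_vma != NULL" and
    is_file_backed for "vma->vm_file != NULL".
    """
    # if (vma->anon_vma && FILTER(ANON_PRIVATE)) goto whole;
    if has_anon_pages and filter_bits & ANON_PRIVATE:
        return True
    # if (vma->vm_file == NULL) return 0;
    if not is_file_backed:
        return False
    # if (FILTER(MAPPED_PRIVATE)) goto whole;
    return bool(filter_bits & MAPPED_PRIVATE)
```

Under this model, a file-backed private mapping that has been written to
(like the .text segment shown earlier) is dumped when either bit 0 or
bit 2 is set, and skipped only when both are cleared.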

So, here's what I propose: I will rework this part of the patch and try
to come up with a better way of identifying these situations (mainly:
when a file-backed mapping has anonymous contents), and I will resubmit
it tomorrow.  Along with that, I should be able to extend the testcase
to cover the disassemble case (and it should start to work fine once I
make those adjustments).
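As a rough illustration of what "a file-backed mapping with anonymous
contents" looks like in /proc/PID/smaps, the sketch below classifies a
single smaps entry.  It is not how the reworked patch will necessarily
do it: treating a non-zero inode as "file-backed" is an assumption made
for this example, and classify_smaps_entry and SAMPLE_SMAPS_ENTRY are
hypothetical names.  The sample is an abbreviated copy of the entry
shown earlier.

```python
import re

# Abbreviated copy of the .text entry shown earlier in this e-mail.
SAMPLE_SMAPS_ENTRY = """\
00400000-00401000 r-xp 00000000 fd:03 10914398 /path/to/file
Size:                  4 kB
Rss:                   4 kB
Anonymous:             4 kB
Swap:                  0 kB
"""

# start-end perms offset dev inode [path]
HEADER_RE = re.compile(
    r"^[0-9a-f]+-[0-9a-f]+\s+(?P<perms>\S{4})\s+[0-9a-f]+\s+\S+\s+"
    r"(?P<inode>\d+)\s*(?P<path>.*)$")


def classify_smaps_entry(text):
    """Classify one smaps entry (header line plus its "Key: N kB" lines)."""
    lines = text.strip().splitlines()
    header = HEADER_RE.match(lines[0])
    if header is None:
        raise ValueError("unrecognized smaps header: %r" % lines[0])
    fields = {}
    for line in lines[1:]:
        key, _, rest = line.partition(":")
        parts = rest.split()
        # Keep only the "N kB" size fields; skip lines such as VmFlags.
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return {
        "private": header.group("perms")[3] == "p",
        # Assumption: a non-zero inode means the mapping has a backing file.
        "file_backed": header.group("inode") != "0",
        "has_anonymous": fields.get("Anonymous", 0) > 0,
    }
```

For the sample entry all three flags come out true, which is exactly the
combination that needs special handling.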

Phew!  What a confusion...  :-/.  I hope things are clearer with this
e-mail.

WDYT?

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

