[PATCH] New testcase for PR tui/25126 (staled source cache)

Fri Feb 7 20:11:00 GMT 2020

On Friday, February 07 2020, I wrote:

> On Friday, February 07 2020, I wrote:
>
>> On Friday, February 07 2020, Andrew Burgess wrote:
>>> I'm not suggesting that you need to track down the cause of this
>>> issue, but I agree with Luis that we should avoid arbitrary short
>>> pauses.
>>>
>>> I think you could probably use gdb_get_line_number to solve this
>>> problem, something like this completely untested code:
>>>
>>>   # In some cases it has been observed that the file-system doesn't
>>>   # immediately reflect the rename.  Here we wait for the file to
>>>   # reflect the expected new contents.
>>>   proc wait_for_rename {} {
>>>       global srcfile
>>>       for { set i 0 } { $i < 5 } { incr i } {
>>>   	if { ![catch { gdb_get_line_number \
>>>                        "pattern only matching the new line" \
>>>                        ${srcfile} }] } {
>>>   	    return
>>>   	}
>>>   	sleep 1
>>>       }
>>>       error "file failed to rename correctly"
>>>   }
>>
>> Ah, cool.  I'll adjust that to the code.  Thank you.
>
> OK, after trying your code, I can say that the problem is not on TCL.
> wait_for_rename returns successfully, and I've checked that
> gdb_get_line_number returns the correct value for the line.  So, for
> TCL, the rename succeeded.
>
> Here's an interesting thing: I put a gdb_interact after the second "run"
> command, and then did:
>
>   (gdb) list
>   35        printf ("hello\n"); /* break-here */
>   (gdb) shell gdb.     
>   gdb.log  gdb.sum  
>   (gdb) shell outputs/gdb.base/cached-source-file/cached-source-file
>   foo
>   hello
>
> See how, for GDB, the inferior doesn't have the 'printf ("foo\n");'
> line, but when I run it externally I can see "foo" being printed?  This
> means that GCC compiled the correct file, but GDB did not load it again,
> somehow.
>
> I find it extremely interesting how putting a "sleep 1" after the rename
> magically solves this problem.  I would be less intrigued if we had to
> put "sleep 1" after "gdb_compile", because then it would hint at some
> race condition happening with GCC and GDB (very unlikely, but easier to
> understand).
>
> I didn't want to, but I guess I'll have to keep investigating this.
> Unless you (or someone) have any other ideas.

I think I found the issue.  In symfile.c:reread_symbols, the check that
decides whether the objfile on disk differs from the one already loaded
is based on calling 'stat' and comparing 'st_mtime':

    ...
      new_modtime = new_statbuf.st_mtime;
      if (new_modtime != objfile->mtime)
	{
	  printf_filtered (_("`%s' has changed; re-reading symbols.\n"),
			   objfile_name (objfile));
    ...

According to stat(2), 'st_mtime' is actually 'st_mtim.tv_sec', which
means this field only has one-second precision.  Since Linux 2.6,
'st_mtim' also carries nanoseconds (in 'tv_nsec'), but we still compare
only the seconds field.

Because the test script runs so fast, the old and the new files are
very likely to end up with the same 'st_mtime'.  Here's the output of
an 'fprintf' I put in the code:

    new_modtime = 1581105949, old_modtime = 1581105949

So yeah, we have two options here:

1) For now, I think it's justifiable to use "sleep 1" in the testcase,
to force 'st_mtime' to be different between the two files.

2) The GDB code could be modernized to use nanosecond precision, which
should solve this problem.
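
To make option 2 a bit more concrete, here is a rough, untested-in-GDB
sketch of what a nanosecond-aware comparison could look like.  Both
'timespec_differs' and 'file_mtime_changed' are hypothetical names I
made up for illustration; they are not existing GDB functions, and the
real change would also have to store a full 'struct timespec' in the
objfile instead of the current seconds-only 'mtime'.  It assumes a
POSIX.1-2008 'struct stat' with the 'st_mtim' member:

```c
#define _POSIX_C_SOURCE 200809L	/* Exposes st_mtim in struct stat.  */
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Return true if timestamps A and B differ in either the seconds or
   the nanoseconds field.  */

static bool
timespec_differs (const struct timespec *a, const struct timespec *b)
{
  return a->tv_sec != b->tv_sec || a->tv_nsec != b->tv_nsec;
}

/* Hypothetical helper (not GDB code): stat FILENAME and compare its
   modification time against *CACHED.  Return 1 if it changed, 0 if
   it is unchanged, and -1 if 'stat' failed.  */

static int
file_mtime_changed (const char *filename, const struct timespec *cached)
{
  struct stat st;

  if (stat (filename, &st) != 0)
    return -1;
  return timespec_differs (&st.st_mtim, cached) ? 1 : 0;
}
```

With this, two files written within the same second would still be
seen as different as long as their 'tv_nsec' values differ.  Note that
the effective timestamp granularity depends on the file system, so
this may not help everywhere, and the testcase might still need a
fallback.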

Thanks,

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/