exec-file-mismatch and native-gdbserver testing

Pedro Alves palves@redhat.com
Sun May 17 21:19:38 GMT 2020


On 5/17/20 9:11 PM, Philippe Waroquiers wrote:
> On Sun, 2020-05-17 at 20:50 +0100, Pedro Alves wrote:
>>> E.g. I am wondering if the below will be visible and cause
>>> an (understandable) warning/error/behaviour for the user:
>>> If the user has debugged a first process with orig_exe,
>>> then the user copied orig_exe to copy_orig_exe, and then GDB is
>>> attached to a process that runs copy_orig_exe, the user does not expect
>>> to have orig_exe protected/accessed anymore, and so might change it
>>> or remove it or ..., while GDB still uses orig_exe instead of copy_orig_exe.
>>
>> But this seems like a pretty benign problem?  But I'm not sure
>> I understood it.  What exactly goes wrong in this scenario?
> The user expects orig_exe to not be 'busy' anymore, and so
> expects to be able to freely modify it, without e.g. impacting
> the GDB session debugging the executable running copy_orig_exe.
> (I guess that orig_exe will not cause 'Text busy' error, as no
> process is still executing it from the kernel point of view).

Do you really see these "Text busy" errors nowadays?  I don't
think I ever saw those on GNU/Linux.

Still, I'm not seeing the same kind of problem that ending
up with the wrong binary loaded in GDB causes.  If you end
up with the wrong binary loaded in GDB, then GDB may
for example install breakpoints at the wrong addresses,
and that may even cause the inferior to crash, because the
breakpoint address may fall in the middle of instructions,
resulting in the inferior potentially executing invalid
instructions, or worse, executing valid instructions with
disastrous side effects.
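
To make the mid-instruction hazard concrete, here is a toy Python
sketch.  It is not GDB code: the instruction lengths are made up, and
the names (instruction_starts, breakpoint_is_safe) are hypothetical.
It only shows why an address taken from the wrong binary may not be
an instruction boundary in the binary the process is really running.

```python
def instruction_starts(lengths):
    """Return the set of offsets at which each instruction begins."""
    starts, offset = set(), 0
    for n in lengths:
        starts.add(offset)
        offset += n
    return starts


def breakpoint_is_safe(addr, lengths):
    """True if addr is an instruction boundary in the running code."""
    return addr in instruction_starts(lengths)


# Hypothetical layout of the code the process is actually executing:
# instructions of 1, 3, 5, 2 and 1 bytes, so boundaries at 0, 1, 4, 9, 11.
running_lengths = [1, 3, 5, 2, 1]

# An address computed from the right binary lands on a boundary.
print(breakpoint_is_safe(4, running_lengths))
# An address computed from an unrelated binary may land inside the
# 5-byte instruction that spans offsets 4..8.
print(breakpoint_is_safe(6, running_lengths))
```

A breakpoint written at offset 6 would overwrite a byte in the
middle of that instruction, which is the case described above.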

The type of problem you're describing seems more like an
annoyance, which will be detected some other way ("Text busy"
or some other side effect), and the user can still fix it,
with e.g., the "file" command.

> 
>>
>>> So, I was wondering if such a case of equal build ID
>>> but different (local?) file names is not worth a warning.
>>
>> IMO it isn't, because it is very common to have different
>> filenames (if you consider the whole path) for the executable
>> loaded in gdb compared to the executable that the process is
>> running when you consider remote debugging.
>>
>>>> I'm thinking, if we support build ID validation, do we really want
>>>> to fall back to filename validation?  It seems to me that it causes
>>>> more false positives than desirable.
>>> You mean that the filename comparison is useless (or even harmful)
>>> if we found the build ID in the files?
>>> Effectively, if build IDs are different but filenames are equal,
>>> that is likely a false positive 'files are matching'
>>> (only possible in remote debugging setup I suppose).
>>
>> No, I mean, let's consider the feature from scratch again.
>> I'm saying that IMHO filename comparison on its own is pretty
>> weak and annoyingly chatty.  I'd think e.g., a basename
>> match + segments match (compare addresses and sizes
>> of text, data, etc, segments) would already be much better.
>> But that's a path that's been considered in all other scenarios
>> where we have to match binaries, and ultimately, build ID
>> was invented to fix this kind of scenario without heuristics,
>> because heuristics can always fail.  
>>
>> So given that we can do buildid matching, shouldn't we just forget
>> all other kinds of matching, and just stick with build id matching,
>> with no fallback?  I.e., add build id matching, remove the filename
>> matching, and raise the bar for any fallback matching -- as in if
>> you want some fallback, it has to be better than just filenames.
>>
>> IIRC, the main motivation for the feature is when you attach to
>> a process running bar, while you have foo (completely unrelated to bar)
>> loaded in gdb.  GDB previously would assume that foo is the symbol file
>> for bar, so it gladly continued debugging bar with the foo binary.
>> Buildid detects this, and also detects the scenario of attaching to
>> a process that is running an older version of bar than the version
>> you have loaded in gdb (because you rebuilt the program before 
>> attaching, for example).
>>
>> More contrived use cases can be imagined, but it seems to me like
>> if you want to catch them, then you're better off making sure your
>> binaries include build ids.  Which is true by default on modern
>> GNU/Linux OSs at least.
> At my work, objdump -h some_exe does not show a build ID, not clear
> why (RHEL 7.8, but using gold linker from Adacore gnatpro).
> 
> So, my main original use case needs filename comparison :(.

According to:

 https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/developer_guide/compiling-build-id

"Each executable or shared library built with Red Hat Enterprise Linux Server 6 or later is assigned a unique identification 160-bit SHA-1 string, generated as a checksum of selected parts of the binary. "

Maybe older gold versions didn't emit the build id by default, while
GNU ld did.  I tried it with master gold, and it emits the build id
by default.  Does explicitly specifying --build-id on the link work?
Since you're already not using the default tools, you could tweak
your build system to explicitly request a build id?
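
As a way to check whether a given binary really carries a build id,
here is a minimal Python sketch that scans raw ELF note data for the
GNU build-id note.  Assumptions: the caller has already extracted the
note bytes from the binary (that part is not shown), the data is
little-endian unless stated otherwise, and the function name
find_build_id is hypothetical.  The note type 3 (NT_GNU_BUILD_ID) and
the owner name "GNU" are the standard values for this note.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for the GNU build-id note


def _align4(n):
    """Round n up to the next multiple of 4 (note fields are padded)."""
    return (n + 3) & ~3


def find_build_id(notes, little_endian=True):
    """Scan raw ELF note data; return the build id as hex, or None."""
    fmt = "<III" if little_endian else ">III"
    off = 0
    while off + 12 <= len(notes):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, ntype = struct.unpack_from(fmt, notes, off)
        off += 12
        name = notes[off:off + namesz]
        off += _align4(namesz)
        desc = notes[off:off + descsz]
        off += _align4(descsz)
        if len(name) != namesz or len(desc) != descsz:
            return None  # truncated note data
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\0":
            return desc.hex()
    return None
```

If this returns None for the output of your link, that would match
what you are seeing, and the result can be cross-checked against
"readelf -n".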

> So, my main original use case needs filename comparison :(.

I think that doesn't follow -- you could say that the build id
isn't sufficient for you, and that you need a fallback, but 
that doesn't mean that the fallback must be the straight
full path filename comparison as it is today.
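
To illustrate what a stricter fallback could look like, here is a
rough Python sketch: compare build ids when both sides have one, and
otherwise require a basename match plus identical segment layout, as
suggested earlier in the thread.  This is not how GDB implements the
check; ExeInfo and exec_files_match are hypothetical names, and the
segment list is assumed to be (address, size) pairs.

```python
import os
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ExeInfo:
    path: str
    build_id: Optional[str]                 # hex string, or None if absent
    segments: Tuple[Tuple[int, int], ...]   # (address, size) pairs


def exec_files_match(loaded, running):
    """Decide whether the loaded file matches the running executable."""
    # Build ids are authoritative when both sides have one; paths are
    # ignored, since they routinely differ with remote debugging.
    if loaded.build_id is not None and running.build_id is not None:
        return loaded.build_id == running.build_id
    # Fallback: still a heuristic, but stricter than a full-path compare.
    same_base = (os.path.basename(loaded.path)
                 == os.path.basename(running.path))
    return same_base and loaded.segments == running.segments
```

With something like this, a rebuilt binary with the same name but a
different layout is still flagged even when no build id is present.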

Thanks,
Pedro Alves


