30541 – Add target valgrind

Bug 30541 - Add target valgrind

Summary: Add target valgrind

Status:	NEW

Alias:	None

Product:	gdb
Classification:	Unclassified
Component:	gdb (show other bugs)
Version:	unknown

Importance:	P2 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2023-06-12 13:14 UTC by Mark Wielaard
Modified:	2023-08-25 15:50 UTC (History)
CC List:	6 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:

Attachments
sample patch (1.58 KB, patch) 2023-07-10 21:36 UTC, Tom Tromey	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Mark Wielaard 2023-06-12 13:14:36 UTC

Since valgrind 3.21.0 the valgrind gdbserver (or actually vgdb) has a --multi mode that enables the extended remote protocol. This simplifies the usage of valgrind with gdb, but still leaves the user having to setup a few things that gdb could do if it detects/knows the remote target is valgrind.

"target valgrind" would/should do the following:

- set sysroot /
- set remote exec-file $binary (how to know what the local binary file is?)
- target extended-remote | vgdb --multi
- enable stdio/out "sharing" (currently being "eaten" by remote protocol packets using |)
  https://sourceware.org/bugzilla/show_bug.cgi?id=28916

Comment 1 Tom Tromey 2023-06-13 17:01:18 UTC

I wonder if we could have a new "target extended-remote-local" that
knows that the target sysroot is just "/".  Then "target valgrind"
could just be a small wrapper around this.

Comment 2 Tom Tromey 2023-06-13 17:03:58 UTC

BTW the "local remote" thing would also help with replacing remote-sim,
see bug#16442.

Comment 3 Pedro Alves 2023-06-15 11:22:59 UTC

Is there a reason "target valgrind" needs to be built-in to gdb?

You can define custom "target foo" commands, like:

 (gdb) define target valgrind
 Type commands for definition of "target foobar".
 End with a line saying just "end".
 >p 1
 >p 2
 >p 3
 >end
 (gdb) target valgrind
 $1 = 1
 $2 = 2
 $3 = 3
 (gdb) 

... so that could go in some script put in the system gdbinit dir.

(IIRC, back in my day, CodeSourcery was using this feature for some "target foo" which I don't recall which it was.)

Comment 4 Tom Tromey 2023-06-15 12:54:36 UTC

(In reply to Pedro Alves from comment #3)
> Is there a reason "target valgrind" needs to be built-in to gdb?

No, in fact it shouldn't be, I think.
However gdb might need to provide some help to make it work nicely.
For example valgrind's implementation could set sysroot, but that's
unfriendly to other targets.

Comment 5 Mark Wielaard 2023-06-15 13:13:37 UTC

(In reply to Pedro Alves from comment #3)
> Is there a reason "target valgrind" needs to be built-in to gdb?

Paul (who doesn't seem to have a sourceware bugzilla account) tried to create a user command for it and could completely make it work. In particular I believe getting/setting the correct remote exec-file seemed tricky and we would like the target to take arguments which are then added to the remote program invocation.

> ... so that could go in some script put in the system gdbinit dir.

Which dir is that and will that be autoloaded then? Another problem was that it cannot be a normal autoload gdb script since those aren't loaded early enough. Otherwise it could have been part of valgrind-monitor.py.

See https://sourceforge.net/p/valgrind/mailman/message/37804321/

Comment 6 Pedro Alves 2023-06-15 13:31:15 UTC

> Which dir is that and will that be autoloaded then?

From gdb/configure --help:

  --with-system-gdbinit-dir=PATH
                          automatically load system-wide gdbinit files from
                          this directory

See git ed2a22295102.

Comment 7 Paul Floyd 2023-06-15 15:40:59 UTC

(In reply to Mark Wielaard from comment #5)
> (In reply to Pedro Alves from comment #3)
> > Is there a reason "target valgrind" needs to be built-in to gdb?
> 
> Paul (who doesn't seem to have a sourceware bugzilla account) tried to
> create a user command for it and could completely make it work. In
> particular I believe getting/setting the correct remote exec-file seemed
> tricky and we would like the target to take arguments which are then added
> to the remote program invocation.

> See https://sourceforge.net/p/valgrind/mailman/message/37804321/

No account - fixed.

What I was trying to do was just

gdb myexe
(gdb) start_vg [list of Valgrind arguments]

As described in the e-mail above, there were two things that I found difficult

1. Getting the exe name, which I solved with help from SO
2. Passing the commands on to valgrind - user-defined commands don't have a $@ or $* like shells as far as I know

I guess this would be easier in Python but I'm a bit useless at scripting.

Comment 8 Pedro Alves 2023-06-15 17:04:26 UTC

> user-defined commands don't have a $@ or $* like shells as far as I know

Yeah.  There's $argc, and then $arg0..$argN, though.  So you can loop over the arguments and build the valgrind command that way.  You'd probably use Python to actually concatenate the strings.

> I guess this would be easier in Python but I'm a bit useless at scripting.

I guess you'd start with something like this:

$ cat target-valgrind.py
import gdb

class TargetValgrind(gdb.Command):
    def __init__(self):
        super(TargetValgrind, self).__init__("target valgrind", gdb.COMMAND_USER)

    def invoke(self, arg, from_tty):
        gdb.execute("target extended-remote | vgdb --multi " + " ".join(arg.split()))

TargetValgrind()

Source that in gdb, and then enter "(gdb) target valgrind foo bar"

Comment 9 Tom Tromey 2023-06-20 17:54:46 UTC

Maybe settings like sysroot should be per-target.
That would enable complicated multi-target debug scenarios
with multiple sysroots, and would solve this problem as well.

Comment 10 Tom Tromey 2023-07-06 15:55:01 UTC

(In reply to Tom Tromey from comment #9)
> Maybe settings like sysroot should be per-target.
> That would enable complicated multi-target debug scenarios
> with multiple sysroots, and would solve this problem as well.

I found this code in remote.c today:

bool
remote_target::filesystem_is_local ()
{
  /* Valgrind GDB presents itself as a remote target but works
     on the local filesystem: it does not implement remote get
     and users are not expected to set a sysroot.  To handle
     this case we treat the remote filesystem as local if the
     sysroot is exactly TARGET_SYSROOT_PREFIX and if the stub
     does not support vFile:open.  */
...

While this seems mildly questionable as a heuristic, at the
same time it may mean that there's nothing to do here for the
time being.

Comment 11 Mark Wielaard 2023-07-06 16:44:04 UTC

(In reply to Tom Tromey from comment #10)
> (In reply to Tom Tromey from comment #9)
> > Maybe settings like sysroot should be per-target.
> > That would enable complicated multi-target debug scenarios
> > with multiple sysroots, and would solve this problem as well.
> 
> I found this code in remote.c today:
> 
> bool
> remote_target::filesystem_is_local ()
> {
>   /* Valgrind GDB presents itself as a remote target but works
>      on the local filesystem: it does not implement remote get
>      and users are not expected to set a sysroot.  To handle
>      this case we treat the remote filesystem as local if the
>      sysroot is exactly TARGET_SYSROOT_PREFIX and if the stub
>      does not support vFile:open.  */
> ...
> 
> While this seems mildly questionable as a heuristic, at the
> same time it may mean that there's nothing to do here for the
> time being.

Interesting. So that works if you don't set a sysroot so that it is just "target:". But it does this vFile:open probe which always produces:

warning: remote target does not support file transfer, attempting to access files from local filesystem.

Which makes it look like something is wrong. We really like to make the gdb/valgrind integration as clean as possible without the user getting all kinds of odd warnings.

Comment 12 Tom Tromey 2023-07-10 21:36:20 UTC

Created attachment 14959 [details]
sample patch

"target local-server" is maybe pretty easy to implement...
I didn't test this, want to give it a try?

Comment 13 Pedro Alves 2023-07-11 08:18:41 UTC

It'd be great if we could make GDB auto-detect this.  Like, make it check whether host and target are the same machine.  For instance, by comparing /etc/machine-id, or other sources.

Comment 14 Pedro Alves 2023-07-11 08:24:53 UTC

It just occured to me that we could also teach "target remote" about command line options, like:

  target remote -foo on|off -bar -qux HOST

Comment 15 Pedro Alves 2023-07-11 08:26:57 UTC

(to be clear, that would let us do "target remote -local HOST", for example.)

Comment 16 Tom Tromey 2023-07-11 12:23:13 UTC

I didn't even know about machine-id.
That would be great except the man page says we have to
cryptographically hash it before sending it over the
network... ugh.

Comment 17 Pedro Alves 2023-07-11 17:27:51 UTC

A simple alternative is for GDB to write a random hash to a mktemp file, and then target_fileio_open("target:/tmp/$that-tmp-file"), and then:

 - if the file opens succesfully, and,
 - the random string is the same as what gdb put in the file, then,

the system is the same, and we can stop using target_fileio and open files directly.

Well, at least the /tmp mount is the same, but, that seems like a good proxy, isn't it?

Comment 18 Pedro Alves 2023-07-11 17:29:50 UTC

(of course, wouldn't help valgrind since it doesn't implement vFile:open.  boooo.)

Comment 19 Tom Tromey 2023-07-11 18:46:03 UTC

Whatever we do we'll need a new request on the valgrind side.
If it implemented vFile you could use "target remote | ssh mumble valgrind"

I think the machine-id is documented that way for privacy, not 
security -- there's no consequence if someone finds out your
machine-id.  So, we could maybe just ignore that text.

Or, we could use an existing thing like the CRC that gdbserver
already uses, or SHA1.  That seems like it would also be fine.

The main thing is we have to document the procedure somehow.

Not sure what to use on Windows.

I'm mildly worried that there is some scenario where the ids
clash by accident, or where /tmp is shared but nothing else is.
May not really be worth worrying about.

Comment 20 Pedro Alves 2023-07-11 19:44:11 UTC

> If it implemented vFile you could use "target remote | ssh mumble valgrind"

Yeah.

> The main thing is we have to document the procedure somehow.

If we rely on vFile without any obfuscation, it becomes all GDB implementation detail, because it's just:

GDB does target_fileio_open("target:/etc/machine-id") + target_fileio_read, and compares contents with its own /etc/machine-id.

Very similar to the /tmp/ idea.

The /tmp idea would work on Windows too, I think.

Maybe we could also try target_fileio_stat("target:/path/to/valgrind") and compare to 
stat("/path/to/valgrind")

Maybe we could even do both, try /etc/machine-id, and fallback to /tmp.  And then to stat.  And whatever other file-based fingerprinting method we come up with to try to rule out partially-shared filesystems.

If we need obfuscation, then we could I guess add a new TARGET_OBJECT_MACHINE_ID, and qXfer:read:machine-id packet to go along with it, and document what that means for the different OSs.

Comment 21 Pedro Alves 2023-07-11 19:50:08 UTC

Just for the record:

> to be clear, that would let us do "target remote -local HOST", for example.)

I think this would be a better approach than a new "target local-server".  We use to have more variants of "target remote" and got them down to two "target remote" + "target extended-remote", and that's one too many still.

Also since remote options are per-connection nowadays, we could have a corresponding "set remote local-filesystem" option, like we have matching "set print foo", and "print -foo".  The set option would also let you do "with remote local-filesystem -- target remote | valgrind".

Comment 22 Mark Wielaard 2023-07-11 21:39:39 UTC

(In reply to Pedro Alves from comment #21)
> Just for the record:
> 
> > to be clear, that would let us do "target remote -local HOST", for example.)
> 
> I think this would be a better approach than a new "target local-server". 
> We use to have more variants of "target remote" and got them down to two
> "target remote" + "target extended-remote", and that's one too many still.
> 
> Also since remote options are per-connection nowadays, we could have a
> corresponding "set remote local-filesystem" option, like we have matching
> "set print foo", and "print -foo".  The set option would also let you do
> "with remote local-filesystem -- target remote | valgrind".

Note that there are a couple more "remote local" issues beside the filesystem.

There is the environment variables, currently we rely on target extended-remote | vgdb --multi started the vgdb process with the same environment that the current gdb process has (important for example for PATH or LD_LIBRARY_PATH settings).
And there is the current working directory, which we also rely on getting inherited by being launched by gdb.

For both these issues things stop working when we launch things through a socket, even though there are gdb remote protocol packets to set them up like gdb has them. It would be nice to have a way to request and/or explicitly set both the environment and cwd as gdb knows them for a "remote local" setup.

Comment 23 Pedro Alves 2023-07-12 09:05:31 UTC

> For both these issues things stop working when we launch things through a socket, even though there are gdb remote protocol packets to set them up like gdb has them. 
> It would be nice to have a way to request and/or explicitly set both the environment and cwd as gdb knows them for a "remote local" setup.

Can you expand a bit more on how you'd see that working, on what is the use case for connecting via socket like?  Is that about users spawning vgdb themselves, or there being some central vgdb service or the like?

It just seems like setting up environment and cwd could be done by a wrapper script or whatever invokes vgdb.

Comment 24 Pedro Alves 2023-07-12 10:26:39 UTC

I just found out that flatpak bind mounts /etc/machine-id in containers:

  https://github.com/flatpak/flatpak/issues/4311

so maybe not ideal to infer whether the whole filesystem is shared...

I learned from that issue about /proc/sys/kernel/random/boot_id: 

  "a UUID that uniquely identifies a run of the system 
   (it is reset to a new random value when you reboot)."

Sounds like that one would be better.

Comment 25 Pedro Alves 2023-07-12 10:29:46 UTC

Bah, I keep having to correct myself.  I had done a bogus quick test without realizing.

> Sounds like that one would be better.

No, it wouldn't, /proc is mounted in the container, so /proc/sys/kernel/random/boot_id has the same contents in and out of the container, while rest of filesystem is restricted/different inside the container.  (confirmed with "flatpak enter" and poking at things.)

Comment 26 Mark Wielaard 2023-07-13 13:38:00 UTC

(In reply to Pedro Alves from comment #18)
> (of course, wouldn't help valgrind since it doesn't implement vFile:open. 
> boooo.)

It wouldn't be too hard to implement of course. But currently gdb seems to rely on Valgrind not having it so it can detect whether it is speaking to vgdb or not. See gdb/remote.c (remote_target::filesystem_is_local).

It isn't a great detection though, since it produces a warning that imho confuses the user.

But if you think using vFile:open will make a good interface for detecting whether the local filesystem == remote filesystem we can certainly implement it.

Comment 27 Mark Wielaard 2023-07-13 13:59:34 UTC

(In reply to Pedro Alves from comment #23)
> > For both these issues things stop working when we launch things through a socket, even though there are gdb remote protocol packets to set them up like gdb has them. 
> > It would be nice to have a way to request and/or explicitly set both the environment and cwd as gdb knows them for a "remote local" setup.
> 
> Can you expand a bit more on how you'd see that working, on what is the use
> case for connecting via socket like?  Is that about users spawning vgdb
> themselves, or there being some central vgdb service or the like?

We prefer vgdb being launched from within gdb (target extended-remote | vgdb --multi) and get the environment/cwd inherited that way.

But currently that means vgdb will communicate through stdin/out with gdb, so stdin/out aren't available for the inferior. This is being worked on by Sasha for https://sourceware.org/bugzilla/show_bug.cgi?id=28916 at https://git.sr.ht/~sasshka/binutils-gdb/log/split_fd but we aren't there yet (switching the file descriptors used for communication seems to work, but the whole terminal handling is a little complex).

So for programs that currently need to work with "normal" stdin/out you'll have to run in a separate terminal with vgdb --multi --port 5555 and then in gdb use terget extended-remote :5555

> It just seems like setting up environment and cwd could be done by a wrapper
> script or whatever invokes vgdb.

But who or what would create such a wrapper script?

Why not simply sent the gdb environment and cwd through QSetWorkingDir and QEnvironment* when we can detect the remote would like the environment to be like the gdb environment?

Comment 28 Tom Tromey 2023-07-13 14:14:16 UTC

I may not really understand the environment issue.

Using Q* to set the environment and the working directory seems
clearly better to me, and in fact the only way it can work
properly.  The problem case is something like:

(gdb) set env X y
(gdb) start
(gdb) add-inferior blah blah
(gdb) set env X z
(gdb) start

Here, we've made 2 inferiors and started them with different environments.
The native target can simply pass this stuff to the inferior via
ordinary setenv stuff.  For remotes, though, the inheritance can
only be done once, other inferiors are not fork'd by gdb.

If Q* aren't working on the gdb side, let's make a new bug and
make it block this one.  If they aren't working on the valgrind side,
I suggest a valgrind bug instead.

A bonus of implementing vFile and Q* in valgrind is that then
"target remote | ssh machine vgdb" will work, which seems nice.

Comment 29 Mark Wielaard 2023-07-13 14:26:58 UTC

(In reply to Tom Tromey from comment #28)
> I may not really understand the environment issue.
> 
> Using Q* to set the environment and the working directory seems
> clearly better to me, and in fact the only way it can work
> properly.  The problem case is something like:
> 
> (gdb) set env X y
> (gdb) start
> (gdb) add-inferior blah blah
> (gdb) set env X z
> (gdb) start
> 
> Here, we've made 2 inferiors and started them with different environments.
> The native target can simply pass this stuff to the inferior via
> ordinary setenv stuff.  For remotes, though, the inheritance can
> only be done once, other inferiors are not fork'd by gdb.

Note that vgdb doesn't implement multi-process mode. But it can easily start different processes (one after the other) with different environments. But I might not understand add-inferior correctly.

> If Q* aren't working on the gdb side, let's make a new bug and
> make it block this one.  If they aren't working on the valgrind side,
> I suggest a valgrind bug instead.

We have QSetWorkingDir working and started implementing QEnvironment*, but there is no way to ask gdb to sent them. The user has to explicit replicate the whole environment with gdb commands before gdb sents them to the remote. But maybe there is some way to request gdb to sent them explicitly?

> A bonus of implementing vFile and Q* in valgrind is that then
> "target remote | ssh machine vgdb" will work, which seems nice.

In target remote mode it is kind of hard to make it work transparently, because then gdb/vgdb expect an already setup valgrind running on that machine. But with target extended-remote | ssh machine vgdb --multi, yeah, that would be nice.

Comment 30 Pedro Alves 2023-07-13 16:35:10 UTC

> Using Q* to set the environment and the working directory seems
> clearly better to me, and in fact the only way it can work
> properly.  The problem case is something like:
>
> (gdb) set env X y
> (gdb) start
> (gdb) add-inferior blah blah
> (gdb) set env X z
> (gdb) start

GDB only sends the env vars that were set/cleared with "set/unset env".  It does not send the whole environment that GDB has.  Same for the working directory.  If you user hasn't done "set cwd" explicitly, then GDB will tell the server to use its default cwd.

That's why I was suggesting to just start vgdb with the environment you want, like if done by "target valgrind", then it would already inherit gdb's environment.  If started via "ssh server vgdb", then it wouldn't make sense to send the whole of gdb's environment.  So that's why I was asking what would be the use case, if users aren't ideally going to start vgdb themselves.

Comment 31 Pedro Alves 2023-07-13 16:38:31 UTC

> But if you think using vFile:open will make a good interface for detecting whether 
> the local filesystem == remote filesystem we can certainly implement it.

The suggestion to autodetect was actually more with normal gdbserver debugging in mind.  For valgrind, it would seem fine to me to have "target valgrind" pass "-local-filesystem" or whatever to explicitly tell GDB about it, no heuristics.

Comment 32 Pedro Alves 2023-07-13 17:43:41 UTC

> We prefer vgdb being launched from within gdb (target extended-remote | vgdb --multi) 
> and get the environment/cwd inherited that way.

> But currently that means vgdb will communicate through stdin/out with gdb, so 
> stdin/out aren't available for the inferior. This is being worked on by Sasha for 
> https://sourceware.org/bugzilla/show_bug.cgi?id=28916 at 
> https://git.sr.ht/~sasshka/binutils-gdb/log/split_fd but we aren't there yet 
> (switching the file descriptors used for communication seems to work, but the whole 
> terminal handling is a little complex).

Not sure if I fully understood that approach.

But, wouldn't it be simpler to teach GDB an alternative to "target extended-remote | CMD".

Like:

  target extended-remote -server-cmd "vgdb --multi :9999" :9999

"-server-cmd CMD" would spawn CMD, and GDB would wait/reap it like it does for "target extended-remote | CMD".

You could replace ":9999" with a unix domain socket, created by the "target valgrind" command, for example.

That would leave stdin/stdout/stderr to be used by gdbserver as normal.

Terminal handling, for getting input into the inferior, yeah, that is a little complex. You have to put vgdb in the foreground whenever gdb wants to give the terminal to the inferior.

Comment 33 Mark Wielaard 2023-07-14 15:15:51 UTC

(In reply to Pedro Alves from comment #32)
> > We prefer vgdb being launched from within gdb (target extended-remote | vgdb --multi) 
> > and get the environment/cwd inherited that way.
> 
> > But currently that means vgdb will communicate through stdin/out with gdb, so 
> > stdin/out aren't available for the inferior. This is being worked on by Sasha for 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=28916 at 
> > https://git.sr.ht/~sasshka/binutils-gdb/log/split_fd but we aren't there yet 
> > (switching the file descriptors used for communication seems to work, but the whole 
> > terminal handling is a little complex).
> 
> Not sure if I fully understood that approach.

The basic idea is that gdb would dup the normal stdin/out/err and let the gdbserver/vgdb inherit those file descriptors. Then gdb will sent a new fdSwitch packet to notify that gdbserver/gdb should switch around its own stdin/out/err with those file descriptors (and make any inferiors started use those) and move the original stdin/out file descriptors so packets aren't sent anymore over stdin/out.

> But, wouldn't it be simpler to teach GDB an alternative to "target
> extended-remote | CMD".
> 
> Like:
> 
>   target extended-remote -server-cmd "vgdb --multi :9999" :9999
> 
> "-server-cmd CMD" would spawn CMD, and GDB would wait/reap it like it does
> for "target extended-remote | CMD".
> 
> You could replace ":9999" with a unix domain socket, created by the "target
> valgrind" command, for example.
> 
> That would leave stdin/stdout/stderr to be used by gdbserver as normal.

Andrew also suggested something like that (replace | with some other magic char) to get that effect. Maybe
 
> Terminal handling, for getting input into the inferior, yeah, that is a
> little complex. You have to put vgdb in the foreground whenever gdb wants to
> give the terminal to the inferior.

Maybe this part of the conversation should move to bug #28916
I think I am a little naive and was just hoping having a shared stdin between gdb/gdbserver/inferior would be enough to get the inferior normal reading from stdin. I didn't realize the terminal needed to know who is reading. But I already noticed it totally messes up ^C handling... hohum.

Having stdout work would already be an improvement. gdbserver has a trick for that which vgdb should probably just add too (redirect stdout to stderr for the inferior).
https://bugs.kde.org/show_bug.cgi?id=471311

Comment 34 Andrew Burgess 2023-08-25 15:50:33 UTC

The following patch is relevant to this bug discussion:

https://inbox.sourceware.org/gdb-patches/cover.1692977354.git.aburgess@redhat.com/

This includes a mechanism by which GDB can figure out if its running on the same machine as gdbserver.  The v2 series I linked does include some limited support for spotting gdbserver running in a container, but I think there's more work I need to do in this area (but it will be Sep before I get back to this work now).

Anyway, just wanted to leave a cross-reference here for anyone that's interested.