GDB with PCIe device

Simon Marchi simon.marchi@polymtl.ca
Fri Jan 8 15:17:41 GMT 2021


On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote:
> Hello,
> 
> As per my understanding, gdb calls the ptrace system call, which in turn uses
> the kernel's implementation of architecture-specific actions (updating debug
> registers, reading context memory...) to set breakpoints, and so on.
> 
> But when running gdb with PCIe devices such as a GPU or FPGA, how are
> the hardware-specific actions done?
> 
> Should device drivers provide a ptrace-equivalent kernel implementation?
> 
> Could any of the gdb gurus shed some light on the software stacks used to
> debug software that runs on one of the mentioned PCIe devices?
> 
> Thanks in advance,
> 

One such gdb port that is in development is ROCm-GDB, by AMD:

  https://github.com/ROCm-Developer-Tools/ROCgdb

It uses a helper library to debug the GPU threads:

  https://github.com/ROCm-Developer-Tools/ROCdbgapi

I don't want to get too much into how this library works, because I'm
sure I'll say something wrong / misleading.  You can look at the code.
But I'm pretty sure the GPU isn't debugged through ptrace.  The library
does communicate with the kernel driver in some way, though.

So, GPU devices can use whatever debug interface they want, as long as a
corresponding target exists in GDB to communicate with it.

Today, a single GDB can communicate with multiple debugging targets, but
only with one target per inferior.  So you can be debugging a local
program in one inferior while debugging a remote program in another.
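
To make the "one target per inferior" constraint concrete, here is a
small conceptual model in Python.  This is only a sketch: the class and
method names (Debugger, Inferior, connect, ...) are made up for
illustration and are not GDB's internal names or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:
    """A debug target, e.g. the native Linux target or a remote connection."""
    name: str


@dataclass
class Inferior:
    """A debugged program; it can be bound to at most one target."""
    num: int
    target: Optional[Target] = None

    def connect(self, target: Target) -> None:
        # Hypothetical rule mirroring the text: one target per inferior.
        if self.target is not None:
            raise RuntimeError(
                f"inferior {self.num} is already connected to {self.target.name}"
            )
        self.target = target


@dataclass
class Debugger:
    """One debugger session holding several inferiors."""
    inferiors: List[Inferior] = field(default_factory=list)

    def add_inferior(self) -> Inferior:
        inf = Inferior(num=len(self.inferiors) + 1)
        self.inferiors.append(inf)
        return inf

    def target_names(self) -> List[str]:
        # Names of the targets currently in use, in inferior order.
        return [inf.target.name for inf in self.inferiors if inf.target]
```

In this model, two inferiors can each use a different target (say,
"native" and "remote") within the same session, but connecting a second
target to an already-connected inferior raises an error.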

In the GPU / coprocessor programming world, the model is often that you
run a program on the host, which spawns some threads on the GPU /
coprocessor.  From the point of view of the user, the threads on the host
and the threads on the GPU / coprocessor belong to the same program, so
they would ideally appear in the same inferior.  ROCm-GDB does this, but
it's still done in a slightly hackish way: the target that talks to the
GPU is installed in the "arch" stratum (this is GDB internal stuff) of
the inferior's target stack, where it intercepts the calls that would
otherwise go to the native Linux target.
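
As a rough illustration of that layering, here is a toy model in Python
of a target stack where a target in a higher stratum sits above the
native target, handles the GPU threads itself, and forwards everything
else to the target beneath it.  This is a simplified sketch under my own
assumptions, not GDB's or ROCm-GDB's actual code: the class names, thread
names and returned strings are all invented for the example.

```python
from enum import IntEnum


class Stratum(IntEnum):
    """Simplified strata; a higher value sits higher in the stack."""
    PROCESS = 1
    ARCH = 2


class Target:
    stratum: Stratum

    def __init__(self):
        self.beneath = None  # next target down the stack, if any

    def threads(self):
        # Default behaviour: delegate to the target beneath.
        return self.beneath.threads() if self.beneath else []

    def read_registers(self, thread):
        if self.beneath is None:
            raise LookupError(f"no target handles thread {thread}")
        return self.beneath.read_registers(thread)


class NativeLinuxTarget(Target):
    """Stands in for the ptrace-based native target (hypothetical data)."""
    stratum = Stratum.PROCESS

    def threads(self):
        return ["host-1", "host-2"]

    def read_registers(self, thread):
        return f"ptrace:{thread}"


class GpuTarget(Target):
    """Stands in for the target that talks to the GPU debug library."""
    stratum = Stratum.ARCH

    def threads(self):
        # Host threads from below, plus the GPU threads this target knows.
        return self.beneath.threads() + ["gpu-wave-1"]

    def read_registers(self, thread):
        if thread.startswith("gpu-"):
            return f"gpu-lib:{thread}"  # handled here, never reaches ptrace
        return self.beneath.read_registers(thread)


class TargetStack:
    """Keeps one target per stratum and links them from top to bottom."""

    def __init__(self):
        self._targets = {}

    def push(self, target):
        self._targets[target.stratum] = target
        below = None
        for stratum in sorted(self._targets):
            self._targets[stratum].beneath = below
            below = self._targets[stratum]

    @property
    def top(self):
        return self._targets[max(self._targets)]
```

With both targets pushed, every request enters at the top of the stack,
so the GPU target sees it first: host and GPU threads show up in one
list, and register reads are routed to either the GPU library or the
native target depending on the thread.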

The better long-term / general solution is probably to make GDB able to
connect to multiple debug targets for a single inferior.

Simon

