This is the mail archive of the frysk@sources.redhat.com mailing list for the frysk project.



Re: First try of breakpoint support


Quoting Mark Wielaard <mark@klomp.org>:

Yes, I also had some intermittent failures. That first try depended on
the timing of the ptrace calls being right. I have fixed that by
integrating the code with the proc and task state machines. That makes
sure that the various ptrace calls are made at the right time. This also
makes the code much faster.

Great! Nice to know that it is fixed.


I also have some thoughts about general breakpoint support.

- the kprobe-like mechanism to support multi-threaded breakpoints seems
to me a very good idea.  Some candidates I can think of for holding the
original instruction are: the .init section (it is not needed after the
program starts up), the ELF header or program headers (storing that
information somewhere else first, so there is no need to look into them
again), or an extra dynamically loaded library (through some kind of
dynamic code patching technique)

Yes, the only issue would be locking because we want the areas to be only used by one thread at a time. Ideally we would have some per Task section. Unfortunately we cannot use something like the per-thread-stack since those are not executable these days. Loading an extra dynamically loaded library is clever. But that might be pretty intrusive on the process we monitor.
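
The locking concern can be sketched roughly as follows. This is only an illustration in Python, not frysk code: `SlotPool`, the slot addresses and the task ids are all hypothetical, and the sketch assumes a small fixed set of executable scratch areas in the debuggee. It only models the bookkeeping and never touches a real process.

```python
class SlotPool:
    """Hands out scratch areas so that each is used by one task at a time.

    Hypothetical model: `addresses` are assumed to be executable scratch
    locations in the debuggee.
    """

    def __init__(self, addresses):
        self._free = list(addresses)
        self._owner = {}  # slot address -> id of the task currently using it

    def acquire(self, task_id):
        """Return a free slot for task_id, or None if all slots are busy."""
        if not self._free:
            return None  # caller has to queue the task until a slot is released
        slot = self._free.pop()
        self._owner[slot] = task_id
        return slot

    def release(self, slot, task_id):
        """Give the slot back; only the owning task may release it."""
        if self._owner.get(slot) != task_id:
            raise ValueError("slot is not owned by this task")
        del self._owner[slot]
        self._free.append(slot)


pool = SlotPool([0x1000, 0x1010])
a = pool.acquire(task_id=1)
b = pool.acquire(task_id=2)
c = pool.acquire(task_id=3)  # no slot left, so task 3 has to wait
pool.release(a, task_id=1)
d = pool.acquire(task_id=3)  # task 3 now gets the slot task 1 gave back
```

A true per-Task section would make the pool unnecessary, since each task would simply own its slot for its whole lifetime.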

Yes, the non-executable stack is a problem. Is there any way for the debugger to change the access attributes of the debuggee's pages? AFAIK, ptrace can't do that. I am now wondering whether utrace can. It doesn't seem that hard to me: in my opinion, utrace could register a hook function to do it.


If this kind of mechanism is available, it might also help in implementing watchpoints. To set a write watchpoint on some address, we can make its page read-only; a write attempt on that page then triggers a page fault, which the debugger can catch and report as a write hit. A read watchpoint would work the same way, by making the page unreadable. Any ideas?
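
The page-protection idea can be modelled in a few lines. This is a pure simulation under stated assumptions, not something ptrace or utrace provides: `Memory`, `Debugger`, `WriteFault` and the 4096-byte page size are all hypothetical names and values chosen for the sketch, and the "page fault" is just a Python exception.

```python
PAGE_SIZE = 4096  # assumed page size for the model


class WriteFault(Exception):
    """Stands in for the page fault raised by a write to a read-only page."""

    def __init__(self, address):
        super().__init__(hex(address))
        self.address = address


class Memory:
    def __init__(self, size):
        self.data = bytearray(size)
        self.read_only_pages = set()

    def write(self, address, value):
        if address // PAGE_SIZE in self.read_only_pages:
            raise WriteFault(address)
        self.data[address] = value


class Debugger:
    def __init__(self, memory):
        self.memory = memory
        self.watched = set()  # exact addresses with a write watchpoint
        self.hits = []        # watched addresses that were written

    def watch_write(self, address):
        self.watched.add(address)
        self.memory.read_only_pages.add(address // PAGE_SIZE)

    def debuggee_write(self, address, value):
        try:
            self.memory.write(address, value)
        except WriteFault as fault:
            if fault.address in self.watched:
                self.hits.append(fault.address)
            # Lift the protection, let the write complete, then re-protect.
            page = fault.address // PAGE_SIZE
            self.memory.read_only_pages.discard(page)
            self.memory.write(address, value)
            self.memory.read_only_pages.add(page)


dbg = Debugger(Memory(2 * PAGE_SIZE))
dbg.watch_write(100)
dbg.debuggee_write(100, 7)   # faults and is reported as a hit
dbg.debuggee_write(200, 9)   # same page: faults, but is not a watched address
dbg.debuggee_write(5000, 1)  # other page: no fault at all
```

The second write shows the cost of this approach: protection is per page, so every write to the watched page faults, and the debugger has to filter out the ones that miss the watched address.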

Yes, loading an extra dynamically loaded library is somewhat intrusive. In fact, many of the things a debugger does are intrusive in some respect. The difference is the degree of intrusiveness. What degree can we accept?

- this kind of dynamic code patching technique might also be used for
fast conditional breakpoints.  By this I mean something like
Dyninst (http://www.cs.umd.edu/projects/dyninstAPI/).  I know it is
mainly used for non-interactive dynamic instrumentation, and I am not
sure whether it is suitable for interactive debugging.  Maybe there
are other techniques that can do something similar in interactive
debugging.  Any ideas on how to implement fast conditional breakpoints
in Frysk?

I hadn't seen that yet. Unfortunately it isn't Free Software and can only be used for research purposes. But the idea to have a rich library of dynamic instrumentation code is nice. We will most likely need something that can generate conditional breakpoints by code patching at some point. But for now I have kept the design as easy as possible and just have unconditional breakpoints. There are a lot of issues to think about when multiple threads might have different conditionals on their breakpoints.

Yes, multi-threaded debugging is always a hard problem. Maybe we can first list all the possible problems we might encounter, and then the currently available solutions with their pros and cons: what support does the low level provide? What do we need to ask the low level for? Based on these, we might come up with a better solution, though there is really quite a lot of work to be done.


- I see that you create a new breakpoint by hardwiring it to
LinuxIa32Breakpoint.  We think a breakpoint for LinuxPPC64 will be
very similar, except that we will use an illegal instruction opcode,
which will trigger a SIGTRAP as well.  The rest of the work flow is
the same: frysk.sys.Wait will detect this when inspecting the wait
status, and handleTrappedEvent will be called.  Is our understanding
correct?  Is there anything we need to keep in mind when adding
LinuxPPC64 support for this?  We would also appreciate it very much if
you could share your latest working code with us.

I'll post the rewritten code asap. I redesigned the state machine part 3 times now and I am still not completely satisfied, but having the code out means more feedback. I made the Breakpoint class even simpler, ignoring any possible optimizations for now. That should make it even simpler to get it up and running on another architecture quickly. The things to consider with this simple approach are 'branch-delay-slots' if your processor has those. In that case the simple, set/reset trapping-instruction doesn't work if the target is a branch instruction since then an instruction from the branch delay slot just after the branch might be executed before the breakpoint is hit.
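
The simple set/reset approach can be sketched as below. This is a Python model over a `bytearray`, not the frysk Breakpoint class: the class layout, the method names and the architecture keys are hypothetical. 0xCC is the IA-32 `int3` opcode; the PPC64 bytes are an unverified assumption made for this sketch and should be checked against the architecture documentation before use.

```python
# Trapping instruction per architecture.
# 0xCC is the IA-32 int3 opcode. The PPC64 word is an assumption
# made for this sketch only; verify it before relying on it.
TRAP_INSTRUCTION = {
    "ia32": bytes([0xCC]),
    "ppc64": bytes([0x7D, 0x82, 0x10, 0x08]),  # assumed trap word
}


class Breakpoint:
    def __init__(self, arch, address):
        self.trap = TRAP_INSTRUCTION[arch]
        self.address = address
        self.original = None  # bytes overwritten by the trap while it is set

    def set(self, memory):
        """Save the original bytes and patch in the trapping instruction."""
        end = self.address + len(self.trap)
        self.original = bytes(memory[self.address:end])
        memory[self.address:end] = self.trap

    def reset(self, memory):
        """Put the original bytes back."""
        end = self.address + len(self.trap)
        memory[self.address:end] = self.original
        self.original = None


code = bytearray(b"\x55\x89\xe5\x90")  # stand-in for debuggee code
bp = Breakpoint("ia32", 1)
bp.set(code)
patched = bytes(code)
bp.reset(code)
restored = bytes(code)
```

In this model only the table entry differs between architectures, which is the property that would make a port quick. Branch delay slots are not modelled here.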

Thanks in advance for that. We will be very happy to go through your code and see if we can provide any feedback. And it is really awful for me to hear that you have changed the state machine three times (I hope you don't change it too much, so that I don't need to spend as much time again as I needed to understand it the first time. :-)


As for branch delay slots, I am not sure whether Power processors have them. We will check.

- A hardware watchpoint hit will also deliver SIGTRAP, so watchpoints
might share some work flow with breakpoints.  Have you given this any
consideration?

There doesn't seem to be a way to set those through something like ptrace.

On ppc64, there is an extra ptrace command that can do this. I am not sure about the situation on i386, though.


So we would need kernel support for that since they can normally
only be set when running in kernel-privilege mode. But I haven't looked
into this yet. If we can support them they shouldn't work too differently
from how they work now by patching and resetting the original code with
trapping instructions. It would certainly be faster.

- I remember you (or someone else) mentioned that system call tracing
can't co-exist with single-stepping.  I didn't know about this
before.  What is the reason for it?  Is there a testcase that confirms
it?  What about inserting a breakpoint at the instruction following
the system call to simulate stepping over the system call?

There are 2 issues. The first is that the state machine support isn't integrated yet. Breakpoints and syscall tracing are similar, but different enough to make it non-trivial to use the same setup. The second is as you mention, single-stepping over a syscall instruction and interacting correctly with ptrace is a bit tricky. The other solution for the problem you mention above is to turn on syscall tracing whenever you set a breakpoint on a system call instruction. Both generate a SIGTRAP when ptrace is used. You can then just have a flag or a table that tells you how to interpret it (as syscall enter/leave or breakpoint or both). I haven't looked at how utrace can be used here.
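
The "flag or table" idea might look roughly like this. It is a sketch under simplifying assumptions, not frysk code: `TrapTable` and its methods are hypothetical, the table is keyed by the address reported at the stop, and any SIGTRAP at an address without an entry is assumed to be a plain syscall enter/leave whenever syscall tracing is on.

```python
BREAKPOINT = "breakpoint"
SYSCALL = "syscall"


class TrapTable:
    """Records how a SIGTRAP at a given address should be interpreted."""

    def __init__(self):
        self._meaning = {}  # address -> set of BREAKPOINT / SYSCALL
        self.syscall_tracing = False

    def add_breakpoint(self, address, is_syscall_instruction=False):
        kinds = self._meaning.setdefault(address, set())
        kinds.add(BREAKPOINT)
        if is_syscall_instruction:
            # Turn on syscall tracing instead of single-stepping over it.
            kinds.add(SYSCALL)
            self.syscall_tracing = True

    def interpret(self, address):
        """Return the set of meanings for a SIGTRAP stop at address."""
        kinds = self._meaning.get(address)
        if kinds:
            return set(kinds)
        # Assumption: an unlisted stop is a syscall event if tracing is on.
        return {SYSCALL} if self.syscall_tracing else set()


table = TrapTable()
table.add_breakpoint(0x400100)
table.add_breakpoint(0x400200, is_syscall_instruction=True)
plain = table.interpret(0x400100)     # breakpoint only
both = table.interpret(0x400200)      # breakpoint and syscall
unlisted = table.interpret(0x400300)  # syscall only, tracing is on
```

How the stop address relates to the syscall instruction on a real kernel, and how enter is told apart from leave, are left out of this model.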

I don't have testcases yet. Those should indeed be written asap.

Yes. A testcase would make this situation easier to understand. I hope someone can write one. :-)


Regards
- Wu Zhou

