[RFC] Set process affinity in test to work around ARM ptrace bug

Antoine Tremblay antoine.tremblay@ericsson.com
Thu Jun 30 14:20:00 GMT 2016


Yao Qi writes:

> We recently found an ARM kernel ptrace bug:
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-May/431962.html
> As a result of this bug, after GDB sets the VFP registers via ptrace, the
> hardware registers may not be updated.  This bug causes intermittent
> failures in tests such as return.exp, call-rt-st.exp, callfuncs.exp, etc.
>
> The bug is fixed in the ARM kernel tree, but it is impractical to upgrade
> the Linux kernel from the git tree or the most recent release (I don't know
> when the fix will ship in a mainline kernel release).  I am wondering
> whether we can work around this kernel bug somehow.
>
> My first attempt was to work around it in GDB, so that GDB still writes
> the VFP registers and syncs them to hardware.  The kernel patch is quite
> simple: it moves vfp_flush_hwstate one line below.  Probably, we could
> call ptrace to set the VFP registers twice, so that the second set
> flushes the state correctly.  Unfortunately, that doesn't work, because
> on every ptrace set the kernel first loads the VFP registers from hardware,
> which might be out of date after the first ptrace set.  That is to say,
> we can't work around this kernel bug in GDB.
>
> Then I thought we could work around this bug in testing, because the
> intermittent failures are confusing when comparing test results, by binding
> both tracer and tracee to the same core.  For example, we can start GDB
> or GDBserver with "taskset -c 0 ", but this is a global change and may
> have some effect on gdb.threads tests.  I also thought about doing
> "taskset -c -p 0 PID" in the test harness after the inferior is started,
> and doing the same to the parent process of the inferior (which is either
> GDB or GDBserver).
>
> The approach in this patch is to have a small C function which sets
> both the process's affinity and its parent's affinity to core 0.  This
> function must be called explicitly in these tests, but other tests
> are not affected at all.  This patch is posted to get comments on the
> necessity of working around this kernel bug, and the proper way to work
> around it.  There are still some test cases affected by this kernel bug,
> but this patch doesn't touch them yet.
>
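
For illustration only, a helper along the lines described above might look
roughly like the sketch below.  This is not the code from the patch: the
names pin_pid_to_core and pin_self_and_parent_to_core are hypothetical, and
the sketch assumes Linux with glibc (sched_setaffinity and the CPU_* macros).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: restrict PID (0 means the calling thread) to the
   single CPU core CORE.  Returns 0 on success, -1 with errno set on
   failure.  Note that sched_setaffinity acts on one thread, so for a
   multi-threaded process only the thread whose ID equals PID is moved.  */
int
pin_pid_to_core (pid_t pid, int core)
{
  cpu_set_t set;

  assert (core >= 0 && core < CPU_SETSIZE);

  CPU_ZERO (&set);
  CPU_SET (core, &set);
  return sched_setaffinity (pid, sizeof (set), &set);
}

/* Hypothetical helper: bind the calling process (the tracee) and its
   parent (the tracer, i.e. GDB or GDBserver) to the same core.  */
int
pin_self_and_parent_to_core (int core)
{
  if (pin_pid_to_core (0, core) != 0)
    return -1;
  return pin_pid_to_core (getppid (), core);
}
```

A test program would call pin_self_and_parent_to_core (0) early in main;
whether a failure there should abort the test or just be ignored is a
design choice this sketch leaves open.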

I like the idea; this has been a pain for a while. However, from my
testing there are a lot of intermittent tests, and I'm not sure this
ptrace fix fixes them all.

I think we just need to make sure that we don't hide other ptrace bugs, so
that we can still find them. I had another bug in the Odroid UX4 SoC causing
similar problems.

Also to consider is that this could apply to a lot of tests. Here's my
list of intermittent tests from about 40 runs with Sergio's script:

argv0-symlink.exp array_bounds.exp array_ptr_renaming.exp
array_subscript_addr.exp auxv.exp bp-permanent.exp bp_enum_homonym.exp
bp_range_type.exp branch-to-self.exp break-precsave.exp
breakpoint-in-ro-region.exp catch_ex.exp char_enum.exp class2.exp
consecutive-precsave.exp converts.exp coredump-filter.exp dot_all.exp
exprs.exp fin_fun_out.exp finish-precsave.exp finish-reverse-bkpt.exp
finish-reverse.exp fixed_points.exp float_param.exp frame-args.exp
fstatat-reverse.exp fun_overload_menu.exp fun_renaming.exp
funcall_char.exp gcore-buffer-overflow.exp gcore-relro-pie.exp
gcore-relro.exp gcore.exp gdb-index.exp gdb1555.exp
getresuid-reverse.exp gnu-ifunc.exp gnu_vector.exp info-proc.exp
info-threads.exp interrupted-hand-call.exp jmisc.exp jprint.exp jump.exp
lang_switch.exp machinestate-precsave.exp mi_dyn_arr.exp
mi_interface.exp mi_task_arg.exp mi_task_info.exp mi_var_array.exp
multi-forks.exp next-while-other-thread-longjmps.exp operators.exp
optim_drec.exp out_of_line_in_inlined.exp pckd_arr_ren.exp
pipe-reverse.exp print-symbol-loading.exp print_chars.exp
process-dies-while-handling-bp.exp pthreads.exp py-strfns.exp python.exp
queue-signal.exp readv-reverse.exp rec_return.exp sigall-precsave.exp
sigall-reverse.exp siginfo-obj.exp siginfo-thread.exp skip-solib.exp
small_reg_param.exp solib-precsave.exp solib-reverse.exp str_uninit.exp
taft_type.exp task_bp.exp type_coercion.exp until-reverse.exp
watch-precsave.exp whatis_array_val.exp watch-bitfields.exp
packed_array.exp formatted_ref.exp vec_comps.exp solib-intra-step.exp
waitpid-reverse.exp mi-tsv-changed.exp

These are from a few weeks ago, and I think a lot of the reverse tests may
not be valid... I'm not sure, but it's still quite a list.

I'll retest with the patched kernel over the weekend and see how many go
away...

Thanks for looking into this!
Antoine


