[PATCH] Process record support for PowerPC

Fri Nov 28 12:15:00 GMT 2014

Wei-cheng Wang wrote:

> This patch is process record support for PowerPC

Very good, thanks for working on this!

> * Tested with Advance Toolchain 8.0 ppc64 - 2326 pass, no fail

Excellent!   Some suggestions for additional testing:
  - 32-bit mode (test using -m32) -- it seems some of your Linux-specific
    code may not work correctly in 32-bit mode
  - Different ISA levels (test using -mcpu=power8/power7/...) to exercise
    different instructions

> * 26 fails if tested with ppc64le, due to local entry optimization.
>    Breakpoints set on the very beginning of functions may
>    not be hit when reverse-continue.

I see.  There is this block in infrun.c:process_event_stop_test

      if (execution_direction == EXEC_REVERSE)
        {
          /* If we're already at the start of the function, we've either just
             stepped backward into a single instruction function without line
             number info, or stepped back out of a signal handler to the first
             instruction of the function without line number info.  Just keep
             going, which will single-step back to the caller.  */
          if (ecs->stop_func_start != stop_pc)
            {
              /* Set a breakpoint at callee's start address.
                 From there we can step once and be back in the caller.  */
              struct symtab_and_line sr_sal;

              init_sal (&sr_sal);
              sr_sal.pc = ecs->stop_func_start;
              sr_sal.pspace = get_frame_program_space (frame);
              insert_step_resume_breakpoint_at_sal (gdbarch,
                                                    sr_sal, null_frame_id);
            }
        }

which doesn't correctly handle multiple entry points.  ecs->stop_func_start
is already the local entry point (see fill_in_stop_func), so code should
already continue backwards until then.  However, code elsewhere in infrun.c
assumes that once this breakpoint is hit, we need just a single further step
to arrive back at the caller.  If we actually go through a global entry point
prologue, we may have to step multiple times.

This can certainly be fixed as a follow-on patch.

> * Support reverse-step over subroutine without debug info
> * Record/reverse syscall/signal
> * Support reverse-step through solib trampoline
> 
> And these instructions are handled
> * All main opcode
> * opcode 31
> -- Privileged instructions are listed, but not handled.
>     Shoule I just remote them?

Yes, it doesn't make much sense to list (some of) them.  See also below.

> * opcode 59 and 63 (Decimal Floating-Point and Floating-point)
> * opcode 4 (Only [vector] instructions are handled.)
> * opcode 19 (Only conditional and branch instruction.)
> 
> SPE, Embedded, LMA, VSX and privileged instructions are not handled.

The only one of those that would be of interest (on current server
platforms) would be VSX.  I see that you're actually already supporting
many VSX instructions, in particular loads and stores.  The one missing
piece is opcode 60.  It would be great if you could add that to complete
VSX support ... but this is not a requirement to get this patch approved,
we can always add it later.

Note that the patch seems to be missing ChangeLog entries.

Some detailed review comments follow.

> diff --git a/gdb/ppc-tdep.h b/gdb/ppc-tdep.h
> index 08554ff..437ca8c 100644
> --- a/gdb/ppc-tdep.h
> +++ b/gdb/ppc-tdep.h
> @@ -259,6 +259,8 @@ struct gdbarch_tdep
>       /* ISA-specific types.  */
>       struct type *ppc_builtin_type_vec64;
>       struct type *ppc_builtin_type_vec128;
> +
> +    int (* syscall_record) (struct regcache *regcache);
No space after the '*'.  Also, for better searchability,
the field should probably be named ppc_syscall_record.

>+extern int ppc64_process_record (struct gdbarch *gdbarch,
>+                                struct regcache *regcache, CORE_ADDR addr);
A note on naming: there doesn't really seem to be anything 64-bit specific 
in ppc64_process_record or its subroutines.  The usual naming convention
would then be to use ppc_process_record.

> diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
> index 2c79bc1..77bd462 100644
> --- a/gdb/testsuite/lib/gdb.exp
> +++ b/gdb/testsuite/lib/gdb.exp
> @@ -1881,7 +1881,8 @@ proc supports_process_record {} {
>       }
> 
>       if { [istarget "arm*-*-linux*"] || [istarget "x86_64-*-linux*"]
> -         || [istarget "i\[34567\]86-*-linux*"] } {
> +         || [istarget "i\[34567\]86-*-linux*"]
> +         || [istarget "powerpc*-linux*"] } {
Should be powerpc*-*-linux*

> @@ -1897,7 +1898,8 @@ proc supports_reverse {} {
>       }
> 
>       if { [istarget "arm*-*-linux*"] || [istarget "x86_64-*-linux*"]
> -         || [istarget "i\[34567\]86-*-linux*"] } {
> +         || [istarget "i\[34567\]86-*-linux*"]
> +         || [istarget "powerpc*-linux*"] } {
Likewise.

> diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
> index a997869..608081c 100644
> --- a/gdb/ppc-linux-tdep.c
> +++ b/gdb/ppc-linux-tdep.c
> @@ -49,6 +49,9 @@
>   #include "spu-tdep.h"
>   #include "xml-syscall.h"
>   #include "linux-tdep.h"
> +#include "linux-record.h"
> +#include "record-full.h"
> +#include "infrun.h"
What is this (infrun.h) needed for?

> @@ -298,6 +301,23 @@ powerpc_linux_in_dynsym_resolve_code (CORE_ADDR pc)
>   	  || strcmp (MSYMBOL_LINKAGE_NAME (sym.minsym),
>   		     "__glink_PLTresolve") == 0))
>       return 1;
> +  else if (sym.minsym != NULL && execution_direction == EXEC_REVERSE)
> +    {
> +      /* When reverse stepping, gdb needs to know whether PC lies in
> +	 the dynamic symbol resolve code, so it can keep going until
> +	 reaching some user code.
> +	 Using insns-match-pattern is not suitable, because we had to
> +	 look both ahead and behind to check where we are in the middle
> +	 of one of trampline sequences.  */
> +#define SUBSTRCMP(sym, stub)  (memcmp (sym + 8, stub, sizeof (stub) - 1) == 0)
> +      if (SUBSTRCMP (MSYMBOL_LINKAGE_NAME (sym.minsym), ".plt_call."))
> +	return 1;
> +      if (SUBSTRCMP (MSYMBOL_LINKAGE_NAME (sym.minsym), ".plt_branch."))
> +	return 1;
> +      if (SUBSTRCMP (MSYMBOL_LINKAGE_NAME (sym.minsym), ".plt_branch_r2off."))
> +	return 1;
> +#undef SUBSTRCMP

Huh.  The "sym + 8" seems odd.  What if symbol name is shorter than 8 bytes?

Also, those stub symbols are not guaranteed to be present.

Can't we generalize the skip_trampoline_code so that it also accepts
a PC in the middle of the sequence?  That would seem a more general
solution.

> @@ -1345,6 +1524,15 @@ ppc_linux_init_abi (struct gdbarch_info info,
>         set_solib_svr4_fetch_link_map_offsets
>           (gdbarch, svr4_lp64_fetch_link_map_offsets);
> 
> +      if (powerpc_so_ops.in_dynsym_resolve_code == NULL)
> +	{
> +	  powerpc_so_ops = svr4_so_ops;
> +	  /* Override dynamic resolve function.  */
> +	  powerpc_so_ops.in_dynsym_resolve_code =
> +	    powerpc_linux_in_dynsym_resolve_code;
> +	}
> +      set_solib_ops (gdbarch, &powerpc_so_ops);
> +

If we actually need this, we shouldn't copy this part in both the
wordsize == 4 and wordsize == 8 paths, but just have it once.

Also, this part should probably be broken out into a separate patch.

> +/* PPC process record-replay */
> +
> +struct linux_record_tdep ppc_linux_record_tdep;
> +
> +static enum gdb_syscall
> +ppc_canonicalize_syscall (int syscall)
> +{
> +  /* See arch/powerpc/include/uapi/asm/unistd.h */
> +
> +  if (syscall <= 165)
> +    return syscall;
> +  else if (syscall >= 167 && syscall <= 190)	/* Skip query_module 166 */
> +    return syscall + 1;
> +  else if (syscall >= 192 && syscall <= 197)	/* mmap2 */
> +    return syscall;
> +  else if (syscall == 208)			/* tkill */
> +    return gdb_sys_tkill;
> +  else if (syscall >= 207 && syscall <= 220)	/* gettid */
> +    return syscall + 224 - 207;
> +  else if (syscall >= 234 && syscall <= 239)	/* exit_group */
> +    return syscall + 252 - 234;
> +  else if (syscall >= 240 && syscall <=248)	/* timer_create */
> +    return syscall += 259 - 240;
> +  else if (syscall >= 250 && syscall <=251)	/* tgkill */
> +    return syscall + 270 - 250;
> +  else if (syscall == 336)
> +    return gdb_sys_recv;
> +  else if (syscall == 337)
> +    return gdb_sys_recvfrom;
> +  else if (syscall == 342)
> +    return gdb_sys_recvmsg;

This seems a somewhat partial mapping; I guess that's good enough for now.
Longer term, we may want to look into automatically generating the mapping;
we also have another list of platform syscalls in the syscalls/ directory ...

> +/* Record all registers but PC register for process-record.  */
> +
> +static int
> +ppc_all_but_pc_registers_record (struct regcache *regcache)
> +{
> +  struct gdbarch *gdbarch = get_regcache_arch (regcache);
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +  int i;
> +
> +  for (i = 0; i < 32; i++)
> +    {
> +      if (record_full_arch_list_add_reg (regcache, tdep->ppc_gp0_regnum + i))
> +        return -1;
> +    }
> +
> +  if (record_full_arch_list_add_reg (regcache, tdep->ppc_cr_regnum))
> +    return -1;
> +  if (record_full_arch_list_add_reg (regcache, tdep->ppc_lr_regnum))
> +    return -1;
> +  if (record_full_arch_list_add_reg (regcache, tdep->ppc_ctr_regnum))
> +    return -1;

This of course doen't actually record *all* registers, just the general
purpose ones.  Depending on the usage, this may or may not be appropriate.

> +static int
> +ppc_linux_syscall_record (struct regcache *regcache)
> +{
> +  struct gdbarch *gdbarch = get_regcache_arch (regcache);
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +  ULONGEST scnum;
> +  enum gdb_syscall syscall_gdb;
> +  int ret;
> +
> +  regcache_raw_read_unsigned (regcache, tdep->ppc_gp0_regnum, &scnum);
> +  syscall_gdb = ppc_canonicalize_syscall (scnum);
> +
> +  if (syscall_gdb < 0)
> +    {
> +      printf_unfiltered (_("Process record and replay target doesn't "
> +                           "support syscall number %d\n"),
> +                           (int) scnum);
> +      return 0;
> +    }
> +
> +  if (syscall_gdb == gdb_sys_sigreturn
> +      || syscall_gdb == gdb_sys_rt_sigreturn)
> +   {
> +     if (ppc_all_but_pc_registers_record (regcache))
> +       return -1;

For example, here, we might actually have to record *all* registers, since
a sigreturn syscall will in fact reload not just the GPRs, but also float
and vector registers.

> +static int
> +ppc64_linux_record_signal (struct gdbarch *gdbarch,
> +                           struct regcache *regcache,
> +                           enum gdb_signal signal)
> +{
> +  /* See arch/powerpc/kernel/signal_64.c
> +	 arch/powerpc/include/asm/ptrace.h
> +     for details.  */
> +  const int SIGNAL_FRAMESIZE = 128;
> +  const int sizeof_rt_sigframe = 1440 * 2 + 8 * 2 + 4 * 6 + 8 + 8 + 128 + 512;

This is too big for 32-bit, but that may not matter.  (See also below.)

> +  ULONGEST sp;
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +
> +  if (ppc_all_but_pc_registers_record (regcache))
> +    return -1;

I'm not sure why we need this here; signal delivery does not actually change
all registers, it only changes those affected by a function call (possibly
0, 11, 12, some argument registers, LR, CTR, SP).

> +  if (record_full_arch_list_add_reg (regcache, gdbarch_pc_regnum (gdbarch)))
> +    return -1;
> +
> +  /* Record the change in the stack.
> +     frame-size = sizeof (struct rt_sigframe) + SIGNAL_FRAMESIZE  */
> +  regcache_raw_read_unsigned (regcache, gdbarch_sp_regnum (gdbarch), &sp);
> +  sp -= SIGNAL_FRAMESIZE;
> +  sp -= sizeof_rt_sigframe;
> +
> +  if (record_full_arch_list_add_mem (sp, SIGNAL_FRAMESIZE + sizeof_rt_sigframe))
> +    return -1;

I see x86 does this too, but I wonder why.  All this memory is in fact
uninitialized before the signal, so what's the point of tracking its original
contents?

> +  /* Support reverse debugging.  */
> +  set_gdbarch_process_record (gdbarch, ppc64_process_record);
> +  set_gdbarch_process_record_signal (gdbarch, ppc64_linux_record_signal);
> +  tdep->syscall_record = ppc_linux_syscall_record;

At this place in ppc_linux_init_abi, we are handling *both* 32-bit and
64-bit code.  This means the callbacks installed here should handle
both bitsizes.  That seems OK for ppc64_process_record, but probably
not for ppc64_linux_record_signal and ppc_linux_syscall_record (in
their current implementation).

> +  /* Initialize the ppc_linux_record_tdep.  */
> +  /* These values are the size of the type that will be used in a system
> +     call.  They are obtained from Linux Kernel source.
> +
> +     See arch/powerpc/include/uapi/asm/ioctls.h.  */
> +  ppc_linux_record_tdep.size_pointer
> +    = gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT;

This is problematic.  There is only one single instance of
ppc_linux_record_tdep, so when this code is called for 32-bit
and then for 64-bit, that single instance gets overwritten.
I think we really need two copies of the structure, and
install the one appropriate for the current bitsize.

Note that many of the hard-coded "size" values in your
patch are only correct for 64-bit, so the 32-bit copy
of the struct would have to use different values.
(The non-"size" values should be OK for both.)

> +  ppc_linux_record_tdep.size_old_gid_t = 2;
> +  ppc_linux_record_tdep.size_old_uid_t = 2;
These should be 4 on PowerPC (never used short).

> +  ppc_linux_record_tdep.size_stat64 = 144;
This is 104 on PowerPC (different padding).

> +  ppc_linux_record_tdep.size_PAGE_SIZE = 4096;
This is a bit of a problem.  The PowerPC kernel can be compiled
with either 4k or 64k page size.  We could either try to detect
at runtime (from auxv) or always use the conservative 64k.

> +  ppc_linux_record_tdep.size_user_desc = 16;
PowerPC doesn't have this system call at all.

> +  ppc_linux_record_tdep.size_epoll_event = 12;
This is 16 on PowerPC (due to different packing).

> +  ppc_linux_record_tdep.size_termios = 60;
> +  ppc_linux_record_tdep.size_termios2 = 44;
PowerPC doesn't have termios2, and the size of termios is 44.

> +  /* These values are the second argument of system call "sys_ioctl".
> +     They are obtained from Linux Kernel source.  */
Many of these are different on PowerPC, see the private file
arch/powerpc/include/uapi/asm/ioctls.h vs. the generic one
include/uapi/asm-generic/ioctls.h.

> +  ppc_linux_record_tdep.ioctl_TCGETS = 0x5401;
> +  ppc_linux_record_tdep.ioctl_TCSETS = 0x5402;
> +  ppc_linux_record_tdep.ioctl_TCSETSW = 0x5403;
> +  ppc_linux_record_tdep.ioctl_TCSETSF = 0x5404;
> +  ppc_linux_record_tdep.ioctl_TCGETA = 0x5405;
> +  ppc_linux_record_tdep.ioctl_TCSETA = 0x5406;
> +  ppc_linux_record_tdep.ioctl_TCSETAW = 0x5407;
> +  ppc_linux_record_tdep.ioctl_TCSETAF = 0x5408;
> +  ppc_linux_record_tdep.ioctl_TCSBRK = 0x5409;
> +  ppc_linux_record_tdep.ioctl_TCXONC = 0x540A;
> +  ppc_linux_record_tdep.ioctl_TCFLSH = 0x540B;
ioctl_TCGETS = 0x402C7413
ioctl_TCSETS = 0x802C7414
ioctl_TCSETSW = 0x802C7415
ioctl_TCSETSF = 0x802C7416
ioctl_TCGETA = 0x40147417
ioctl_TCSETA = 0x80147418
ioctl_TCSETAW = 0x80147419
ioctl_TCSETAF = 0x8014741C
ioctl_TCSBRK = 0x2000741D
ioctl_TCXONC = 0x2000741E
ioctl_TCFLSH = 0x2000741F

> +  ppc_linux_record_tdep.ioctl_TIOCGPGRP = 0x540F;
> +  ppc_linux_record_tdep.ioctl_TIOCSPGRP = 0x5410;
> +  ppc_linux_record_tdep.ioctl_TIOCOUTQ = 0x5411;
ioctl_TIOCGPGRP = 0x40047477
ioctl_TIOCSPGRP = 0x80047476
ioctl_TIOCOUTQ = 0x40047473

> +  ppc_linux_record_tdep.ioctl_TIOCGWINSZ = 0x5413;
> +  ppc_linux_record_tdep.ioctl_TIOCSWINSZ = 0x5414;
ioctl_TIOCGWINSZ = 0x40087468
ioctl_TIOCSWINSZ = 0x80087467

> +  ppc_linux_record_tdep.ioctl_FIONREAD = 0x541B;
ioctl_FIONREAD = 0x4004667F

> +  ppc_linux_record_tdep.ioctl_FIONBIO = 0x5421;
ioctl_FIONBIO = 0x8004667E

> +  ppc_linux_record_tdep.ioctl_TCGETS2 = 0x802c542a;
> +  ppc_linux_record_tdep.ioctl_TCSETS2 = 0x402c542b;
> +  ppc_linux_record_tdep.ioctl_TCSETSW2 = 0x402c542c;
> +  ppc_linux_record_tdep.ioctl_TCSETSF2 = 0x402c542d;
These don't exist at all; PowerPC has no termios2.

> +  ppc_linux_record_tdep.ioctl_TIOCGPTN = 0x80045430;
> +  ppc_linux_record_tdep.ioctl_TIOCSPTLCK = 0x40045431;
> +  ppc_linux_record_tdep.ioctl_FIONCLEX = 0x5450;
> +  ppc_linux_record_tdep.ioctl_FIOCLEX = 0x5451;
> +  ppc_linux_record_tdep.ioctl_FIOASYNC = 0x5452;
ioctl_TIOCGPTN = 0x40045430
ioctl_TIOCSPTLCK = 0x80045431
ioctl_FIONCLEX = 0x20006602
ioctl_FIOCLEX = 0x20006601
ioctl_FIOASYNC = 0x8004667D

> +  ppc_linux_record_tdep.ioctl_FIOQSIZE = 0x5460;
ioctl_FIOQSIZE = 0x40086680

> +static const struct frame_unwind rs6000_epilogue_frame_unwind =
> +{
> +  NORMAL_FRAME,
> +  default_frame_unwind_stop_reason,
> +  rs6000_epilogue_frame_this_id, rs6000_epilogue_frame_prev_register,
> +  NULL,
> +  rs6000_epilogue_frame_sniffer
> +};

Is this actually still necessary with a current compiler that tracks
epilogues in DWARF?   In any case, it would be better to provide this
in a separate patch.

> +#define PPC_BF(insn)	PPC_FIELD (insn, 6, 3)
This seems unused.

> +static int
> +ppc_record_vsr (struct regcache *regcache, struct gdbarch_tdep *tdep, int vsr,
> +		int size)
> +{
> +  if (vsr < 0 || vsr >= 64)
> +    return -1;
> +
> +  if (vsr >= 32)
> +    {
> +      if (tdep->ppc_vr0_regnum >= 0)
> +	record_full_arch_list_add_reg (regcache, tdep->ppc_vr0_regnum + vsr);
Should be vsr - 32 here.

> +  else
> +    {
> +      if (tdep->ppc_fp0_regnum >= 0)
> +	record_full_arch_list_add_reg (regcache, tdep->ppc_fp0_regnum + vsr);
> +      if (size > 8 && tdep->ppc_vsr0_upper_regnum >= 0)
We don't really need the size check; see below.
> +	record_full_arch_list_add_reg (regcache,
> +				       tdep->ppc_vsr0_upper_regnum + vsr);

> +/* Parse instructions of primary opcode-4.  */
> +
> +static int
> +ppc64_process_record_op4 (struct gdbarch *gdbarch, struct regcache *regcache,
> +			   CORE_ADDR addr, uint32_t insn)
> +{

> +    case 786:		/* Vector Add Signed Byte Saturate */
Typo: should be 768
> +    case 940:		/* Vector Multiply Even Signed Word */
Typo: should be 904
> +    case 864:		/* Vector Shift Right Algebraic Doubleword */
Typo: should be 964

> +    case 32:		/* Vector Multiply-High-Add Signed Halfword Saturate */
> +    case 33:		/* Vector Multiply-High-Round-Add Signed Halfword Saturate */
> +    case 39:		/* Vector Multiply-Sum Unsigned Halfword Saturate */
> +    case 41:		/* Vector Multiply-Sum Signed Halfword Saturate */
> +    case 42:		/* Vector Select */
> +    case 43:		/* Vector Permute */
> +    case 44:		/* Vector Shift Left Double by Octet Immediate */
> +    case 60:		/* Vector Add Extended Unsigned Quadword Modulo */
> +    case 61:		/* Vector Add Extended & write Carry Unsigned Quadword */
> +    case 62:		/* Vector Subtract Extended Unsigned Quadword Modulo */
> +    case 63:		/* Vector Subtract Extended & write Carry Unsigned Quadword */
> +    case 34:		/* Vector Multiply-Low-Add Unsigned Halfword Modulo */
> +    case 36:		/* Vector Multiply-Sum Unsigned Byte Modulo */
> +    case 37:		/* Vector Multiply-Sum Mixed Byte Modulo */
> +    case 38:		/* Vector Multiply-Sum Unsigned Halfword Modulo */
> +    case 40:		/* Vector Multiply-Sum Signed Halfword Modulo */
> +    case 46:		/* Vector Multiply-Add Single-Precision */
> +    case 47:		/* Vector Negative Multiply-Subtract Single-Precision */
All of those use a VRC field, and therefore need to be treated like the case 45
at the beginning of this function!

> +			/* 5.16 Decimal Integer Arithmetic Instructions */
Headline superfluous at this point.

> +static int
> +ppc64_process_record_op19 (struct gdbarch *gdbarch, struct regcache *regcache,
> +			   CORE_ADDR addr, uint32_t insn)
> +{

> +    case 16:		/* Branch Conditional */
> +    case 560:		/* Branch Conditional to Branch Target Address Register */
> +      if (PPC_BO (insn) & 0x2)
> +	record_full_arch_list_add_reg (regcache, tdep->ppc_ctr_regnum);
This should be "if ((PPC_BO (insn) & 0x4) == 0)".  BO_2 is the bit with
value 4 in BO, and the instruction decrements CRT if BO_2 is *clear*.

> +static int
> +ppc64_process_record_op31 (struct gdbarch *gdbarch, struct regcache *regcache,
> +			   CORE_ADDR addr, uint32_t insn)
> +{

> +  CORE_ADDR at_dcsz, at_icsz, ea = 0;
at_icsz is unused?

> +  switch (ext)
> +    {
> +    case 78:		/* Determine Leftmost Zero Byte */
> +      /* CA is always altered, but SO/OV are only altered when OE=1.
> +	 In any case, XER is always altered.  */
The comment is wrong, there's no OE bit here.  Code is correct.

> +    case 339:          /* Move From Special Purpose Register */
Maybe add case 371 Move From Time Base here.  The instruction is
phased out, but potentially still used in some code.

> +    /* These write CR and optional RA.  */
> +    case 792:		/* Shift Right Algebraic Word */
> +    case 794:		/* Shift Right Algebraic Doubleword */
> +    case 824:		/* Shift Right Algebraic Word Immediate */
> +    case 826:		/* Shift Right Algebraic Doubleword Immediate (413) */
> +    case 826 | 1:	/* Shift Right Algebraic Doubleword Immediate (413) */
All of these also write CA (i.e. XER).

> +    case 853:		/* Load Byte and Zero Caching Inhibited Indexed */
> +    case 821:		/* Load Halfword and Zero Caching Inhibited Indexed */
> +    case 789:		/* Load Word and Zero Caching Inhibited Indexed */
> +    case 885:		/* Load Doubleword Caching Inhibited Indexed */
These are (hypervisor) privileged, so we don't really need to support them here.

> +    case 597:		/* Load String Word Immediate */
> +    case 533:		/* Load String Word Indexed */
> +      if (ext == 597)
> +	nr = PPC_NB (insn);
I guess this needs a "if (nr == 0) nr = 32" or similar here.
> +      else
> +	{
> +	  regcache_raw_read_unsigned (regcache, tdep->ppc_xer_regnum, &xer);
> +	  nr = PPC_XER_NB (xer);
If nr is zero here, the architecture says that RT is undefined, so we probably
should set nr to one in that case.
> +	}

> +    case 276:		/* Load Quadword And Reserve Indexed */
> +      tmp = (tdep->ppc_gp0_regnum + PPC_RT (insn)) & ~1;
Probably better to apply the & ~1 to PPC_RT (insn) ...
> +      record_full_arch_list_add_reg (regcache, tmp);
> +      record_full_arch_list_add_reg (regcache, tmp | 1);
... and use +1 here.  Otherwise you require that tdep->ppc_gp0_regnum
is even, which is currently true but not necessarily guaranteed.

> +    case 791:		/* Load Floating-Point Double Pair Indexed */
> +      tmp = (tdep->ppc_fp0_regnum + PPC_FRT (insn)) & ~1;
Likewise.
> +      record_full_arch_list_add_reg (regcache, tmp);
> +      record_full_arch_list_add_reg (regcache, tmp | 1);

> +    /* These write VSR of size 8.  */
> +    case 588:		/* Load VSX Scalar Doubleword Indexed */
> +      ppc_record_vsr (regcache, tdep, PPC_XT (insn), 8);
Note that the contents of the other doubleword element are undefined
after this instruction, so you need to save all of it, i.e. size == 16.
In fact, we probably don't even need the size argument then.

> +    /* Store memory.  */

> +    case 981:          /* Store Byte Caching Inhibited Indexed */
> +    case 949:          /* Store Halfword Caching Inhibited Indexed */
> +    case 917:          /* Store Word Caching Inhibited Indexed */
> +    case 1013:         /* Store Doubleword Caching Inhibited Indexed */
These are (hypervisor) privileged, so we don't really need to support them here.

> +       case 247:       /* Store Byte with Update Indexed */
> +       case 135:       /* Store Vector Element Byte Indexed */
> +       case 215:       /* Store Byte Indexed */
> +       case 694:       /* Store Byte Conditional Indexed */
> +       case 662:       /* Store Word Byte-Reverse Indexed */
> The above is wrong, it should have size 4.
> +       case 981:       /* Store Byte Caching Inhibited Indexed */
> +         size = 1;
> +         break;

> +       case 181:       /* Store Doubleword with Update Indexed */
> +       case 716:       /* Store VSX Scalar Doubleword Indexed */
> +       case 149:       /* Store Doubleword Indexed */
> +       case 214:       /* Store Doubleword Conditional Indexed */
> +       case 660:       /* Store Doubleword Byte-Reverse Indexed */
> +       case 1013:      /* Store Doubleword Caching Inhibited Indexed */
Missing cases for size 8: 727 and 759 (store floating-point double).
> +         size = 8;
> +         break;

> +      /* Align address for Store Vector instructions.  */
> +      if ((ext & 0x1f) == 0x7)
> +	{
> +	  if ((ext & 0x3) < 3)
This seems incorrect.  If (ext & 0x1f) == 0x7, then this check is never true.
There's just a few store vector instructions; maybe just making it a switch
would be more straightforward ...
> +	    addr = (addr >> (ext & 0x3)) << (ext & 0x3);
> +	  else
> +	    addr = addr & ~0x7ULL;
> +	}

> +    case 725:		/* Store String Word Immediate */
> +    case 661:		/* Store String Word Indexed */
> +      ra = 0;
> +      if (PPC_RA (insn) != 0)
> +	regcache_raw_read_unsigned (regcache, tdep->ppc_xer_regnum, &ra);
> +      ea += ra;
> +
> +      if (ext == 725)
> +	nb = PPC_NB (insn);
I guess this needs a "if (nr == 0) nr = 32" or similar here.
> +      else
> +	{
> +	  regcache_raw_read_unsigned (regcache, tdep->ppc_xer_regnum, &xer);
> +	  nb = PPC_XER_NB (xer);
Do we have to handle the nb == 0 case specially?  Doesn't write memory.
> +
> +	  regcache_raw_read_unsigned (regcache, tdep->ppc_xer_regnum, &rb);
> +	  ea += rb;
> +	}

> +    case 467:		/* Move To Special Purpose Register */
> +      switch (PPC_SPR (insn))
> +	{
> +	case 1:			/* XER */
> +	  record_full_arch_list_add_reg (regcache, tdep->ppc_xer_regnum);
> +	  return 0;
> +	case 8:			/* LR */
> +	  record_full_arch_list_add_reg (regcache, tdep->ppc_lr_regnum);
> +	  return 0;
> +	case 9:			/* CTR */
> +	  record_full_arch_list_add_reg (regcache, tdep->ppc_ctr_regnum);
> +	  return 0;
We should also at least support VRSAVE (256).

> +    case 18:		/* TLB Invalidate Local Indexed */
> +    case 326:		/* Data Cache Read */
> +    case 63:		/* Data Cache Block Store by External PID */
> +    case 127:		/* Data Cache Block Flush by External PID */
> +    case 134:		/* Data Cache Block Touch for Store and Lock Set */
> +    case 166:		/* Data Cache Block Touch and Lock Set */
> +    case 255:		/* Data Cache Block Touch for Store by External PID */
> +    case 319:		/* Data Cache Block Touch by External PID */
> +    case 390:		/* Data Cache Block Lock Clear */
> +    case 422:		/* Data Cache Block Lock Query */
> +    case 454:		/* Data Cache Invalidate */
> +    case 998:		/* Instruction Cache Read */
> +    case 966:		/* Instruction Cache Invalidate */
> +    case 982:		/* Instruction Cache Block Invalidate */
This is actually not privileged; it should just be ignored.
> +    case 198:		/* Instruction Cache Block Lock Query */
> +    case 230:		/* Instruction Cache Block Lock Clear */
> +    case 486:		/* Instruction Cache Block Touch and Lock Set */
> +    case 206:		/* Message Send */
> +    case 238:		/* Message Clear */
> +    case 142:		/* Message Send Privileged */
> +    case 174:		/* Message Clear Privileged */
> +    case 131:		/* Write MSR External Enable */
> +    case 163:		/* Write MSR External Enable Immediate */
> +    case 270:		/* Embedded Hypervisor Privilege */
> +    case 462:		/* Move To Performance Monitor Register */
This is not (generally) privileged; it's embedded only, but you already support
the "From" variant above ...
> +    case 494:		/* Move To Thread Management Register */
> +    case 807:		/* Store Vector by External Process ID Indexed */
> +    case 775:		/* Store Vector by External Process ID Indexed LRU */
> +    case 95:		/* Load Byte by External Process ID Indexed */
> +    case 287:		/* Load Halfword by External Process ID Indexed */
> +    case 31:		/* Load Word by External Process ID Indexed */
> +    case 29:		/* Load Doubleword by External Process ID Indexed */
> +    case 295:		/* Load Vector by External Process ID Indexed */
> +    case 263:		/* Load Vector by External Process ID Indexed LRU */
> +    case 259:		/* Move From Device Control Register Indexed */
> +    case 323:		/* Move From Device Control Register */
> +    case 146:		/* Move To Machine State Register */
> +    case 178:		/* Move To Machine State Register Doubleword */
> +    case 387:		/* Move To Device Control Register Indexed */
> +    case 419:		/* Move To Device Control Register User-mode Indexed */
> +    case 291:		/* Move From Device Control Register User-mode Indexed */
The two instructions above are in fact unprivileged.  But they are embedded-only,
so I guess it would be OK to not support them (but the error shouldn't say "privileged").
> +    case 451:		/* Move To Device Control Register */
> +      /* Privileged instructions.  */
> +      fprintf_unfiltered (gdb_stdlog, "Cannot record privileged instructions. "
> +			  "%08x at %08lx, 31-%d.\n", insn, addr, ext);
In general, I don't think this error makes much sense -- the list above does
not even contain all op31 privileged instructions, and we don't have any handling
for non-op31 privileged instructions.  It would probably be better to drop the
specical handling here, and just use the generic error message.

> +    case 654:		/* Transaction Begin */
> +    case 686:		/* Transaction End */
> +    case 718:		/* Transaction Check */
> +    case 750:		/* Transaction Suspend or Resume */
> +    case 782:		/* Transaction Abort Word Conditional */
> +    case 814:		/* Transaction Abort Doubleword Conditional */
> +    case 846:		/* Transaction Abort Word Conditional Immediate */
> +    case 878:		/* Transaction Abort Doubleword Conditional Immediate */
> +    case 910:		/* Transaction Abort */
> +    case 942:		/* Transaction Reclaim */
> +    case 1006:		/* Transaction Recheckpoint */
> +      fprintf_unfiltered (gdb_stdlog, "Cannot record Transaction instructions. "
> +			  "%08x at %08lx, 31-%d.\n", insn, addr, ext);
This seems OK, though.  Transactional instructions can occur, and we certainly cannot
handle them, and it is good to inform the user about that.  (Maybe not the last two,
those are privileged.)

> +static int
> +ppc64_process_record_op60 (struct gdbarch *gdbarch, struct regcache *regcache,
> +                          CORE_ADDR addr, uint32_t insn)
> +{
> +  int ext = PPC_EXTOP (insn);
> +
> +  fprintf_unfiltered (gdb_stdlog, "Warning: Don't know how to record "
> +                     "%08x at %08lx, 60-%d.\n", insn, addr, ext);
Since those may actually occur (frequently) in user code, we should probably
reword the message to something along the lines of "VSX instructions not
yet supported".  (Or simply implement them :-))

> +ppc64_process_record_op63 (struct gdbarch *gdbarch, struct regcache *regcache,
> +			   CORE_ADDR addr, uint32_t insn)
> +{

> +    case 290:		/* DFP Convert To Fixed */
This only writes a single register FPT, not a pair.

> +      tmp = (tdep->ppc_fp0_regnum + PPC_FRT (insn)) & ~1;
Probably better to apply the & ~1 to PPC_RT (insn) ...
> +      record_full_arch_list_add_reg (regcache, tmp);
> +      record_full_arch_list_add_reg (regcache, tmp | 1);
... and use +1 here.  Otherwise you require that tdep->ppc_fp0_regnum
is even, which is currently true but not necessarily guaranteed.

> +    case 66:		/* DFP Shift Significand Left Immediate */
> +    case 98:		/* DFP Shift Significand Right Immediate */
> +    case 322:		/* DFP Decode DPD To BCD */
> +    case 866:		/* DFP Insert Biased Exponent */
But these four write a register pair, not just a single register.

> +    case 38:		/* Move To FPSCR Bit 1 */
> +    case 70:		/* Move To FPSCR Bit 0 */
> +    case 134:		/* Move To FPSCR Field Immediate */
> +    case 711:		/* Move To FPSCR Fields */
These four only write to CR if RC is set.

> +    case 23:		/* Floating Select */
> +    case 583:		/* Move From FPSCR */
These two write to FRT.

> +int
> +ppc64_process_record (struct gdbarch *gdbarch, struct regcache *regcache,
> +		      CORE_ADDR addr)
> +{

> +    case 17:		/* System call */
Should we verify that the LEV field is zero here?

> +    case 16:		/* Branch Conditional */
> +      if (PPC_BO (insn) & 0x2)
> +	record_full_arch_list_add_reg (regcache, tdep->ppc_ctr_regnum);
This should be "if ((PPC_BO (insn) & 0x4) == 0)".  BO_2 is the bit with
value 4 in BO, and the instruction decrements CRT if BO_2 is *clear*.

> +    case 20:		/* Rotate Left Word Immediate then Mask Insert */
> +    case 21:		/* Rotate Left Word Immediate then AND with Mask */
> +    case 23:		/* Rotate Left Word then AND with Mask */
> +    case 30:		/* Rotate Left Doubleword Immediate then Clear Left */
> +			/* Rotate Left Doubleword Immediate then Clear Right */
> +			/* Rotate Left Doubleword Immediate then Clear */
> +			/* Rotate Left Doubleword then Clear Left */
> +			/* Rotate Left Doubleword then Clear Right */
> +			/* Rotate Left Doubleword Immediate then Mask Insert */
> +      record_full_arch_list_add_reg (regcache,
> +				     tdep->ppc_gp0_regnum + PPC_RA (insn));
> +      record_full_arch_list_add_reg (regcache, tdep->ppc_cr_regnum);
CR changes only if PPC_RC (insn) is set.

> +    case 24:		/* OR Immediate */
> +    case 25:		/* OR Immediate Shifted */
> +    case 26:		/* XOR Immediate */
> +    case 27:		/* XOR Immediate Shifted */
> +    case 28:		/* AND Immediate */
> +    case 29:		/* AND Immediate Shifted */
> +      record_full_arch_list_add_reg (regcache,
> +				     tdep->ppc_gp0_regnum + PPC_RA (insn));
Cases 28/29 (only) also change CR.

> +    case 46:		/* Load Multiple Word */
> +      for (i = PPC_RT (insn); i < 32; i++)
> +	record_full_arch_list_add_reg (regcache,
> +				       tdep->ppc_gp0_regnum + PPC_RT (insn));
Should be "tdep->ppc_gp0_regnum + i", I guess.

> +    case 56:		/* Load Quadword */
Should check that the last two bits are zero.

> +      tmp = (tdep->ppc_gp0_regnum + PPC_RT (insn)) & ~1;
Probably better to apply the & ~1 to PPC_RT (insn) ...
> +      record_full_arch_list_add_reg (regcache, tmp);
> +      record_full_arch_list_add_reg (regcache, tmp | 1);
... and use +1 here.  Otherwise you require that tdep->ppc_gp0_regnum
is even, which is currently true but not necessarily guaranteed.

> +    case 47:		/* Store Multiple Word */
> +	{
> +	  ULONGEST addr = 0;
> +	  int size;
Unused?
> +
> +	  if (PPC_RA (insn) != 0)
> +	    regcache_raw_read_unsigned (regcache,
> +					tdep->ppc_gp0_regnum + PPC_RA (insn),
> +					&addr);
> +
> +	  addr += PPC_D (insn);
> +
> +	  for (i = PPC_RS (insn); i < 32; i++)
> +	    record_full_arch_list_add_mem (addr + i * 4, 4);
The offset is wrong; the first register is stored at addr.
Is there a reason why this can't be just a single record call?

> +    case 57:		/* Load Floating-Point Double Pair */
> +      tmp = (tdep->ppc_fp0_regnum + PPC_RT (insn)) & ~1;
> +      record_full_arch_list_add_reg (regcache, tmp);
> +      record_full_arch_list_add_reg (regcache, tmp | 1);
See above (Load Quadword).

> +    case 58:		/* Load Doubleword */
> +			/* Load Doubleword with Update */
> +			/* Load Doubleword Algebraic */
Load Word Algebraic.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com