[PATCH 2/7 v3] Tracepoint for ppc64.

Ulrich Weigand uweigand@de.ibm.com
Wed Apr 8 16:57:00 GMT 2015


Wei-cheng Wang wrote:

> The fail cases in gdb-unavailable are caused by inaccessible vtable for object
> For x86, such structures (_ZVTxxx) are put in .rodata section, so gdb can read
> them in local file if not collected by tracepoint action, but for PowerpPC,
> they are put in .rel.ro sections.  Technically they not read-only, so gdb won't
> read them in file if not collected.

Is this really a difference between Intel and PowerPC, or this is rather
a difference between different binutils levels?  I don't see off-hand
why this should be Power-specific ...  Do you have example assembler
code that shows the difference?

> +/* Generate a sequence of instructions to load IMM in the register REG.
> +   Write the instructions in BUF and return the number of bytes written.  */
> +
> +static int
> +gen_limm (uint32_t *buf, int reg, uint64_t imm)
> +{
> +  uint32_t *p = buf;
> +
> +  if ((imm + 32768) < 65536)
> +    {
> +      /* li	reg, imm[7:0] */
> +      p += GEN_LI (p, reg, imm);
> +    }
> +  else if ((imm >> 16) == 0)
> +    {
> +      /* li	reg, 0
> +	 ori	reg, reg, imm[15:0] */
> +      p += GEN_LI (p, reg, 0);
> +      p += GEN_ORI (p, reg, reg, imm);
> +    }
> +  else if ((imm >> 32) == 0)
> +    {
> +      /* lis	reg, imm[31:16]
> +	 ori	reg, reg, imm[15:0]
> +	 rldicl	reg, reg, 0, 32 */
> +      p += GEN_LIS (p, reg, (imm >> 16) & 0xffff);
> +      p += GEN_ORI (p, reg, reg, imm & 0xffff);

Ah, you just took that RLDICL out completely, that's also not correct.

You don't need the RLDICL if and only if the high bit of imm is zero.
The problem is that LIS will fill all the upper 32 bits of the target
register with copies of the high bit of the lower 32 bits.

(You could in theory also use the LIS/ORI pair to load up negative
values where all the high bits are *supposed* to be set, just like
you did for the LI case.)

> +/* Implement install_fast_tracepoint_jump_pad of target_ops.
> +   See target.h for details.  */
> +
> +static int
> +ppc_install_fast_tracepoint_jump_pad (CORE_ADDR tpoint, CORE_ADDR tpaddr,
> +				      CORE_ADDR collector,
> +				      CORE_ADDR lockaddr,
> +				      ULONGEST orig_size,
> +				      CORE_ADDR *jump_entry,
> +				      CORE_ADDR *trampoline,
> +				      ULONGEST *trampoline_size,
> +				      unsigned char *jjump_pad_insn,
> +				      ULONGEST *jjump_pad_insn_size,
> +				      CORE_ADDR *adjusted_insn_addr,
> +				      CORE_ADDR *adjusted_insn_addr_end,
> +				      char *err)
> +{
> +  uint32_t buf[256];
> +  uint32_t *p = buf;
> +  int j, offset;
> +  CORE_ADDR buildaddr = *jump_entry;
> +  const CORE_ADDR entryaddr = *jump_entry;
> +#if __PPC64__

The more usual check is
#ifdef __powerpc64__

(here and elsewhere)

> +  const int rsz = 8;
> +#else
> +  const int rsz = 4;
> +#endif
> +  /* Minimum frame size is 32 bytes for ELFv2, and 112 bytes for ELFv1.  */
> +  const int min_frame = 112;

Since this is known at compile time anyway, you might as well do
#ifdef __powerpc64__
#if _CALL_ELF == 2
 const int min_frame = 32;
#else
 const int min_frame = 112;
#endif
#else
 const int min_frame = 8;
#endif

> +  /* Adjust stack pointer.  */
> +  p += GEN_STDU (p, 1, 1, -frame_size);		/* stdu   r1,-frame_size(r1) */

OK, this fixes the ABI violation, but now it's valid only for 64-bit.
For 32-bit, you need to use STWU instead.

> +  p += gen_atomic_xchg (p, lockaddr, 0, 1);
> +  /* Call to collector.  */
> +  p += gen_call (p, collector);
> +  p += gen_atomic_xchg (p, lockaddr, 1, 0);

See other email thread about the lockaddr problem.

> +  p += GEN_LOAD (p, 6, 1, min_frame + 35 * rsz);	/* ld	r6, 35(r1) */
> +  p += GEN_MTCR (p, 3);					/* mtcr	  r3 */
> +  p += GEN_MTSPR (p, 4, 1);				/* mtxer  r4 */
> +  p += GEN_MTSPR (p, 5, 8);				/* mtlr   r5 */
> +  p += GEN_MTSPR (p, 6, 9);				/* mtctr  r6 */
> +
> +  /* Restore GPRs.  */
> +  for (j = 2; j < 32; j++)
> +    p += GEN_LOAD (p, j, 1, min_frame + j * rsz);
> +  p += GEN_LOAD (p, 0, 1, min_frame + 0 * rsz);
> +  p += GEN_LOAD (p, 1, 1, min_frame + 1 * rsz);

Ah, this is wrong now: note that above, you first decremented
the stack pointer, and *then* stored the *modified* value to
the stack.  So this final load is now a no-op.

Instead of attempting to restore, I'd simply use
  p += GEN_ADDI (p, 1, 1, frame_size);
*here* to get back to the original stack pointer value.

> +static void
> +ppc64_emit_stack_adjust (int n)
> +{
> +  uint32_t buf[4];
> +  uint32_t *p = buf;
> +
> +  n = n << 3;
> +  if ((n >> 7) != 0)

ADDI supports up to 15-bit unsigned operands,
so this seems overly strict.


> +      if (newrel < (1 << 15) && newrel >= -(1 << 15))
> +	insn = (insn & ~0xfffc) | (newrel & 0xfffc);
> +      else if ((PPC_BO (insn) & 0x14) == 0x4 || (PPC_BO (insn) & 0x14) == 0x10)
> +	{
> +	  newrel -= 4;
> +
> +	  /* Out of range. Cannot relocate instruction.  */
> +	  if (newrel >= (1 << 25) || newrel < -(1 << 25))
> +	    return;
> +
> +	  if ((PPC_BO (insn) & 0x14) == 0x4)
> +	    insn ^= (1 << 24);
> +	  else if ((PPC_BO (insn) & 0x14) == 0x10)
> +	    insn ^= (1 << 22);
> +
> +	  /* Jump over the unconditional branch.  */
> +	  insn = (insn & ~0xfffc) | 0x8;
> +	  write_memory_unsigned_integer (*to, 4, byte_order, insn);
> +	  *to += 4;
> +
> +	  /* Build a unconditional branch and copy LK bit.  */
> +	  insn = (18 << 26) | (0x3fffffc & newrel) | (insn & 0x3);
> +	  write_memory_unsigned_integer (*to, 4, byte_order, insn);
> +	  *to += 4;
> +
> +	  return;
> +	}
> +      else

This should check for (PPC_BO (insn) & 0x14) == 0.
Note that (PPC_BO (insn) & 0x14) == 0x14 is the "branch always"
case, which should be replaced simply by an unconditional branch.

> +	{
> +	  uint32_t bdnz_insn = (16 << 26) | (0x10 << 21) | 12;
> +	  uint32_t bf_insn = (16 << 26) | (0x4 << 21) | 8;
> +
> +	  newrel -= 12;

That should be -= 8, shouldn't it?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com



More information about the Gdb-patches mailing list