[PATCH] fix altivec/vperm sim for source/destination overlaps

Olivier Hainque hainque@adacore.com
Mon May 5 18:59:00 GMT 2008


Hello,

The simulation of the altivec vperm instruction overlooks the
possibility that the destination vector is the same as one of the
source vectors. When this happens it can clobber input bytes before
they are read, depending on the permutation requested.

This causes the testcase below to abort reliably ...

    #include <altivec.h>
    #include <stdlib.h>

    const vector int v0 = ((const vector int){0, 0, 0, 0});
    const vector int v1 = ((const vector int){1, 1, 1, 1});

    const vector unsigned char perm =
	 ((const vector unsigned char)
	  {0,  1,  2,  3,  4,  5,  6,  7,
	   16, 17, 18, 19, 20, 21, 22, 23 });

    int main ()
    {
      vector int lv;
      vector int ra, rb;

      ra = vec_perm (v0, v1, perm);

      lv = v1;
      rb = vec_perm (v0, lv, perm);

      if (vec_any_ne (ra, rb))
	abort ();

      return 0;
    }

... when compiled with a recent mainline powerpc-elf GCC.
powerpc-elf-gcc -S vperm.c -maltivec -mabi=altivec produces ...

   main: 
   ...
   vperm 0,13,1,0
   ...
   vperm 0,13,0,1
 
and the second operation misbehaves: its destination register (0) is
also its second source operand (vB).

The attached patch is a suggestion to address this by simply latching
the input vectors into temporaries to make sure the input bits are
preserved in all cases.
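
For illustration only, here is a minimal host-side sketch of the overlap
problem and of the latching idea. It is not the simulator code: the
vec16 type and the function names are hypothetical stand-ins, and bytes
are indexed directly where the real code goes through AV_BINDEX. The
permutation is the one from the testcase above, and the destination is
made to alias the second source, as in "vperm 0,13,0,1".

```c
#include <stdint.h>

/* Hypothetical stand-in for a 16-byte vector register.  */
typedef struct { uint8_t b[16]; } vec16;

typedef void (*vperm_fn) (vec16 *, const vec16 *, const vec16 *,
                          const vec16 *);

/* Reads the sources while the destination is being written, so an
   aliased source can be overwritten before it is read.  */
static void
vperm_naive (vec16 *vS, const vec16 *vA, const vec16 *vB, const vec16 *vC)
{
  for (int i = 0; i < 16; i++)
    {
      int who = vC->b[i] & 0x1f;
      vS->b[i] = (who & 0x10) ? vB->b[who & 0xf] : vA->b[who & 0xf];
    }
}

/* Copies both sources into temporaries first, so writes to the
   destination cannot disturb the bytes still to be read.  */
static void
vperm_latched (vec16 *vS, const vec16 *vA, const vec16 *vB, const vec16 *vC)
{
  vec16 a = *vA, b = *vB;

  for (int i = 0; i < 16; i++)
    {
      int who = vC->b[i] & 0x1f;
      vS->b[i] = (who & 0x10) ? b.b[who & 0xf] : a.b[who & 0xf];
    }
}

/* Runs OP with the destination aliasing vB and returns the number of
   result bytes that differ from the expected permutation.  */
static int
overlap_mismatches (vperm_fn op)
{
  vec16 vA, vB, vC;
  int bad = 0;

  for (int i = 0; i < 16; i++)
    {
      vA.b[i] = 0xA0 + i;
      vB.b[i] = 0xB0 + i;
      /* Same selector as the testcase: 0..7, then 16..23.  */
      vC.b[i] = (i < 8) ? i : 16 + (i - 8);
    }

  op (&vB, &vA, &vB, &vC);

  for (int i = 0; i < 16; i++)
    {
      uint8_t want = (i < 8) ? 0xA0 + i : 0xB0 + (i - 8);
      if (vB.b[i] != want)
        bad++;
    }
  return bad;
}
```

With these inputs, overlap_mismatches (vperm_naive) reports 8 wrong
bytes (the upper half picks up bytes already overwritten with vA data),
while overlap_mismatches (vperm_latched) reports 0.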

Tested with a cross GDB 6.8 (powerpc-elf target) hosted on
sparc-solaris by exercising a proprietary altivec-based FFT
computation, which failed before the change and works fine after it.

I also checked that I could rebuild such a cross from cvs mainline on
x86_64-suse-linux, and that make -k check yields identical results
before and after the change with a base cross gcc mainline (no libc,
2306 expected passes).

Thanks in advance,

Olivier

--

2008-05-05  Olivier Hainque  <hainque@adacore.com>

	* ppc/altivec.igen (vperm): Latch inputs into temporaries.







-------------- next part --------------
*** ./sim/ppc/altivec.igen.ori	Fri May  2 14:14:43 2008
--- ./sim/ppc/altivec.igen	Fri May  2 14:56:53 2008
*************** unsigned32::model-function::altivec_unsi
*** 1634,1645 ****
  
  0.4,6.VS,11.VA,16.VB,21.VC,26.43:VX:av:vperm %VD, %VA, %VB, %VC:Vector Permute
  	int i, who;
  	for (i = 0; i < 16; i++) {
  	  who = (*vC).b[AV_BINDEX(i)] & 0x1f;
  	  if (who & 0x10)
! 	    (*vS).b[AV_BINDEX(i)] = (*vB).b[AV_BINDEX(who & 0xf)];
  	  else
! 	    (*vS).b[AV_BINDEX(i)] = (*vA).b[AV_BINDEX(who & 0xf)];
  	}
  	PPC_INSN_VR(VS_BITMASK, VA_BITMASK | VB_BITMASK | VC_BITMASK);
  
--- 1634,1650 ----
  
  0.4,6.VS,11.VA,16.VB,21.VC,26.43:VX:av:vperm %VD, %VA, %VB, %VC:Vector Permute
  	int i, who;
+ 	/* The permutation vector might have us read into the source vectors
+ 	   back at positions before the iteration index, so we must latch the
+ 	   sources to prevent early-clobbering in case the destination vector
+ 	   is the same as one of them.  */
+ 	vreg myvA = (*vA), myvB = (*vB);
  	for (i = 0; i < 16; i++) {
  	  who = (*vC).b[AV_BINDEX(i)] & 0x1f;
  	  if (who & 0x10)
! 	    (*vS).b[AV_BINDEX(i)] = myvB.b[AV_BINDEX(who & 0xf)];
  	  else
! 	    (*vS).b[AV_BINDEX(i)] = myvA.b[AV_BINDEX(who & 0xf)];
  	}
  	PPC_INSN_VR(VS_BITMASK, VA_BITMASK | VB_BITMASK | VC_BITMASK);
  

