Tighten ARM's CANNOT_CHANGE_MODE_CLASS

Richard Sandiford richard.sandiford@linaro.org
Thu Mar 24 15:41:00 GMT 2011


We currently generate very poor code for tests like:

#include <arm_neon.h>

void
foo (uint32_t *a, uint32_t *b, uint32_t *c)
{
  uint32x4x3_t x, y;

  x = vld3q_u32 (a);
  y = vld3q_u32 (b);
  x.val[0] = vaddq_u32 (x.val[0], y.val[0]);
  x.val[1] = vaddq_u32 (x.val[1], y.val[1]);
  x.val[2] = vaddq_u32 (x.val[2], y.val[2]);
  vst3q_u32 (a, x);
}

This is because we force the uint32x4x3_t values to the stack and
then load and store the individual vectors.

What we actually want is for the uint32x4x3_t values to be stored
in registers, and for the individual vectors to be accessed as
subregs of those registers.  The first part involves some middle-end
mode changes (see recent gcc@ thread), while the second part requires
a change to ARM's CANNOT_CHANGE_MODE_CLASS.

CANNOT_CHANGE_MODE_CLASS is defined as:

/* FPA registers can't do subreg as all values are reformatted to internal
   precision.  VFP registers may only be accessed in the mode they
   were set.  */
#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)		\
   ? reg_classes_intersect_p (FPA_REGS, (CLASS))	\
     || reg_classes_intersect_p (VFP_REGS, (CLASS))	\
   : 0)

But this VFP restriction appears to apply only to VFPv1; thanks to
Peter Maydell for the archaeology.

Tested on arm-linux-gnueabi.  OK to install?

This doesn't have any direct benefit without the middle-end mode change,
but it needs to go in first in order for that change not to regress.

Richard


gcc/
	* config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Restrict FPA_REGS
	case to VFPv1.

Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	2011-03-24 13:47:14.000000000 +0000
+++ gcc/config/arm/arm.h	2011-03-24 15:26:19.000000000 +0000
@@ -1167,12 +1167,14 @@ #define IRA_COVER_CLASSES						     \
 }
 
 /* FPA registers can't do subreg as all values are reformatted to internal
-   precision.  VFP registers may only be accessed in the mode they
+   precision.  VFPv1 registers may only be accessed in the mode they
    were set.  */
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
-  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)		\
-   ? reg_classes_intersect_p (FPA_REGS, (CLASS))	\
-     || reg_classes_intersect_p (VFP_REGS, (CLASS))	\
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)		\
+  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)			\
+   ? (reg_classes_intersect_p (FPA_REGS, (CLASS))		\
+      || (TARGET_VFP						\
+	  && arm_fpu_desc->rev == 1				\
+	  && reg_classes_intersect_p (VFP_REGS, (CLASS))))	\
    : 0)
 
 /* The class value for index registers, and the one for base regs.  */



More information about the Gcc-patches mailing list