PATCH: Split AVX 32-byte unaligned load/store
H.J. Lu
hjl.tools@gmail.com
Mon Mar 28 04:00:00 GMT 2011
On Sun, Mar 27, 2011 at 11:57 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Mar 27, 2011 at 10:53 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Sun, Mar 27, 2011 at 3:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>>> Here is a patch to split AVX 32-byte unaligned load/store:
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00743.html
>>>
>>> It speeds up some SPEC CPU 2006 benchmarks by up to 6%.
>>> OK for trunk?
>>
>>> 2011-02-11 H.J. Lu <hongjiu.lu@intel.com>
>>>
>>> * config/i386/i386.c (flag_opts): Add -mavx256-split-unaligned-load
>>> and -mavx256-split-unaligned-store.
>>> (ix86_option_override_internal): Split 32-byte AVX unaligned
>>> load/store by default.
>>> (ix86_avx256_split_vector_move_misalign): New.
>>> (ix86_expand_vector_move_misalign): Use it.
>>>
>>> * config/i386/i386.opt: Add -mavx256-split-unaligned-load and
>>> -mavx256-split-unaligned-store.
>>>
>>> * config/i386/sse.md (*avx_mov<mode>_internal): Verify unaligned
>>> 256bit load/store. Generate unaligned store on misaligned memory
>>> operand.
>>> (*avx_movu<ssemodesuffix><avxmodesuffix>): Verify unaligned
>>> 256bit load/store.
>>> (*avx_movdqu<avxmodesuffix>): Likewise.
>>>
>>> * doc/invoke.texi: Document -mavx256-split-unaligned-load and
>>> -mavx256-split-unaligned-store.
>>>
>>> gcc/testsuite/
>>>
>>> 2011-02-11 H.J. Lu <hongjiu.lu@intel.com>
>>>
>>> * gcc.target/i386/avx256-unaligned-load-1.c: New.
>>> * gcc.target/i386/avx256-unaligned-load-2.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-load-3.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-load-4.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-load-5.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-load-6.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-load-7.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-store-2.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-store-5.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-store-6.c: Likewise.
>>> * gcc.target/i386/avx256-unaligned-store-7.c: Likewise.
>>>
>>
>>
>>
>>> @@ -203,19 +203,37 @@
>>> return standard_sse_constant_opcode (insn, operands[1]);
>>> case 1:
>>> case 2:
>>> + if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
>>> + && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
>>> + && MEM_P (operands[0])
>>> + && MEM_ALIGN (operands[0]) < 256)
>>> + || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
>>> + && MEM_P (operands[1])
>>> + && MEM_ALIGN (operands[1]) < 256)))
>>> + gcc_unreachable ();
>>
>> Please use "misaligned_operand (operands[...], <MODE>mode)" instead of
>> MEM_P && MEM_ALIGN combo in a couple of places.
>>
>> OK with that change.
>>
>
> This is the patch I checked in.
>
I checked in this patch to revert the assertions on unaligned 256-bit
load/store, since such loads/stores may be generated by intrinsics:
http://gcc.gnu.org/ml/gcc-regression/2011-03/msg00477.html
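
For reference, not part of the patch: a minimal sketch of what the
-mavx256-split-unaligned-load/-store transformation does to a 32-byte
unaligned copy, expressed with intrinsics. The function names
(copy_whole, copy_split) are made up for illustration; the patch itself
performs this split during RTL expansion, not at the source level.

```c
#include <immintrin.h>

/* One 256-bit unaligned load + store: a single vmovups %ymm pair.
   This is the form intrinsics such as _mm256_loadu_ps can still
   generate, which is why the gcc_unreachable () asserts were wrong.  */
__attribute__((target("avx")))
void copy_whole (float *dst, const float *src)
{
  _mm256_storeu_ps (dst, _mm256_loadu_ps (src));
}

/* The split form the option makes the compiler emit: two 128-bit
   unaligned moves, which are cheaper than one 256-bit unaligned move
   on early AVX hardware when the address is misaligned.  */
void copy_split (float *dst, const float *src)
{
  _mm_storeu_ps (dst,     _mm_loadu_ps (src));      /* low 16 bytes  */
  _mm_storeu_ps (dst + 4, _mm_loadu_ps (src + 4));  /* high 16 bytes */
}
```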
--
H.J.
---
Index: ChangeLog
===================================================================
--- ChangeLog (revision 171589)
+++ ChangeLog (working copy)
@@ -1,3 +1,10 @@
+2011-03-27 H.J. Lu <hongjiu.lu@intel.com>
+
+ * config/i386/sse.md (*avx_mov<mode>_internal): Don't assert
+ unaligned 256bit load/store.
+ (*avx_movu<ssemodesuffix><avxmodesuffix>): Likewise.
+ (*avx_movdqu<avxmodesuffix>): Likewise.
+
2011-03-27 Vladimir Makarov <vmakarov@redhat.com>
PR bootstrap/48307
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md (revision 171589)
+++ config/i386/sse.md (working copy)
@@ -203,12 +203,6 @@
return standard_sse_constant_opcode (insn, operands[1]);
case 1:
case 2:
- if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
- && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
- && misaligned_operand (operands[0], <MODE>mode))
- || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
- && misaligned_operand (operands[1], <MODE>mode))))
- gcc_unreachable ();
switch (get_attr_mode (insn))
{
case MODE_V8SF:
@@ -416,15 +410,7 @@
UNSPEC_MOVU))]
"AVX_VEC_FLOAT_MODE_P (<MODE>mode)
&& !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-{
- if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
- && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
- && misaligned_operand (operands[0], <MODE>mode))
- || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
- && misaligned_operand (operands[1], <MODE>mode))))
- gcc_unreachable ();
- return "vmovu<ssemodesuffix>\t{%1, %0|%0, %1}";
-}
+ "vmovu<ssemodesuffix>\t{%1, %0|%0, %1}"
[(set_attr "type" "ssemov")
(set_attr "movu" "1")
(set_attr "prefix" "vex")
@@ -483,15 +469,7 @@
[(match_operand:AVXMODEQI 1 "nonimmediate_operand" "xm,x")]
UNSPEC_MOVU))]
"TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-{
- if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
- && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
- && misaligned_operand (operands[0], <MODE>mode))
- || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
- && misaligned_operand (operands[1], <MODE>mode))))
- gcc_unreachable ();
- return "vmovdqu\t{%1, %0|%0, %1}";
-}
+ "vmovdqu\t{%1, %0|%0, %1}"
[(set_attr "type" "ssemov")
(set_attr "movu" "1")
(set_attr "prefix" "vex")