[PATCH v6] Support ymm rounding control for Intel AVX10.2

Jan Beulich jbeulich@suse.com
Thu Aug 29 14:51:12 GMT 2024


On 28.08.2024 07:34, Haochen Jiang wrote:
> Nit: I will rebase onto the following patch from Jan once it has been
> committed to trunk:
> 
> https://sourceware.org/pipermail/binutils/2024-August/136506.html

I mean to get to that tomorrow; thanks for your patience.

> ---
> 
> Changes in v6:
> 
>   - Remove all the AVX10_2 CPUID addition in table.
>   - Change the expression for U bit since it is bool now.
>   - Fix the indentation in comment part.
> 
> ---
> 
> In this patch, in order to support ymm rounding for AVX10.2, we derive the
> evex attribute for all cases, instead of only for rc_none, to encode the U bit.
> We also changed some bad_opcode returns, since the U bit is shared with APX_F.
> 
> gas/ChangeLog:
> 
> 	* config/tc-i386.c
> 	(cpu_flags_match): Handle AVX10_2.
> 	(build_evex_prefix): Handle U bit. Derive evex attribute
> 	for all cases.
> 	(check_VecOperands): Handle AVX10.2 and ymm roundings.
> 	* doc/c-i386.texi: Document .avx10.2.
> 	* testsuite/gas/i386/i386.exp: Run AVX10.2 tests.
> 	* testsuite/gas/i386/x86-64.exp: Ditto.
> 	* testsuite/gas/i386/avx10_2-rounding-intel.d: New test.
> 	* testsuite/gas/i386/avx10_2-rounding-inval.l: Ditto.
> 	* testsuite/gas/i386/avx10_2-rounding-inval.s: Ditto.
> 	* testsuite/gas/i386/avx10_2-rounding.d: Ditto.
> 	* testsuite/gas/i386/avx10_2-rounding.s: Ditto.
> 	* testsuite/gas/i386/x86-64-avx10_2-rounding-intel.d: Ditto.
> 	* testsuite/gas/i386/x86-64-avx10_2-rounding.d: Ditto.
> 	* testsuite/gas/i386/x86-64-avx10_2-rounding.s: Ditto.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis.c (struct instr_info): Add U bit.
> 	(get_valid_dis386): Handle U bit.
> 	* i386-gen.c (isa_dependencies): Add AVX10.2.
> 	(cpu_flags): Ditto.
> 	* i386-init.h: Regenerated.
> 	* i386-opc.h (CpuAVX10_2): New.
> 	(i386_cpu_flags): Add cpuavx10_2.
> 	* i386-opc.tbl: Add rounding to old entries which did not
> 	permit rounding previously.
> 	* i386-tbl.h: Regenerated.

Okay, with one minor tidying request:

> @@ -2936,10 +2936,10 @@ vcvtpd2uqq, 0x6679, AVX512DQ, Modrm|Masking|Space0F|VexW1|Broadcast|Disp8ShiftVL
>  
>  vcvtps2qq, 0x667B, AVX512DQ, Modrm|EVex512|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=5|NoSuf|StaticRounding|SAE, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM }
>  vcvtps2qq, 0x667B, AVX512DQ&AVX512VL, Modrm|EVex128|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
> -vcvtps2qq, 0x667B, AVX512DQ&AVX512VL, Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
> +vcvtps2qq, 0x667B, AVX512DQ&AVX512VL, Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
>  vcvtps2uqq, 0x6679, AVX512DQ, Modrm|EVex512|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=5|NoSuf|StaticRounding|SAE, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM }
>  vcvtps2uqq, 0x6679, AVX512DQ&AVX512VL, Modrm|EVex128|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
> -vcvtps2uqq, 0x6679, AVX512DQ&AVX512VL, Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf, { RegXMM|RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
> +vcvtps2uqq, 0x6679, AVX512DQ&AVX512VL, Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|RegXMM|Dword|Unspecified|BaseIndex, RegYMM }

As you're touching this line, may I ask that you also drop the double RegXMM?
This is one of the things which likely could have been avoided if we used
templatization yet more extensively.
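
For illustration only, a duplicate like the one above could also be caught
mechanically. The following is a minimal sketch in Python, not part of
binutils: the helper names duplicate_flags and operands_of are hypothetical,
and it assumes a fully expanded table line whose operands sit in a single
trailing "{ ... }" group, separated by commas, with flags joined by "|".
Templated or macro-based entries in the real i386-opc.tbl would need
expanding first.

```python
def duplicate_flags(operand):
    """Return the flags that occur more than once in a '|'-separated operand."""
    seen = set()
    dups = []
    for flag in operand.split("|"):
        flag = flag.strip()
        if flag in seen and flag not in dups:
            dups.append(flag)
        seen.add(flag)
    return dups


def operands_of(entry):
    """Extract the operands from the trailing '{ ... }' group of one entry.

    Assumption: the entry is a single, already expanded table line.
    """
    start = entry.rindex("{")
    end = entry.rindex("}")
    return [op.strip() for op in entry[start + 1:end].split(",")]


# The line from the hunk above, without the leading diff marker.
entry = (
    "vcvtps2uqq, 0x6679, AVX512DQ&AVX512VL, "
    "Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf"
    "|StaticRounding|SAE, "
    "{ RegXMM|RegXMM|Dword|Unspecified|BaseIndex, RegYMM }"
)

for operand in operands_of(entry):
    dups = duplicate_flags(operand)
    if dups:
        print("duplicate flag(s) in operand:", ", ".join(dups))
```

A check of this kind only reports repeated flags within one operand; it says
nothing about whether the remaining flags are the right ones.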

Jan

