Inefficient use of 64-bit addresses in Clang

Agner Fog agner@agner.org
Tue Aug 6 12:30:00 GMT 2019


Clang uses 64-bit absolute addresses when accessing static data in 
64-bit mode. This is inefficient because every access to static data 
first requires an extra 10-byte instruction to load the address into 
a register. All other compilers use relative addresses.

Example:

> #include <immintrin.h>
>
> __m128d test (__m128d a) {
>     __m128d b = _mm_add_pd(a, _mm_set1_pd(1.5));
>     __m128d c = _mm_mul_pd(b, _mm_set1_pd(2.5));
>     return c;
> }

Assembly output:

> .LCPI0_0:
>     .quad    4609434218613702656     # double 1.5
>     .quad    4609434218613702656     # double 1.5
> .LCPI0_1:
>     .quad    4612811918334230528     # double 2.5
>     .quad    4612811918334230528     # double 2.5
>     .text
>     .globl    _Z4testDv2_d
>     .p2align    4, 0x90
> _Z4testDv2_d:                           # @_Z4testDv2_d
> # BB#0:
>     vmovapd    (%rcx), %xmm0
>     movabsq    $.LCPI0_0, %rax
>     vaddpd    (%rax), %xmm0, %xmm0
>     movabsq    $.LCPI0_1, %rax
>     vmulpd    (%rax), %xmm0, %xmm0
>     retq

Clang on Linux uses 32-bit RIP-relative addresses instead:

>     vaddpd    .LCPI0_0(%rip), %xmm0, %xmm0
>     vmulpd    .LCPI0_1(%rip), %xmm0, %xmm0
>     retq

