This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: RFC: Should x86-64 support arbitrary calling conventions?
On Fri, Mar 24, 2017 at 2:31 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * H. J. Lu:
>
>> I compared time for "make check" in glibc. On Nehalem and Skylake,
>> the time differences are within the noise. On Knights Landing, xsave
>> is about 1% slower.
>
> Thanks for doing this benchmarking.
>
> What's the increase in stack usage?
We need 128 (8 * 16) bytes to save XMM registers, 256 (8 * 32) bytes
to save YMM registers, 512 (8 * 64) bytes to save ZMM registers and
64 (4 * 16) bytes to save BND registers.
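
The arithmetic above can be sketched in C as follows. This is only an
illustration of the figures quoted in this message; the helper names
(spill_bytes, xmm_bytes, and so on) are hypothetical and do not come from
glibc.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to spill `count` registers of `width` bytes each.
   Hypothetical helper, used only to reproduce the figures above. */
size_t spill_bytes(size_t count, size_t width)
{
    assert(count > 0 && width > 0);
    return count * width;
}

/* Register counts and widths as quoted in the message. */
size_t xmm_bytes(void) { return spill_bytes(8, 16); } /* 8 XMM registers, 16 bytes each */
size_t ymm_bytes(void) { return spill_bytes(8, 32); } /* 8 YMM registers, 32 bytes each */
size_t zmm_bytes(void) { return spill_bytes(8, 64); } /* 8 ZMM registers, 64 bytes each */
size_t bnd_bytes(void) { return spill_bytes(4, 16); } /* 4 BND registers, 16 bytes each */
```

Adding bnd_bytes() to ymm_bytes() or zmm_bytes() gives the 320 and 576
byte totals mentioned for Skylake and Skylake server.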
With fxsave, the save area is a fixed 512 bytes, so saving the XMM
registers costs 512 bytes instead of 128 bytes. For xsave, stack usage
varies by processor (explicit register saves vs. xsave):
On Haswell, it is 256 bytes vs. 896 bytes. On Skylake, it is 320 (256 + 64)
bytes vs. 1152 bytes. On Skylake server, it is 576 (512 + 64) bytes vs.
2816 bytes.
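
Because the xsave area size depends on the processor, it has to be
discovered at run time. A minimal sketch, assuming GCC or Clang on x86:
CPUID leaf 0xD, sub-leaf 0 reports in EBX the size of the XSAVE area for
the features currently enabled in XCR0. The function names here are
hypothetical, this is not necessarily how the dynamic linker obtains the
value, and the totals quoted above may include alignment or other saved
state, so the CPUID value need not match them exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Size in bytes of the XSAVE area for the currently enabled features,
   or 0 if CPUID leaf 0xD is unavailable (or this is not an x86 build). */
uint32_t xsave_area_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(0xd, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return ebx;
#else
    return 0;
#endif
}

/* Extra stack used when a whole save area replaces explicit register
   saves, e.g. 512 - 128 for fxsave vs. saving XMM registers only. */
size_t extra_stack_bytes(size_t explicit_bytes, size_t save_area_bytes)
{
    assert(save_area_bytes >= explicit_bytes);
    return save_area_bytes - explicit_bytes;
}
```

For the figures in this message, extra_stack_bytes(128, 512) gives the
fxsave overhead and extra_stack_bytes(576, 2816) gives the Skylake server
xsave overhead.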
>> I don't expect xsave to make any difference for long-running
>> benchmarks. Its impact may only show up in short programs which
>> call external functions a few times with lazy binding.
>>
>> Should we consider it at all?
>
> I think the main benefit is that we don't have to adjust the dynamic
> linker trampoline for each new microarchitecture, and applications can
> safely start using new CPU features once the kernel indicates support.
That is true.
--
H.J.