This is the mail archive of the libc-alpha at sourceware dot org
mailing list for the glibc project.
Re: New optimized string routines for Intel and alignment of stack.
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Tue, 7 Jun 2016 08:51:30 -0300
- Subject: Re: New optimized string routines for Intel and alignment of stack.
- References: <57566200 dot 2040203 at redhat dot com> <dea8c68f-cc02-9427-4e54-acd795a930cf at redhat dot com>
On 07/06/2016 06:52, Florian Weimer wrote:
> On 06/07/2016 07:56 AM, Carlos O'Donell wrote:
>> We have had several users that have built legacy applications
>> for 32-bit x86 with stack alignment that does not match the
> Let's say the GNU project broke the i386 ABI, which is more accurate. The stack pointer alignment requirement is a recent change.
>> In all of these cases it has to do with the application
>> having been compiled with -falign-stack=assume-4-byte which
>> violates the ABI, usually with icc. However, if you're careful
>> it all just works.
> It will get worse with increased vectorization and GCC 6. We already saw this on x86_64 with the non-compliant malloc in tcsh, where GCC 6 used vector instructions to copy a struct dirstream object. I assume this could easily happen with any stack-to-stack copy with SSE2 enabled.
> Currently, GCC does not seem to exploit the fact that it knows the alignment of stack objects. I played with this:
> struct fields
> {
>   double a, b;
> };
>
> struct fields get (void);
> void put (struct fields *, struct fields *);
>
> void
> copy (void)
> {
>   struct fields f1 = get ();
>   struct fields f2 = f1;
>   put (&f1, &f2);
> }
> And: gcc -m32 -O3 -msse2 -march=westmere -mtune=westmere -o- -S stack-align.c
> I expected to see an SSE load/store for the copy, but that's not what I got.
> I think we need to decide if we want to roll back the ABI change before GCC learns about this optimization because eventually, it will not just be a matter of string routines. Any glibc code optimized for 32-bit x86 CPUs with SSE2 enabled could be affected.
Besides string routines, do we have any C code that relies on stack alignment
on 32-bit x86?
Also, is there any performance issue with the current unaligned versions, or are you
just worried that we might remove them in the future in favor of a better-optimized
version that assumes an aligned stack?
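
To make concrete what "relying on stack alignment" would mean for C code, here is a
minimal sketch. It is not taken from glibc; the names is_aligned, copy_fields and
stack_object_is_16_aligned are hypothetical, and struct fields only mirrors the
quoted test case. The copy goes through memcpy, so it makes no alignment assumption
beyond that of double, while the last function reports whether a stack object that
asks for 16-byte alignment actually received it. My assumption is that a caller
built for a 4-byte-aligned stack could make that check fail on 32-bit x86 unless
the compiler realigns the stack in the callee.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same 16-byte layout as the struct in the quoted test case.  */
struct fields
{
  double a, b;
};

/* Return nonzero if P is a multiple of ALIGN.  ALIGN must be a
   nonzero power of two.  (Hypothetical helper.)  */
int
is_aligned (const void *p, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return ((uintptr_t) p & (align - 1)) == 0;
}

/* Copy *SRC to *DST through memcpy, which assumes nothing about
   16-byte alignment; the compiler is free to use unaligned moves.
   (Hypothetical helper.)  */
void
copy_fields (struct fields *dst, const struct fields *src)
{
  memcpy (dst, src, sizeof *dst);
}

/* Request a 16-byte-aligned stack object and report whether its
   address really is 16-byte aligned.  Code that used aligned vector
   moves on F would depend on this being true.  (Hypothetical helper.)  */
int
stack_object_is_16_aligned (void)
{
  alignas (16) struct fields f = { 1.0, 2.0 };
  return is_aligned (&f, 16);
}
```

Calling stack_object_is_16_aligned () from a function entered with a correctly
aligned stack should return nonzero; the interesting case is a caller compiled with
something like icc's -falign-stack=assume-4-byte.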