[patch] Use unaligned access on x86_64

Cary Coutant ccoutant@gmail.com
Mon Jun 1 23:48:00 GMT 2015


> x86_64 has exquisite support for unaligned loads. It is a shame not to use it.
>
> The attached patch avoids aligning archive members on x86_64. The
> results when linking clang are very interesting:
>
> * massif reports that the malloc memory usage goes from 331,295,192
> bytes to just 133,415,136 bytes.
>
> * the linking time (30 runs average) goes from
>
> 1.310065610 seconds time elapsed ( +-  0.19% )
>
> to
>
> 1.162564763 seconds time elapsed ( +-  0.14% )

Hmmm, I guess x86 has gotten a lot better with this.

I'd rather have a configure flag that tells us whether the host
platform can do unaligned access without (much) penalty. I did a quick
search but didn't come up with anything provided by autoconf. Maybe
add a configure option like --enable-fast-unaligned-access? Other
suggestions? Write a micro-benchmark for configure to run on the fly?

(I'm kind of surprised that I couldn't find an autoconf macro for this
-- I'd think that the ability to use unaligned loads/stores is
something that lots of programs would want to test for at configure
time.)

On the other hand, the archive format should generally keep things on
4-byte boundaries -- the magic string is 8 bytes, archive headers are
60 bytes, and ELF file members will generally be a multiple of 4 or 8
bytes in length. The symbol map should be a multiple of 4, but I'll
bet it's the long file name table that's throwing everything out of
alignment. If we could just fix that, we could probably improve
archive performance on many platforms where unaligned loads are not
fast. Of course, for 64-bit targets, we're going to insist on 8-byte
alignment, so to avoid the malloc-and-copy, we'd have to arrange for
archive members to be 8-byte aligned.
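
To make the arithmetic above concrete, here is a rough sketch (not gold code) that walks a regular archive layout and reports where each member's data would start. It assumes the usual ar layout: 8-byte magic, 60-byte member headers, and members padded to an even offset. The member sizes are made up for illustration, and the function names are hypothetical.

```python
AR_MAGIC_SIZE = 8    # "!<arch>\n"
AR_HEADER_SIZE = 60  # fixed-size header in front of every member


def member_data_offsets(member_sizes):
    """Return the file offset at which each member's data begins."""
    offsets = []
    pos = AR_MAGIC_SIZE
    for size in member_sizes:
        pos += AR_HEADER_SIZE  # the header precedes the member data
        offsets.append(pos)
        pos += size
        pos += pos % 2         # ar pads each member to an even offset
    return offsets


def misaligned(offsets, alignment):
    """Return the offsets that are not a multiple of `alignment`."""
    return [off for off in offsets if off % alignment != 0]


# Hypothetical sizes: symbol map, long file name table, two ELF objects.
sizes = [400, 34, 1024, 2048]
offsets = member_data_offsets(sizes)
print("data offsets:      ", offsets)
print("not 4-byte aligned:", misaligned(offsets, 4))
print("not 8-byte aligned:", misaligned(offsets, 8))
```

With a 34-byte name table, every member after it lands off a 4-byte boundary; padding that table to 36 bytes would restore 4-byte alignment for the rest. The first member's data starts at offset 68 (8 + 60), which is not a multiple of 8, so 8-byte alignment would need extra padding beyond what the format provides on its own.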

Also, have you tried thin archives?

-cary


