[PATCH v1] x86/string: Fixup alignment of main loop in str{n}cmp-evex [BZ #32212]

Alexander Monakov amonakov@ispras.ru
Sat Sep 28 06:06:03 GMT 2024


On Fri, 27 Sep 2024, Noah Goldstein wrote:

> The loop should be aligned to 32-bytes so that it can ideally run out
> of the DSB. This is particularly important on Skylake-Server where
> deficiencies in its DSB implementation make it prone to not being
> able to run loops out of the DSB.

The snippet in comment #13 of the bug suggests it's the well-known
Skylake JCC erratum, though we can't be certain: the function was only
16-byte aligned before your patch, so all we can see is a branch
crossing a 16-byte boundary, which may or may not also be a 32-byte
boundary.
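
To illustrate the uncertainty: the erratum concerns jumps that cross or
end on a 32-byte boundary, so with a 16-byte-aligned function the same
branch can be affected or not depending on where the function lands.
A rough sketch follows; the offset 0x0f and the 2-byte length are
made-up values for illustration, not taken from the bug's snippet, and
touches_boundary is a hypothetical helper name.

```python
JCC_WINDOW = 32  # the JCC erratum is defined in terms of 32-byte boundaries


def touches_boundary(start: int, length: int, boundary: int = JCC_WINDOW) -> bool:
    """Return True if an instruction occupying [start, start + length)
    crosses a `boundary`-byte boundary or ends exactly on one."""
    end = start + length  # address one past the instruction's last byte
    return start // boundary != end // boundary


# Hypothetical 2-byte conditional jump at offset 0x0f inside the function.
offset, length = 0x0F, 2

# It crosses a 16-byte boundary regardless of where the function is placed.
print("crosses 16-byte boundary:", touches_boundary(offset, length, 16))

# A 16-byte-aligned function can start at 0 or 16 modulo 32, and the
# answer for the 32-byte boundary differs between the two placements.
for base in (0, 16):
    hit = touches_boundary(base + offset, length)
    print(f"function base = {base} (mod 32): touches 32-byte boundary = {hit}")
```

With 32-byte (or larger) function alignment only the base = 0 case can
occur, so the question could be answered from the offsets alone.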

The disconnect between your paragraph above and the patch where you
align the function to 64 bytes, not 32, is a bit confusing though.
If you're over-aligning the function to reduce alignment padding
in front of loops, can you mention that?

Alexander

