This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH RFC] Imporve 64bit memset performance for Haswell CPU with AVX2 instruction
- From: Ling Ma <ling dot ma dot program at gmail dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: "H.J. Lu" <hjl dot tools at gmail dot com>, GNU C Library <libc-alpha at sourceware dot org>, Richard Henderson <rth at twiddle dot net>, Andreas Jaeger <aj at suse dot com>, Liubov Dmitrieva <liubov dot dmitrieva at gmail dot com>, Ling Ma <ling dot ml at alibaba-inc dot com>
- Date: Tue, 10 Jun 2014 21:52:31 +0800
- Subject: Re: [PATCH RFC] Imporve 64bit memset performance for Haswell CPU with AVX2 instruction
- Authentication-results: sourceware.org; auth=none
- References: <1396850238-29041-1-git-send-email-ling dot ma at alipay dot com> <20140513173616 dot GC5047 at domone dot podge> <20140515201458 dot GA24885 at domone dot podge> <CAOGi=dNmn2bPfB65VoXUGjQ7t6RLVJ2hj2QDarrUjZV75kTbDA at mail dot gmail dot com> <20140530113041 dot GB26528 at domone dot podge> <CAOGi=dPdWegEo1s8=wG4WzOANaQ3x=boLFitQ_wBp+Xf+hxexQ at mail dot gmail dot com> <CAMe9rOqv5RYK1MO2M098n3o50-KmmZJuvsvMmXqkBBt0g3OY_g at mail dot gmail dot com> <CAOGi=dMNyzckY8s3uF0qRpKuqUwHHhzQeyy-j29ydLNn_s9Bog at mail dot gmail dot com> <20140605163224 dot GA8041 at domone dot podge> <CAOGi=dN-kC5tZ3ZMhjijGqK+3ePVMrOsT1M2EOhOnmhWMW7kpg at mail dot gmail dot com>
In this patch, sent as a gzipped attachment, we take advantage of HSW
(Haswell) memory bandwidth, reduce branch mispredictions by avoiding
branch instructions, and force the destination to be aligned for AVX
and AVX2 instructions.
The CPU2006 403.gcc benchmark indicates that this patch improves
performance by between 26% and 59%.
This version incorporates Ondra's comments and avoids placing branch
instructions so that they cross a 16-byte-aligned code boundary.
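
As a rough illustration of the branch-avoidance idea mentioned above
(this is not the code in the attached patch, which is AVX2 assembly),
a small size class can be filled with two stores that are allowed to
overlap, so no branch on the exact length is needed inside that class.
The helper name set_8_to_16 and the 8..16 byte size class are
assumptions made for this sketch only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: fill n bytes at dst
   with the byte value c, for any n in the range 8..16 inclusive.
   Two 8-byte stores are issued unconditionally; when n < 16 they
   overlap, which is harmless because both write the same pattern. */
static void set_8_to_16(unsigned char *dst, int c, size_t n)
{
    /* Broadcast the low byte of c into all 8 bytes of a 64-bit word. */
    uint64_t v = 0x0101010101010101ULL * (unsigned char)c;

    /* memcpy with a constant size is the portable way to express an
       unaligned 8-byte store; compilers typically emit a single mov. */
    memcpy(dst, &v, 8);          /* covers bytes [0, 8)     */
    memcpy(dst + n - 8, &v, 8);  /* covers bytes [n - 8, n) */
}
```

The same pattern scales to wider stores (16 or 32 bytes with SSE2 or
AVX2), where one store is anchored at the start of the buffer and one
at the end, and only the size class itself has to be selected.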
Thanks
Ling
Attachment:
memset-avx2.patch.tar.gz
Description: GNU Zip compressed data