This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Rename __memcmp_sse4_2 to __memcmp_sse4_1.


It is probably worth renaming it to memcpy_sse4_1_unaligned, because
its use of unaligned loads is more important than its use of ptest instructions.

--
Liubov

On Thu, Jul 11, 2013 at 6:07 PM, Liubov Dmitrieva
<liubov.dmitrieva@gmail.com> wrote:
> The latest edition of my Silvermont patch doesn't touch memcmp and
> wmemcmp at all, because I didn't see a good boost from switching SSE4.2
> off for these two functions.
> Now I see why: there are no SSE4.2 instructions there. :)
> The patch looks good. I will just check for performance regressions on Penryn.
>
> --
> Liubov
>
> On Wed, Jul 10, 2013 at 10:23 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, Jul 10, 2013 at 11:19 AM, Matt Turner <mattst88@gmail.com> wrote:
>>> On Wed, Jul 10, 2013 at 11:16 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Wed, Jul 10, 2013 at 10:41 AM, Matt Turner <mattst88@gmail.com> wrote:
>>>>> On Wed, Jul 10, 2013 at 8:30 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>> On Tue, Jul 9, 2013 at 9:37 PM, Andreas Jaeger <aj@suse.com> wrote:
>>>>>>> On 07/10/2013 03:17 AM, Matt Turner wrote:
>>>>>>>> It uses SSE 4.1 instructions (ptest) but no SSE 4.2 instructions.
>>>>>>>
>>>>>>> There are two parts to this: it should only run on CPUs with those
>>>>>>> instructions, but we also need to ensure that it gives better
>>>>>>> performance on such CPUs. HJ, Matt, please run performance tests on a
>>>>>>> variety of affected CPUs to show that this change really helps in all cases.
>>>>>>>
>>>>>>> Andreas
>>>>>>
>>>>>> Only Penryn has SSE4.1 without SSE4.2.  Liubov, can
>>>>>> you compare performance of memcmp-sse4.S vs
>>>>>> memcmp-ssse3.S on Penryn?
>>>>>
>>>>> Is it also the case that this path would now be used on Silvermont?
>>>>
>>>> It is used on Silvermont, since Silvermont supports SSE4.2.
>>>>
>>>> --
>>>> H.J.
>>>
>>> To confirm, setting bit_Slow_SSE4_2 on Silvermont (which we do)
>>> wouldn't prevent this path from executing?
>>
>> I don't think so.  Liubov, can you verify it?
>>
>> --
>> H.J.
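
The dispatch question discussed in this thread can be illustrated with a
small C sketch. It is not glibc's actual selection code: select_memcmp and
its slow_sse4_2 parameter are hypothetical, the returned names are
illustrative labels, and the fallback order (SSE4.1, then SSSE3, then a
baseline) is an assumption. Only the CPUID bit positions (leaf 1, ECX) are
architectural facts.

```c
/* CPUID leaf 1, ECX feature bits as documented by Intel. */
#define CPUID_ECX_SSSE3  (1u << 9)
#define CPUID_ECX_SSE4_1 (1u << 19)
#define CPUID_ECX_SSE4_2 (1u << 20)

/* Hypothetical selector, for illustration only.  It takes a raw ECX
   feature mask and a "slow SSE4.2" hint, and returns the name of the
   memcmp variant that would be chosen under the assumed ordering.  */
static const char *
select_memcmp (unsigned int ecx, int slow_sse4_2)
{
  /* Assumption: the hint is ignored here, because the routine gated
     on SSE4.1 uses ptest but no SSE4.2 instructions.  */
  (void) slow_sse4_2;

  if (ecx & CPUID_ECX_SSE4_1)
    return "__memcmp_sse4_1";
  if (ecx & CPUID_ECX_SSSE3)
    return "__memcmp_ssse3";
  return "__memcmp_sse2";
}
```

Under this gating, a Penryn-style mask (SSE4.1 set, SSE4.2 clear) and a
Silvermont-style mask (both set) select the same routine, and the slow
SSE4.2 hint does not change the result.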

