This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] aarch64: optimize the unaligned case of memcmp
Sebastian Pop wrote:
>
> And for larger data sets the performance is still lower than when aligning src1:
Benchmark                             Time        CPU  Iterations  Throughput
------------------------------------------------------------------------------
BM_string_memcmp_unaligned/8       1288 ns    1288 ns      543230  5.9221MB/s
BM_string_memcmp_unaligned/64      2377 ns    2377 ns      359351  25.6742MB/s
BM_string_memcmp_unaligned/512     6444 ns    6444 ns      184103  75.7774MB/s
BM_string_memcmp_unaligned/1024    4869 ns    4868 ns      143785  200.599MB/s
BM_string_memcmp_unaligned/8k     33090 ns   33089 ns       21279  236.107MB/s
BM_string_memcmp_unaligned/16k    66748 ns   66738 ns       10436  234.123MB/s
BM_string_memcmp_unaligned/32k   131781 ns  131775 ns        5106  237.147MB/s
BM_string_memcmp_unaligned/64k   291907 ns  291860 ns        2334  214.143MB/s
These numbers still don't make any sense: the first few results are now many
times slower than the byte-by-byte version (as in your initial mail):
BM_string_memcmp_unaligned/8 339 ns 339 ns 2070998 22.5302MB/s
BM_string_memcmp_unaligned/64 1392 ns 1392 ns 502796 43.8454MB/s
BM_string_memcmp_unaligned/512 9194 ns 9194 ns 76133 53.1104MB/s
BM_string_memcmp_unaligned/1024 18325 ns 18323 ns 38206 53.2963MB/s
BM_string_memcmp_unaligned/8k 148579 ns 148574 ns 4713 52.5831MB/s
BM_string_memcmp_unaligned/16k 298169 ns 298120 ns 2344 52.4118MB/s
BM_string_memcmp_unaligned/32k 598813 ns 598797 ns 1085 52.188MB/s
BM_string_memcmp_unaligned/64k 1196079 ns 1196083 ns 540 52.2539MB/s
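For readers following the thread: the patch itself is not quoted here, but the
"aligning src1" strategy being benchmarked can be sketched roughly as below.
This is a hypothetical C illustration, not the actual AArch64 assembly under
review: compare bytes until src1 reaches a word boundary, then compare a word
at a time, tolerating an unaligned src2 via memcpy (which compilers lower to a
plain unaligned load on AArch64).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of the "align src1" memcmp strategy discussed above.
   Not the patch under review; for illustration only. */
static int memcmp_align_src1(const void *s1, const void *s2, size_t n)
{
    const unsigned char *p1 = s1;
    const unsigned char *p2 = s2;

    /* Byte loop until p1 reaches an 8-byte boundary. */
    while (n > 0 && ((uintptr_t)p1 & 7) != 0) {
        if (*p1 != *p2)
            return *p1 - *p2;
        p1++; p2++; n--;
    }

    /* Word loop: p1 is now aligned; p2 may not be, so load it with
       memcpy, which becomes an unaligned load on AArch64. */
    while (n >= 8) {
        uint64_t w1, w2;
        memcpy(&w1, p1, 8);
        memcpy(&w2, p2, 8);
        if (w1 != w2)
            break;  /* mismatch somewhere in this word: find it below */
        p1 += 8; p2 += 8; n -= 8;
    }

    /* Tail (and mismatch location): byte-by-byte. */
    while (n > 0) {
        if (*p1 != *p2)
            return *p1 - *p2;
        p1++; p2++; n--;
    }
    return 0;
}
```

The per-call overhead of the alignment prologue is one plausible explanation
for why the small-size results above regress relative to a plain byte loop,
while the large sizes benefit from the wide compares.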
Wilco