This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH v2 2/2] aarch64: Optimized memcpy and memmove for Kunpeng processor

From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
To: "Zhangxuelei (Derek)" <zhangxuelei4 at huawei dot com>, Yikun Jiang <yikunkero at gmail dot com>
Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, nd <nd at arm dot com>, Siddhesh Poyarekar <siddhesh at gotplt dot org>, jiangyikun <jiangyikun at huawei dot com>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>
Date: Tue, 29 Oct 2019 14:34:05 +0000
Subject: Re: [PATCH v2 2/2] aarch64: Optimized memcpy and memmove for Kunpeng processor
References: <8DC571DDDE171B4094D3D33E9685917BD87078@DGGEMI529-MBX.china.huawei.com>

Hi Derek,

>> Note that memchr_strlen significantly outperforms the fastest strlen
>> on sizes larger than 256, so I don't think that using uminv to test
>> for zeroes is the fastest approach.
>
> Indeed, but memchr_strlen really has poor performance below 256 bytes,

Well, that means memchr can be sped up for small sizes. While it is more
complex than strlen, it shouldn't be significantly slower.

> and if we mix this method into the current version, we may need to keep a
> length count and check in each loop iteration whether it exceeds 256 bytes;
> is that cheap?

That may be possible, e.g. by unrolling the first 64-128 bytes and using a loop
optimized for throughput for anything larger (on the assumption that if a
string is larger than 128 bytes, it is likely much larger).
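
As a rough illustration of that split, here is a minimal sketch in portable C
rather than AArch64 assembly. It is not the code under review: the names
find_zero, has_zero_byte and HEAD_BYTES are hypothetical, the 128-byte
threshold is only an assumption taken from the range above, and the zero test
is the classic word-at-a-time bit trick rather than uminv. It scans a buffer
of known size (like memchr for a zero byte), so it never reads past the bound.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold: the first HEAD_BYTES use the low-latency path. */
#define HEAD_BYTES 128

/* Nonzero if any of the 8 bytes in v is zero (classic bit trick). */
static int has_zero_byte(uint64_t v)
{
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

/* Index of the first zero byte in buf[0..n), or n if there is none.
   Only bytes inside buf[0..n) are read. */
static size_t find_zero(const unsigned char *buf, size_t n)
{
    size_t i = 0;
    size_t head = n < HEAD_BYTES ? n : HEAD_BYTES;
    int found = 0;

    /* Phase 1: short inputs. One word per step, exit as early as possible. */
    for (; i + 8 <= head; i += 8) {
        uint64_t w;
        memcpy(&w, buf + i, 8);
        if (has_zero_byte(w)) {
            found = 1;
            break;
        }
    }

    /* Phase 2: long inputs. Four words per iteration with one combined
       branch, trading exit latency for throughput. */
    if (!found) {
        for (; i + 32 <= n; i += 32) {
            uint64_t a, b, c, d;
            memcpy(&a, buf + i, 8);
            memcpy(&b, buf + i + 8, 8);
            memcpy(&c, buf + i + 16, 8);
            memcpy(&d, buf + i + 24, 8);
            if (has_zero_byte(a) | has_zero_byte(b) |
                has_zero_byte(c) | has_zero_byte(d))
                break;
        }
    }

    /* Resolve the exact position (or the remaining tail) byte by byte. */
    for (; i < n; i++)
        if (buf[i] == 0)
            return i;
    return n;
}
```

The point of the structure is that no per-iteration length check against the
threshold is needed in the long loop: the head is bounded once up front, and
the throughput loop only runs after the head has been exhausted.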

However, my point was that while the uminv sequence is simple and small, it's not
the fastest, so ultimately we need to find an alternative sequence which works
better for all the generic string functions which search for a character (strlen, strnlen,
memchr, memrchr, rawmemchr, strchr, strnchr, strchrnul, strcpy, strncpy).

> And we think small size is more important for strlen.

Absolutely, handling small cases quickly is essential for all string functions.

Wilco

References:
- Re: [PATCH v2 2/2] aarch64: Optimized memcpy and memmove for Kunpeng processor
  - From: Zhangxuelei (Derek)
