This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] aarch64: Add tunable glibc.memset.dc_zva_threshold
- From: Feng Xue OS <fxue at os dot amperecomputing dot com>
- To: Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
- Cc: nd <nd at arm dot com>
- Date: Mon, 29 Jul 2019 02:25:31 +0000
- Subject: Re: [PATCH] aarch64: Add tunable glibc.memset.dc_zva_threshold
- References: <BYAPR01MB4869B921E04ECAEF4264B926F7C00@BYAPR01MB4869.prod.exchangelabs.com>,<16c3d8e1-3fc6-5b98-0d6b-c5adb76f9eec@arm.com>
Yes. For multi-threaded workloads on emag, we see a clear improvement when
using a large threshold to trigger DC ZVA, or when disabling it entirely, while
for single-threaded workloads the situation is reversed. Since there is no reliable
way to identify workload characteristics at run time, we propose introducing this tunable.
Feng
________________________________________
From: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
Sent: Friday, July 26, 2019 11:17:39 PM
To: Feng Xue OS; libc-alpha@sourceware.org
Cc: nd
Subject: Re: [PATCH] aarch64: Add tunable glibc.memset.dc_zva_threshold
On 26/07/2019 12:58, Feng Xue OS wrote:
> This patch adds a tunable 'glibc.memset.dc_zva_threshold' to control
> whether memset uses DC ZVA. DC ZVA is used only when the memset size
> exceeds this threshold.
>
> The background is that DC ZVA does not always outperform normal
> memory-store zeroing, especially when there are multiple processes/threads
> contending for memory/cache.
adding a threshold to memset_emag is fine, but
i'm not yet convinced that a tunable threshold
is useful enough.
is it expected that different workloads require
different settings? is this effect significant?
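
For reference, the threshold check described in the patch summary could be sketched in portable C roughly as below. This is only an illustration, not the code from the patch: the names dc_zva_threshold, zero_path and choose_zero_path are hypothetical, the default value 512 is an arbitrary placeholder, and the real memset_emag is written in assembly. The sketch assumes that DC ZVA is considered only for zero fills, since the instruction writes zeros.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the value of the proposed tunable
   glibc.memset.dc_zva_threshold; 512 is an arbitrary placeholder. */
static size_t dc_zva_threshold = 512;

/* Which zeroing strategy a memset call would take in this sketch. */
enum zero_path { PATH_STORES, PATH_DC_ZVA };

/* DC ZVA can only write zeros, so it is a candidate only when the fill
   value is 0, and (per the patch summary) only when the size exceeds
   the threshold. Everything else uses ordinary stores. */
static enum zero_path choose_zero_path(int value, size_t n)
{
    if (value == 0 && n > dc_zva_threshold)
        return PATH_DC_ZVA;
    return PATH_STORES;
}
```

In this sketch, setting dc_zva_threshold to SIZE_MAX means the size comparison can never succeed, which corresponds to the "disable it" case mentioned above for multi-threaded workloads; a small value favours DC ZVA, as reported for single-threaded runs.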