This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

glibc.cpu.cached_memopt (was Re: [PATCH] Rename the glibc.tune namespace to glibc.cpu)

From: Siddhesh Poyarekar <siddhesh at sourceware dot org>
To: Tulio Magno Quites Machado Filho <tuliom at ascii dot art dot br>, Carlos O'Donell <carlos at redhat dot com>, libc-alpha at sourceware dot org, "H.J. Lu" <hjl dot tools at gmail dot com>
Date: Tue, 17 Jul 2018 07:30:33 +0530
Subject: glibc.cpu.cached_memopt (was Re: [PATCH] Rename the glibc.tune namespace to glibc.cpu)
References: <20180716141633.6948-1-siddhesh@sourceware.org> <902a4076-7b87-ea27-bab4-3740ab0a04ec@redhat.com> <25d88c07-1e8f-bd73-cc28-989930a55933@sourceware.org> <87tvozx83k.fsf@linux.ibm.com>

On 07/17/2018 01:32 AM, Tulio Magno Quites Machado Filho wrote:

> I'm not following your line of thought here:
>
>   - glibc.cpu.hwcaps is specific to i386 and x86-64
>   - glibc.cpu is specific to aarch64
>   - glibc.cpu.cached_memopt is specific to powerpc, powerpc64 and powerpc64le
>
> What am I missing?

The difference is that glibc.cpu.name and glibc.cpu.hwcaps are
conceptually generic tunables, i.e. there is a reasonable chance that a
couple of releases down the line another architecture may want to
provide a tuning facility for CPUs by name or by HWCAPS.  The
cached_memopt one is not very clear to me and seems more like something
that is only useful on power8.  x86-specific tunables, i.e. those where
the concept is not currently applicable to other architectures
(x86_l2_temporal_threshold), are prefixed with x86_*.

> Notice the optimization is not specific to a CPU, but specific to a user
> scenario (cacheable memory).  In other words, the optimization can't be used
> whenever PPC_FEATURE2_ARCH_2_07 is set because it could downgrade the
> performance when cache-inhibited memory is being used.

Ahh OK, I got thrown off by the fact that there's a separate routine for
it and assumed that it is Power8-specific.  I have a different concern
then; a tunable is process-wide, so the cached_memopt tunable essentially
assumes that the entire process is using cache-inhibited memory.  Is
that a reasonable assumption?  In my experience a typical process would
have only a set of structures in cache-inhibited memory and most of it
would be regular memory.  In that sense it looks more like a tradeoff
hack and it would be nice to consider alternatives.  Here are a couple I
can think of off the top of my head:

1. A new relocation that overlays on top of ifuncs and allows selection
   of routines based on specific properties.  I have had this idea for a
   while but no time to implement it, and it has a much more general
   scope than memory type; for example, memory alignment could also be a
   factor to short-cut parts of string routines at compile time itself.
   It does not have the runtime flexibility of a tunable but is probably
   far more configurable.

2. If there is a correlation to size, then implement something similar
   to the x86 temporal_threshold tunable.  This is probably just as good
   or bad as setting a cached_memopt flag but has the effect of
   generalizing what was a tunable.
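
To make the contrast in option 2 concrete, here is a rough sketch in
plain C of the two selection styles: one process-wide flag versus a size
threshold checked per call.  Every name in it (copy_conservative,
copy_cached_opt, cached_memopt_flag, copy_threshold, the select_*
helpers) is made up for illustration and none of it is glibc code.  The
direction of the threshold (small copies get the optimized routine) is
purely an assumption, since whether size correlates with memory type at
all is the open question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins only; none of these names exist in glibc.  */
typedef void *(*copy_fn) (void *, const void *, size_t);

/* Conservative routine: a plain byte loop, standing in for code that
   behaves acceptably on any kind of memory.  */
void *
copy_conservative (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Optimized routine: standing in for code that is only a win on
   cacheable memory.  */
void *
copy_cached_opt (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Models a boolean tunable: one value for the whole process.  */
int cached_memopt_flag = 0;

/* Flag-based selection: every call gets the same routine, whatever
   buffer it operates on.  */
copy_fn
select_by_flag (void)
{
  return cached_memopt_flag ? copy_cached_opt : copy_conservative;
}

/* Models a size-threshold tunable.  The value and the direction of the
   comparison are assumptions made for this sketch.  */
size_t copy_threshold = 4096;

/* Threshold-based selection: the decision is made per call from the
   copy size instead of once per process.  */
copy_fn
select_by_threshold (size_t n)
{
  return n < copy_threshold ? copy_cached_opt : copy_conservative;
}
```

A caller would do something like select_by_threshold (n) (dst, src, n).
The selection is still a heuristic, but it is evaluated for each copy
and not fixed once for the whole process.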


What do you think?

Siddhesh

