This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Count number of logical processors sharing L2 cache
- From: Carlos O'Donell <carlos at redhat dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>, Florian Weimer <fweimer at redhat dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 24 May 2016 13:49:24 -0400
- Subject: Re: [PATCH] Count number of logical processors sharing L2 cache
- References: <CAMe9rOoy2YaQTdyqZpQ3=ytDc5dywNshzHAN2ymN60=L5KwbiA at mail dot gmail dot com> <CAMe9rOoq8MNkX0GvoePQ-C51mfUr2ikrRJgqCZE0CoGoJEmOOw at mail dot gmail dot com> <d4cf36ee-f402-41fe-5108-e072b47f2399 at redhat dot com> <CAMe9rOpUuYboLH9WgyHH4HiBaSBXJ+uB=MPUft2S26N+wYJ9-A at mail dot gmail dot com> <76801b5c-7770-23a9-9b7c-4e44722247e1 at redhat dot com> <CAMe9rOqAuxZZ=gpd1zXvbRrsqjhT8G6C9WBbpwaqa65s=-ZTnQ at mail dot gmail dot com>
On 05/24/2016 11:02 AM, H.J. Lu wrote:
> CAT applies to a specific thread/process. Cache sizes in glibc are applied
> to string/memory functions for all threads/processes. They both try to avoid
> over-using shared cache by a single thread/process. But they work at
> different levels and have different behaviors. Glibc also uses the cache size
> to decide when to use non-temporal store to avoid cache pollution and speed
> up writing a large amount of data.
Don't you mean that CAT applies to a core (and all of its logical cores)?
Might it be the case that a thread or process could be migrated by the Linux kernel
between various cores configured with different CAT values and the glibc heuristics
could be poorly tuned for some of those cores?
As I see it the values computed by init_cacheinfo() are only average heuristics for
I agree that Florian has a point, that these values may become less useful in the
presence of the dynamically changing L3<->core partitioning enabled by CAT.
It is silly though to think that you would allow a thread or process to migrate
away from the CAT-tuned core. The design of CAT is such that you want to isolate
the tuned application to one or more cores and use CAT to control the L3 allocation
for those cores.
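For concreteness, the pinning half of that setup can be sketched as below. This is
only a minimal Python illustration, assuming Linux; pin_current_process is a made-up
helper name, and programming the CAT masks for the chosen cores is not shown.

```python
import os

def pin_current_process(cpus):
    """Restrict the caller to the given set of logical CPUs.

    Linux-only: os.sched_setaffinity wraps the sched_setaffinity(2) syscall.
    Returns the affinity set the kernel reports afterwards.
    """
    os.sched_setaffinity(0, set(cpus))  # pid 0 refers to the caller
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    # Pick one CPU we are already allowed to run on (an arbitrary choice).
    cpu = min(os.sched_getaffinity(0))
    print("now restricted to:", pin_current_process({cpu}))
```

Once the affinity is narrowed like this, the scheduler will not move the task to a
core outside the set, which is the situation the question below assumes.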
In the case where you have a process pinned to a core, and CAT is used to limit the
L3 of that core, do the glibc heuristics computed in init_cacheinfo() match the
reality of the L3<->core allocation? Or would a lower L3 CAT-tuned value mean that
glibc would be mis-tuned for that core?
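To make the question concrete, here is a rough model of the arithmetic I have in mind.
It is a sketch in Python, not glibc code: the function names, the 3/4 factor, and the
example numbers (20 MiB L3, 16 logical processors, 2 of 20 ways allowed) are all
assumptions for illustration and are not taken from init_cacheinfo().

```python
# Illustrative model only; none of this is the glibc implementation.
MIB = 1024 * 1024

def per_thread_share(cache_bytes, threads_sharing):
    """Even split of a shared cache among the logical processors using it."""
    return cache_bytes // threads_sharing

def cat_limited_cache(cache_bytes, total_ways, allowed_ways):
    """Bytes of cache reachable when CAT enables allowed_ways of total_ways."""
    return cache_bytes * allowed_ways // total_ways

def non_temporal_threshold(share_bytes):
    """Hypothetical copy size above which non-temporal stores would be used
    (3/4 of the per-thread share; the factor is an assumption)."""
    return share_bytes * 3 // 4

if __name__ == "__main__":
    l3, threads = 20 * MIB, 16
    # What a startup-time heuristic sees: the full L3.
    assumed = per_thread_share(l3, threads)
    # What the pinned core can actually fill if CAT allows 2 of 20 ways.
    actual = per_thread_share(cat_limited_cache(l3, 20, 2), threads)
    print("threshold from full L3:    ", non_temporal_threshold(assumed))
    print("threshold from CAT-limited:", non_temporal_threshold(actual))
```

In this model the threshold derived from the full L3 is ten times larger than the one
derived from the CAT-limited allocation, so copies in between would keep using
ordinary stores even though they no longer fit in the cache the core may fill.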