This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Count number of logical processors sharing L2 cache
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: Florian Weimer <fweimer at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Fri, 27 May 2016 15:00:42 -0700
- Subject: Re: [PATCH] Count number of logical processors sharing L2 cache
- References: <CAMe9rOoy2YaQTdyqZpQ3=ytDc5dywNshzHAN2ymN60=L5KwbiA at mail dot gmail dot com> <CAMe9rOoq8MNkX0GvoePQ-C51mfUr2ikrRJgqCZE0CoGoJEmOOw at mail dot gmail dot com> <d4cf36ee-f402-41fe-5108-e072b47f2399 at redhat dot com> <CAMe9rOpUuYboLH9WgyHH4HiBaSBXJ+uB=MPUft2S26N+wYJ9-A at mail dot gmail dot com> <76801b5c-7770-23a9-9b7c-4e44722247e1 at redhat dot com> <CAMe9rOqAuxZZ=gpd1zXvbRrsqjhT8G6C9WBbpwaqa65s=-ZTnQ at mail dot gmail dot com> <57449424 dot 1000009 at redhat dot com> <CAMe9rOpzKgS_uNz1ZFg-sUxCbuY4tFOy19eLjyFaO7UbNYUr1g at mail dot gmail dot com>
On Tue, May 24, 2016 at 2:35 PM, H.J. Lu <firstname.lastname@example.org> wrote:
> On Tue, May 24, 2016 at 10:49 AM, Carlos O'Donell <email@example.com> wrote:
>> On 05/24/2016 11:02 AM, H.J. Lu wrote:
>>> CAT applies to a specific thread/process. Cache sizes in glibc are applied
>>> to string/memory functions for all threads/processes. They both try to avoid
>>> over-using shared cache by a single thread/process. But they work at
>>> different levels and have different behaviors. Glibc also uses the cache size
>>> to decide when to use non-temporal stores, which avoid cache pollution and
>>> speed up writing a large amount of data.
>> Don't you mean that CAT applies to a core (and all of its logical cores)?
>> Might it be the case that a thread or process could be migrated by the Linux kernel
>> between various cores configured with different CAT values and the glibc heuristics
>> could be poorly tuned for some of those cores?
>> As I see it the values computed by init_cacheinfo() are only average heuristics for
>> the core.
>> I agree that Florian has a point, that these values may become less useful in the
>> presence of the dynamically changing L3<->core partitioning enabled by CAT.
>> It is silly though to think that you would allow a thread or process to migrate
>> away from the CAT-tuned core. The design of CAT is such that you want to isolate
>> the tuned application to one or more cores and use CAT to control the L3 allocation
>> for those cores.
> I checked with our kernel CAT implementer. CAT supports both
> per-processor and per-process allocation.
>> In the case where you have a process pinned to a core, and CAT is used to limit the
>> L3 of that core, do the glibc heuristics computed in init_cacheinfo() match the
>> reality of the L3<->core allocation? Or would a lower L3 CAT-tuned value mean that
>> glibc would be mis-tuned for that core?
> CAT dedicates part of the L3 cache to certain processors or processes so
> that L3 cache is always available to them. Glibc tries not to take all of
> the L3 cache in memcpy/memset so that L3 cache is available for other
> operations within the same process as well as to other processors/processes.
> CAT and glibc work at different angles. There is no direct conflict between
> CAT and glibc. At the moment, I am not sure if a CAT-aware glibc will
> improve performance.
I will check in my patch shortly.
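
For readers following the thread, here is a minimal sketch of the kind of
heuristic being discussed: divide a shared cache among the logical processors
that share it, then switch to non-temporal stores once a copy is large
relative to that share. It is not the code in init_cacheinfo(). The function
names and the 3/4 factor are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the portion of a shared cache that one logical
   processor can use if the cache is split evenly among all sharers.  */
static size_t
per_thread_share (size_t shared_cache_bytes, unsigned int threads_sharing)
{
  assert (threads_sharing > 0);
  return shared_cache_bytes / threads_sharing;
}

/* Hypothetical policy: treat 3/4 of the per-thread share as the size
   above which a copy would evict too much useful data.  The 3/4 factor
   is an illustrative assumption, not a value taken from this thread.  */
static size_t
non_temporal_threshold (size_t share_bytes)
{
  return share_bytes / 4 * 3;
}

/* Return nonzero when a copy of COPY_BYTES should bypass the cache.  */
static int
use_non_temporal (size_t copy_bytes, size_t threshold_bytes)
{
  return copy_bytes >= threshold_bytes;
}
```

With an 8 MiB cache shared by 4 logical processors, per_thread_share gives
2 MiB and the sketch's threshold is 1.5 MiB. If CAT restricts a core to a
smaller part of L3, a share computed from the full cache size would
overestimate what that core can actually use, which is the mismatch Carlos
asks about above.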