Bug 30429 - Assume large caches for string function tuning if x86 cache size information is missing
Summary: Assume large caches for string function tuning if x86 cache size information ...
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: string (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: 2.38
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-05-08 17:09 UTC by Florian Weimer
Modified: 2023-06-04 00:04 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fw: security-


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Florian Weimer 2023-05-08 17:09:39 UTC
If we cannot get cache sizes from CPUID, we should not use string functions that depend on cache sizes, in particular the non-temporal store threshold.

After commit 48b74865c63840b288bd85b4d8743533b73b339b("x86: Check minimum/maximum of non_temporal_threshold [BZ #29953]") we clamp the the minimum non-temporal store threshold at 0x4040, but this is at the extremely low end likely bad for performance.
Comment 1 Noah Goldstein 2023-05-08 17:45:48 UTC
(In reply to Florian Weimer from comment #0)
> If we cannot get cache sizes from CPUID, we should not use string functions
> that depend on cache sizes, in particular the non-temporal store threshold.
> 
> After commit 48b74865c63840b288bd85b4d8743533b73b339b("x86: Check
> minimum/maximum of non_temporal_threshold [BZ #29953]") we clamp the the
> minimum non-temporal store threshold at 0x4040, but this is at the extremely
> low end likely bad for performance.

We probably would be better off using the maximum and skipping non-temporal threshold all together.
Comment 2 H.J. Lu 2023-05-08 19:43:34 UTC
(In reply to Noah Goldstein from comment #1)
> (In reply to Florian Weimer from comment #0)
> > If we cannot get cache sizes from CPUID, we should not use string functions
> > that depend on cache sizes, in particular the non-temporal store threshold.
> > 
> > After commit 48b74865c63840b288bd85b4d8743533b73b339b("x86: Check
> > minimum/maximum of non_temporal_threshold [BZ #29953]") we clamp the the
> > minimum non-temporal store threshold at 0x4040, but this is at the extremely
> > low end likely bad for performance.
> 
> We probably would be better off using the maximum and skipping non-temporal
> threshold all together.

It sounds reasonable.
Comment 3 Florian Weimer 2023-05-08 19:47:13 UTC
(In reply to Noah Goldstein from comment #1)
> (In reply to Florian Weimer from comment #0)
> > If we cannot get cache sizes from CPUID, we should not use string functions
> > that depend on cache sizes, in particular the non-temporal store threshold.
> > 
> > After commit 48b74865c63840b288bd85b4d8743533b73b339b("x86: Check
> > minimum/maximum of non_temporal_threshold [BZ #29953]") we clamp the the
> > minimum non-temporal store threshold at 0x4040, but this is at the extremely
> > low end likely bad for performance.
> 
> We probably would be better off using the maximum and skipping non-temporal
> threshold all together.

True, that's probably better than switching implementations. Please feel to retitle this bug as appropriate.
Comment 4 Noah Goldstein 2023-05-09 03:11:17 UTC
Posted patch to make default in this case 64MB
https://patchwork.sourceware.org/project/glibc/list/?series=19713
Comment 5 Noah Goldstein 2023-05-28 15:42:06 UTC
Fix pushed.
Comment 6 Florian Weimer 2023-05-28 19:19:57 UTC
Are you going to backport this?

Fixed by:

commit ed2f9dc9420c4c61436328778a70459d0a35556a
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date:   Mon May 8 22:10:20 2023 -0500

    x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
    
    If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
    it almost certainly means we failed to read the systems cache info.
    
    In this case, rather than defaulting the minimum correct value, we
    should default to a value that gets at least reasonable
    performance. 64MB is chosen conservatively to be at the very high
    end. This should never cause non-temporal stores when, if we had read
    cache info, we wouldn't have otherwise.
    Reviewed-by: Florian Weimer <fweimer@redhat.com>
Comment 7 Noah Goldstein 2023-06-04 00:04:38 UTC
(In reply to Florian Weimer from comment #6)
> Are you going to backport this?
> 

Didn't see this question earlier.

Sure I can backport. You think to 2.28?


> Fixed by:
> 
> commit ed2f9dc9420c4c61436328778a70459d0a35556a
> Author: Noah Goldstein <goldstein.w.n@gmail.com>
> Date:   Mon May 8 22:10:20 2023 -0500
> 
>     x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
>     
>     If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
>     it almost certainly means we failed to read the systems cache info.
>     
>     In this case, rather than defaulting the minimum correct value, we
>     should default to a value that gets at least reasonable
>     performance. 64MB is chosen conservatively to be at the very high
>     end. This should never cause non-temporal stores when, if we had read
>     cache info, we wouldn't have otherwise.
>     Reviewed-by: Florian Weimer <fweimer@redhat.com>