Bug 30429

Summary: Assume large caches for string function tuning if x86 cache size information is missing
Product: glibc Reporter: Florian Weimer <fweimer>
Component: stringAssignee: Not yet assigned to anyone <unassigned>
Status: NEW ---    
Severity: normal CC: fw, fweimer, goldstein.w.n, hjl.tools
Priority: P2 Flags: fw: security-
Version: unspecified   
Target Milestone: 2.38   
See Also: https://sourceware.org/bugzilla/show_bug.cgi?id=29953
Host: Target:
Build: Last reconfirmed:

Description Florian Weimer 2023-05-08 17:09:39 UTC
If we cannot get cache sizes from CPUID, we should not use string functions that depend on cache sizes, in particular the non-temporal store threshold.

After commit 48b74865c63840b288bd85b4d8743533b73b339b("x86: Check minimum/maximum of non_temporal_threshold [BZ #29953]") we clamp the the minimum non-temporal store threshold at 0x4040, but this is at the extremely low end likely bad for performance.
Comment 1 Noah Goldstein 2023-05-08 17:45:48 UTC
(In reply to Florian Weimer from comment #0)
> If we cannot get cache sizes from CPUID, we should not use string functions
> that depend on cache sizes, in particular the non-temporal store threshold.
> 
> After commit 48b74865c63840b288bd85b4d8743533b73b339b("x86: Check
> minimum/maximum of non_temporal_threshold [BZ #29953]") we clamp the the
> minimum non-temporal store threshold at 0x4040, but this is at the extremely
> low end likely bad for performance.

We probably would be better off using the maximum and skipping non-temporal threshold all together.
Comment 2 H.J. Lu 2023-05-08 19:43:34 UTC
(In reply to Noah Goldstein from comment #1)
> (In reply to Florian Weimer from comment #0)
> > If we cannot get cache sizes from CPUID, we should not use string functions
> > that depend on cache sizes, in particular the non-temporal store threshold.
> > 
> > After commit 48b74865c63840b288bd85b4d8743533b73b339b("x86: Check
> > minimum/maximum of non_temporal_threshold [BZ #29953]") we clamp the the
> > minimum non-temporal store threshold at 0x4040, but this is at the extremely
> > low end likely bad for performance.
> 
> We probably would be better off using the maximum and skipping non-temporal
> threshold all together.

It sounds reasonable.
Comment 3 Florian Weimer 2023-05-08 19:47:13 UTC
(In reply to Noah Goldstein from comment #1)
> (In reply to Florian Weimer from comment #0)
> > If we cannot get cache sizes from CPUID, we should not use string functions
> > that depend on cache sizes, in particular the non-temporal store threshold.
> > 
> > After commit 48b74865c63840b288bd85b4d8743533b73b339b("x86: Check
> > minimum/maximum of non_temporal_threshold [BZ #29953]") we clamp the the
> > minimum non-temporal store threshold at 0x4040, but this is at the extremely
> > low end likely bad for performance.
> 
> We probably would be better off using the maximum and skipping non-temporal
> threshold all together.

True, that's probably better than switching implementations. Please feel to retitle this bug as appropriate.
Comment 4 Noah Goldstein 2023-05-09 03:11:17 UTC
Posted patch to make default in this case 64MB
https://patchwork.sourceware.org/project/glibc/list/?series=19713
Comment 5 Noah Goldstein 2023-05-28 15:42:06 UTC
Fix pushed.
Comment 6 Florian Weimer 2023-05-28 19:19:57 UTC
Are you going to backport this?

Fixed by:

commit ed2f9dc9420c4c61436328778a70459d0a35556a
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date:   Mon May 8 22:10:20 2023 -0500

    x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
    
    If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
    it almost certainly means we failed to read the systems cache info.
    
    In this case, rather than defaulting the minimum correct value, we
    should default to a value that gets at least reasonable
    performance. 64MB is chosen conservatively to be at the very high
    end. This should never cause non-temporal stores when, if we had read
    cache info, we wouldn't have otherwise.
    Reviewed-by: Florian Weimer <fweimer@redhat.com>
Comment 7 Noah Goldstein 2023-06-04 00:04:38 UTC
(In reply to Florian Weimer from comment #6)
> Are you going to backport this?
> 

Didn't see this question earlier.

Sure I can backport. You think to 2.28?


> Fixed by:
> 
> commit ed2f9dc9420c4c61436328778a70459d0a35556a
> Author: Noah Goldstein <goldstein.w.n@gmail.com>
> Date:   Mon May 8 22:10:20 2023 -0500
> 
>     x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
>     
>     If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
>     it almost certainly means we failed to read the systems cache info.
>     
>     In this case, rather than defaulting the minimum correct value, we
>     should default to a value that gets at least reasonable
>     performance. 64MB is chosen conservatively to be at the very high
>     end. This should never cause non-temporal stores when, if we had read
>     cache info, we wouldn't have otherwise.
>     Reviewed-by: Florian Weimer <fweimer@redhat.com>