[PATCH] x86: Fix for cache computation on AMD legacy cpus.

Mon Jun 5 18:59:19 GMT 2023

* sajan karumanchi:

> From: Sajan Karumanchi <sajan.karumanchi@amd.com>
>
> Some legacy AMD CPUs and hypervisors have the _cpuid_ '0x8000_001D'
> set to Zero, thus resulting in zeroed-out computed cache values.
> This patch reintroduces the old way of cache computation as a
> failsafe option to handle these exceptions.
>
> Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
> ---
>  sysdeps/x86/dl-cacheinfo.h | 218 +++++++++++++++++++++++++++++++++----
>  1 file changed, 194 insertions(+), 24 deletions(-)

On a Phenom II X6 1055T CPU, I see this difference compared to what we
had before:

@@ -212,7 +211,7 @@
 x86.cpu_features.level2_cache_size=0x80000
 x86.cpu_features.level2_cache_assoc=0x10
 x86.cpu_features.level2_cache_linesize=0x40
-x86.cpu_features.level3_cache_size=0x600000
+x86.cpu_features.level3_cache_size=0xc0000
 x86.cpu_features.level3_cache_assoc=0x30
 x86.cpu_features.level3_cache_linesize=0x40
 x86.cpu_features.level4_cache_size=0xffffffffffffffff

According to Wikipedia, the L3 cache is shared, so I would have expected
0x100000 here with the correction factor.  The 0x600000 number we had
before seems wrong, too.
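
For reference, the arithmetic behind the 0x100000 expectation can be
sketched as below.  This assumes the correction factor is a plain even
split of the shared L3 among the threads sharing it, and takes 6 MiB of
L3 and 6 cores from the public specifications for this CPU model.  The
function name is made up for illustration; it is not the code in
dl-cacheinfo.h.

```python
MIB = 1024 * 1024

def per_thread_l3(shared_l3_bytes, threads_sharing):
    """Split a shared L3 evenly among its threads (assumed model)."""
    return shared_l3_bytes // threads_sharing

# Assumed figures for the Phenom II X6 1055T: 6 MiB L3, 6 cores, no SMT.
expected = per_thread_l3(6 * MIB, 6)

print(hex(expected))           # 0x100000, the expected per-thread share
print(hex(6 * MIB))            # 0x600000, the "-" line in the diff
print(0xc0000 // 1024, "KiB")  # 768 KiB, the "+" line in the diff
```

Under that model, neither the whole-cache value nor the 768 KiB value
matches an even six-way split.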

On what appears to be a two-socket 12-core Magny-Cours prototype, I get
this:

@@ -212,7 +211,7 @@
 x86.cpu_features.level2_cache_size=0x80000
 x86.cpu_features.level2_cache_assoc=0x10
 x86.cpu_features.level2_cache_linesize=0x40
-x86.cpu_features.level3_cache_size=0xa00000
+x86.cpu_features.level3_cache_size=0xa0000
 x86.cpu_features.level3_cache_assoc=0x60
 x86.cpu_features.level3_cache_linesize=0x40
 x86.cpu_features.level4_cache_size=0xffffffffffffffff

According to Wikipedia, each socket has two chips with 6 MiB shared L3
cache each, which means that the system has a total of 24 MiB shared L3
cache.  With the per-thread correction factor, should we get 1 MiB L3
cache?
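
A back-of-the-envelope check of those figures, assuming the Wikipedia
numbers quoted above (2 sockets, 2 dies per socket, 6 MiB of L3 per die),
6 cores per die with one thread per core, and an even split of each die's
L3 among its threads.  All names here are illustrative only.

```python
MIB = 1024 * 1024

SOCKETS = 2
DIES_PER_SOCKET = 2
L3_PER_DIE = 6 * MIB
THREADS_PER_DIE = 6  # assumption: 12 cores per socket, no SMT

total_l3 = SOCKETS * DIES_PER_SOCKET * L3_PER_DIE
per_thread = L3_PER_DIE // THREADS_PER_DIE

print(total_l3 // MIB, "MiB total")  # 24 MiB total
print(hex(per_thread))               # 0x100000
print(0xa00000 // MIB, "MiB")        # 10 MiB, the "-" line in the diff
print(0xa0000 // 1024, "KiB")        # 640 KiB, the "+" line in the diff
```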

I don't know how far away either CPU is from the production silicon.
The Phenom CPU at least has the model name field filled in properly.

I'm not sure how to proceed here.  Should I try to get more complete
CPUID dumps?  The data that is used for cache computation mostly doesn't
make it into the --list-diagnostics output.
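
If raw dumps do become available, one value that could be checked by
hand is EDX of CPUID leaf 0x8000_0006, which describes the L3 cache on
AMD processors.  The sketch below decodes it using the field layout from
AMD's CPUID documentation (size in 512 KiB units in bits 31:18, encoded
associativity in bits 15:12, line size in bits 7:0).  The sample register
value is constructed for illustration and was not read from either
machine, and whether this leaf is what feeds the fallback computation in
the patch is an assumption.

```python
# Encoded L2/L3 associativity (subset) -> number of ways.
ASSOC_WAYS = {0x1: 1, 0x2: 2, 0x4: 4, 0x6: 8, 0x8: 16,
              0xA: 32, 0xB: 48, 0xC: 64, 0xD: 96, 0xE: 128}

def decode_l3_edx(edx):
    """Decode EDX of CPUID leaf 0x8000_0006 (hypothetical helper)."""
    size_units = (edx >> 18) & 0x3FFF  # bits 31:18, 512 KiB units
    assoc_code = (edx >> 12) & 0xF     # bits 15:12, encoded
    line_size = edx & 0xFF             # bits 7:0, bytes
    return {
        "size_bytes": size_units * 512 * 1024,
        "ways": ASSOC_WAYS.get(assoc_code),  # None if not in the table
        "line_size": line_size,
    }

# Constructed sample: 12 units of 512 KiB, 48-way, 1 line per tag,
# 64-byte lines.  Not a value dumped from real hardware.
sample_edx = (12 << 18) | (0xB << 12) | (1 << 8) | 0x40
info = decode_l3_edx(sample_edx)

print(hex(sample_edx), hex(info["size_bytes"]), info["ways"], info["line_size"])
```

Comparing such decoded values against the level3_cache_* lines would show
whether the raw CPUID data or the later computation is off.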

Thanks,
Florian