I have a machine for which /proc/cpuinfo shows this:

vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
stepping : 2

and "getconf -a | grep CACHE" shows zero for all the matching entries. For what it's worth, __cpuid(2, eax, ebx, ecx, edx) puts 0x5b0b101, 0x5657f0, 0x0 and 0x2cb43049 in eax, ebx, ecx and edx respectively, and these do not correspond to any entries in intel_02_known[] in sysdeps/x86_64/cacheinfo.c or sysdeps/i386/i686/cacheinfo.c. The values in /sys/devices/system/cpu/cpuN/cache/index*/* seem to be correct for this machine.
My apologies: those CPUID results are for a different CPU. The actual problem ones are 0x55035a01, 0xf0b2ff, 0x0 and 0xca0000 (eax, ebx, ecx and edx respectively).
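For readers following along, here is a minimal sketch of how the four cpuid(2) registers break down into one-byte descriptors. It is not the glibc code; the function names are made up for illustration. It assumes the usual layout from Intel's manual: the low byte of eax is a repeat count rather than a descriptor, a register with bit 31 set carries no valid descriptors, and a 0xff descriptor means the cache data has to come from cpuid(4).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Collect the non-null descriptor bytes from one cpuid(2) result.
   regs[0..3] are eax, ebx, ecx, edx.  Returns how many bytes were
   stored in out (at most 15).  Hypothetical helper, not glibc's.  */
static size_t
leaf2_descriptors (const uint32_t regs[4], uint8_t out[15])
{
  size_t n = 0;

  assert (regs != NULL && out != NULL);
  for (int r = 0; r < 4; ++r)
    {
      /* Bit 31 set: this register holds no valid descriptors.  */
      if (regs[r] & 0x80000000u)
        continue;
      /* The low byte of eax is the repeat count, so skip it.  */
      for (int b = (r == 0) ? 1 : 0; b < 4; ++b)
        {
          uint8_t d = (uint8_t) ((regs[r] >> (8 * b)) & 0xff);
          if (d != 0)
            out[n++] = d;
        }
    }
  return n;
}

/* True if any descriptor is 0xff, i.e. leaf 2 carries no cache
   descriptors and points the caller at cpuid(4) instead.  */
static bool
leaf2_defers_to_leaf4 (const uint32_t regs[4])
{
  uint8_t desc[15];
  size_t n = leaf2_descriptors (regs, desc);

  for (size_t i = 0; i < n; ++i)
    if (desc[i] == 0xff)
      return true;
  return false;
}
```

Fed the corrected register values above, leaf2_defers_to_leaf4 returns true (the low byte of ebx is 0xff); fed the values from the first comment, it returns false.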
I've implemented something without the ability to test it. Check it and reopen if necessary.
The fix doesn't work since the loop may not terminate. It goes into an infinite loop on Xeon X5670. Also we should use __cpuid_count like the rest of sysdeps/x86_64/cacheinfo.c.
Created attachment 5322: A patch
I've checked in a simpler patch. The only problem was the missing increment.
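To illustrate the kind of loop being discussed (this is a sketch, not the checked-in code), the sub-leaf walk has to advance its index on every pass and stop when the cache type field (eax bits 4:0) reads 0. The table below is made up: only the type and level fields are filled in, and the explicit upper bound is my own addition as a safety net.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative eax values for successive cpuid(4) sub-leaves.
   Hypothetical data: only type (bits 4:0) and level (bits 7:5).  */
static const uint32_t fake_leaf4_eax[] = {
  (1u << 5) | 1u,   /* level 1, data cache */
  (1u << 5) | 2u,   /* level 1, instruction cache */
  (2u << 5) | 3u,   /* level 2, unified cache */
  (3u << 5) | 3u,   /* level 3, unified cache */
  0u                /* type 0: no more caches */
};

/* Count the caches described by the table, looking at no more than
   max_subleaves entries.  */
static unsigned int
count_caches (const uint32_t *eax_by_subleaf, unsigned int max_subleaves)
{
  unsigned int i = 0;

  assert (eax_by_subleaf != NULL);
  while (i < max_subleaves)
    {
      if ((eax_by_subleaf[i] & 0x1f) == 0)
        break;          /* type field 0 ends the enumeration */
      ++i;              /* without this step the same sub-leaf is
                           queried forever */
    }
  return i;
}
```

With the table above, count_caches (fake_leaf4_eax, 5) stops at the terminating entry and reports four caches.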
I moved the cpuid4 code to handle_intel() so that it is used whenever the cpuid4 is available. This seems to be a better place to put the test because it avoids a cpuid2 which is not going to be useful and it also works on a greater range of hardware which makes it a bit easier to test (the Xeon X5670 I have access to is on a different continent). I also replaced the asm with __cpuid_count: the previous version didn't use the macro and so didn't take into account the "#if defined(__i386__) && defined(__PIC__)" that the macro definition uses.
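A rough sketch of the approach, for anyone who wants to try it outside glibc. It assumes the __cpuid_count and __get_cpuid_max helpers from the compiler's <cpuid.h> (GCC and Clang provide them); the function names, the 32 sub-leaf cap and the non-x86 fallback are mine, not part of the patch. The size is ways * partitions * line size * sets, with each field stored minus one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
# include <cpuid.h>
#endif

/* Cache size in bytes from the ebx and ecx values of one cpuid(4)
   sub-leaf.  Every field is encoded as "value minus one".  */
static uint64_t
leaf4_cache_size (uint32_t ebx, uint32_t ecx)
{
  uint64_t ways = ((ebx >> 22) & 0x3ff) + 1;
  uint64_t partitions = ((ebx >> 12) & 0x3ff) + 1;
  uint64_t line_size = (ebx & 0xfff) + 1;
  uint64_t sets = (uint64_t) ecx + 1;

  return ways * partitions * line_size * sets;
}

/* Size of the cache at the given level and of the given type
   (1 = data, 2 = instruction, 3 = unified), or 0 if cpuid(4) is not
   available or reports no such cache.  Hypothetical helper.  */
static uint64_t
query_cache_size (unsigned int level, unsigned int type)
{
  assert (level >= 1 && type >= 1 && type <= 3);
#if defined(__x86_64__) || defined(__i386__)
  if (__get_cpuid_max (0, NULL) < 4)
    return 0;
  /* The cap of 32 sub-leaves is an arbitrary safety bound.  */
  for (unsigned int i = 0; i < 32; ++i)
    {
      unsigned int eax, ebx, ecx, edx;

      __cpuid_count (4, i, eax, ebx, ecx, edx);
      (void) edx;
      if ((eax & 0x1f) == 0)
        break;                      /* no more caches */
      if ((eax & 0x1f) == type && ((eax >> 5) & 0x7) == level)
        return leaf4_cache_size (ebx, ecx);
    }
#endif
  return 0;
}
```

For example, query_cache_size (1, 1) asks for the level 1 data cache; it returns 0 on a machine where leaf 4 is absent or empty, which is where a cpuid2 fallback would still be needed.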
Created attachment 5324: A new patch
Use cpuid4 in preference to cpuid2 when possible.
Any comments on the updated patch, Ulrich? I've had it working on a Xeon 5670:

# getconf -a | grep -i cache
LEVEL1_ICACHE_SIZE 32768
LEVEL1_ICACHE_ASSOC 4
LEVEL1_ICACHE_LINESIZE 64
LEVEL1_DCACHE_SIZE 32768
LEVEL1_DCACHE_ASSOC 8
LEVEL1_DCACHE_LINESIZE 64
LEVEL2_CACHE_SIZE 262144
LEVEL2_CACHE_ASSOC 8
LEVEL2_CACHE_LINESIZE 64
LEVEL3_CACHE_SIZE 12582912
LEVEL3_CACHE_ASSOC 16
LEVEL3_CACHE_LINESIZE 64
LEVEL4_CACHE_SIZE 0
LEVEL4_CACHE_ASSOC 0
LEVEL4_CACHE_LINESIZE 0

and I've also carefully tested the cpuid4 vs cpuid2 thing on a variety of machines (including checking that the cpuid2 code still works).
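The same values can be read from a program through sysconf, which is what getconf reports. A small sketch, assuming the glibc-specific _SC_LEVEL1_DCACHE_* names are available (hence the #ifdef); the helper names are made up. A result of 0 or -1 means the value is unknown, which is the symptom this bug is about.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Print one sysconf value under the given label and return it.  */
static long
print_cache_param (const char *label, int sc_name)
{
  long value;

  assert (label != NULL);
  value = sysconf (sc_name);
  printf ("%-24s %ld\n", label, value);
  return value;
}

/* Print the level 1 data cache parameters, if this C library
   exposes them (the _SC_LEVEL1_DCACHE_* names are an extension).  */
static void
print_l1d_cache_params (void)
{
#ifdef _SC_LEVEL1_DCACHE_SIZE
  print_cache_param ("LEVEL1_DCACHE_SIZE", _SC_LEVEL1_DCACHE_SIZE);
  print_cache_param ("LEVEL1_DCACHE_ASSOC", _SC_LEVEL1_DCACHE_ASSOC);
  print_cache_param ("LEVEL1_DCACHE_LINESIZE", _SC_LEVEL1_DCACHE_LINESIZE);
#else
  puts ("cache parameters are not available through sysconf here");
#endif
}
```

Calling print_l1d_cache_params () from main gives a quick check of whether a given glibc build fills these in on a given machine.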
I think I forgot to mention: using cpuid4 will mean an end to updates to the intel_02_known table. Any new processor supports cpuid4, so even if there are cpuid2 cache descriptors for it, you won't need to add them.
For your information, this error is the same error I reported in Debian some time ago: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=609389

Debian eglibc 2.13-0exp5, which includes Ulrich's patch, fixed the problem in my case. Thanks.
(In reply to comment #10)
> For your information, this error is the same error I reported in Debian some
> time ago:
>
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=609389
>
> Debian eglibc 2.13-0exp5, which includes Ulrich's patch, fixed problem in my
> case.

This is highly surprising. That bug is for a machine whose cpuid level is 2, so the cpuid 4 instruction that is in both Ulrich's original patch and my variation on it would not be used. If Ulrich's patch fixed this then you must be using it on some different processor. That bug would appear to be the result of missing entries in the intel_02_known table, which would have been added for that processor quite some time ago.

The code re-ordering in my variation of Ulrich's fix has the advantage that the intel_02_known table need never be updated again (assuming that it covers all the old CPUs that have cpuid level < 4). I know that some recent processors with 24-way associative caches needed an update to this table, but those same processors would also have a cpuid level > 4, so had we been using the cpuid 4 deterministic cache parameters the table would not have needed to be extended for them.

For that old Pentium M processor, though, none of the patches referred to here would have made any difference at all. That's right, isn't it, Ulrich?
Intel's documentation doesn't say that leaf 4 is guaranteed to contain the cache information if the largest standard function number is >= 4. It only says that if the data is available, that leaf can be queried, and that if leaf 2 indicates that leaf 4 contains the data, it can be found there. If you want the patch to be added you have to get Intel to clarify their documentation.
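For readers trying to follow the disagreement, the two policies can be written down side by side. This is only my reading of the thread, not code from either patch, and the function names are invented: the checked-in code consults cpuid(4) only when cpuid(2) reports the 0xff descriptor, whereas the proposed patch consults it whenever the maximum standard leaf is at least 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Policy as described for the checked-in code: leaf 4 is used only
   if it exists and leaf 2 explicitly defers to it with 0xff.  */
static bool
use_leaf4_when_deferred (unsigned int max_leaf, bool leaf2_reports_ff)
{
  return max_leaf >= 4 && leaf2_reports_ff;
}

/* Policy of the proposed patch: leaf 4 is used whenever it exists.  */
static bool
use_leaf4_when_available (unsigned int max_leaf)
{
  return max_leaf >= 4;
}

/* The policies agree except on a CPU that has leaf 4 but still lists
   its caches in leaf 2 without an 0xff descriptor.  The max_leaf
   values below are hypothetical.  */
static void
compare_policies (void)
{
  /* cpuid level 2 (the old Pentium M case): neither uses leaf 4.  */
  assert (!use_leaf4_when_deferred (2, false));
  assert (!use_leaf4_when_available (2));
  /* Leaf 4 present and leaf 2 reports 0xff: both use leaf 4.  */
  assert (use_leaf4_when_deferred (11, true));
  assert (use_leaf4_when_available (11));
  /* Leaf 4 present, no 0xff in leaf 2: only the proposal uses it.  */
  assert (!use_leaf4_when_deferred (10, false));
  assert (use_leaf4_when_available (10));
}
```

The third case is the one in dispute: under the first policy such a CPU still depends on intel_02_known[] having an entry for each of its descriptors.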
(In reply to comment #12)
> If you want the patch to be added you have to get Intel to clarify their
> documentation.

I have three references for you: the commit for the corresponding code in the Linux kernel and two extracts from the "Intel(R) 64 and IA-32 Architectures Software Developer's Manual, Volume 2A".

The original commit for the kernel code is in the historic kernel repo (git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git):

-------------------------------------------------------------------------
commit 7b502b56175499c472103e1d99346d3b5de7d53f
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date: Wed Mar 30 16:34:11 2005 -0800

    [PATCH] x86, x86_64: reading deterministic cache parameters and exporting it in /sysfs

    The attached patch adds support for using cpuid(4) instead of cpuid(2), to
    get CPU cache information in a deterministic way for Intel CPUs, whenever
    supported. The details of cpuid(4) can be found here

    IA-32 Intel Architecture Software Developer's Manual (vol 2a)
    (http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a)
    and
    Prescott New Instructions (PNI) Technology: Software Developer's Guide
    (http://www.intel.com/cd/ids/developer/asmo-na/eng/events/43988.htm)

    The advantage of using the cpuid(4) ('Deterministic Cache Parameters Leaf') are:
    - It provides more information than the descriptors provided by cpuid(2)
    - It is not table based as cpuid(2). So, we will not need changes to the
      kernel to support new cache descriptors in the descriptor table (as is the
      case with cpuid(2)).

    The patch also adds a bunch of interfaces under
    /sys/devices/system/cpu/cpuX/cache, showing various information about the
    caches. Most useful field being shared_cpu_map, which says what caches are
    shared among which logical cpus.

    The patch adds support for both i386 and x86-64.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-------------------------------------------------------------------------

I realise that this is merely a corroborating precedent rather than a definitive statement, even though it did originate from Intel a few years ago. However, in my copy of the software developer's manual (Order Number: 253666-033US, December 2009) I find the following under the description of the CPUID instruction.

First, on page 3-320, describing the result from "INPUT EAX = 2: TLB/Cache/Prefetch Information Returned in EAX, EBX, ECX, EDX":

-------------------------------------------------------------------------
Note also a processor may report a general descriptor type (FFH) and not
report any byte descriptor of "cache type" via CPUID leaf 2.
-------------------------------------------------------------------------

and under "INPUT EAX = 04H: Returns Deterministic Cache Parameters for Each Level":

-------------------------------------------------------------------------
Software can enumerate the deterministic cache parameters for each level of
the cache hierarchy starting with an index value of 0, until the parameters
report the value associated with the cache type field is 0.
-------------------------------------------------------------------------

Note that "deterministic" is used here to mean that the cache parameters are looked up directly rather than by consulting an external table (not parameters for a "deterministic cache").

This seems unambiguous: cpuid-4 enumerates each level of the cache hierarchy, not just some parts of it. The 0xFF from cpuid-2 simply says that there is no cache information provided here (and indeed, the problematic Xeon 5670 that started this only has descriptors for TLBs).

The fact that the kernel has been using this scheme for six years now, and that the original code was contributed by Intel, corroborates this view.

H.J. Lu: can you respond if you disagree, please?

Ulrich: in view, especially, of the second of the advantages cited by the commit above, can we have this in glibc please?
None of that means at all that a maximum function number >= 4 implies that caches are guaranteed to be reported in leaf 4. Why are you wasting my time if you have nothing to show but completely irrelevant data? The current code works and unless Intel really screws up is guaranteed to work.
(In reply to comment #14)
> None of that means at all that a maximum function number >= 4 implies that
> caches are guaranteed to be reported in leaf 4. Why are you wasting my time if
> you have nothing to show but completely irrelevant data? The current code
> works and unless Intel really screws up is guaranteed to work.

I'm sorry, Ulrich, but "Software can enumerate the deterministic cache parameters for each level of the cache hierarchy[...]" means exactly that; it does not mean that only some cache parameters are enumerated, now or in the future. I know that you can spot ambiguity where I cannot (I have seen the evidence of that in the past), but I do not believe that you have seen something that the rest of us have missed this time.

It does mean, however, that some current or planned CPU that does not have its cache parameters described by the intel_02_known[] table will fail, whereas use of cpuid-4 will continue to work.

However, if you would prefer to check the intel_02_known[] table for each new CPU that Intel produce, that is your choice, but personally I would prefer to fix this problem just once.
There is nothing to fix here. No further maintenance effort needed, nothing. And your interpretation of what the manual says is far too optimistic. Don't reopen this bug, the code works.
(In reply to comment #8)
> [...] I've had it working on a Xeon
> 5670:
>
> # getconf -a | grep -i cache
> LEVEL1_ICACHE_SIZE 32768
> [...]

For other readers of this bug: this is the test result of the new patch; I did not test the one checked in at all.
*** Bug 260998 has been marked as a duplicate of this bug. ***