Chips derived from McKinley-core (Itanium 2, etc.) have an anomaly which can cause stalls if an F-unit instruction (including a NOP) is issued right after reading certain application registers (such as ar.bsp). Furthermore, power-considerations also argue against the use of F-unit instructions unless they're really needed. Similarly, using B-unit NOPs is probably not a great idea for McKinley-derived cores: certain templates with B unit instructions cause split-issue and the BBB template in particular causes a branch-prediction anomaly (the first two branches are predicated based on the slot 0 hints. I'll attach a patch which appears to fix this problem. If it looks OK, please check it in. Note: the new policy won't be optimal for Itanium (Merced) chips. I don't think anyone will care.
Created attachment 445 [details] Proposed fix
Jim, is the proposed patch OK?
The patch looks OK to me, but I expect it will break the IA-64 gas testsuite, which means we need patches for that also. I haven't tried writing those patches yet.
Can we modify the patch to have it a switch to turn it on/off and make it on by default? Then, we can just turn it off in the existing testcases and add a new one for this.
Subject: Re: gas should avoid F-unit NOPs (and B-unit probably, too) On Thu, 2005-03-24 at 18:22, hjl at lucon dot org wrote: > ------- Additional Comments From hjl at lucon dot org 2005-03-25 02:22 ------- > Can we modify the patch to have it a switch to turn it on/off and make it > on by default? Then, we can just turn it off in the existing testcases and > add a new one for this. Such as a -mcpu=itanium1 switch? I suppose that could work. Or we could try fixing the testsuite to use explicit bundles, so that it stops changing every time the bundling heuristics change. Or we could just say to heck with it and regenerate all of the .d files. Or we could try to make the .d files be bundling independent. Try looking at some of the ones that Jan Beulich has added. He uses "." to match bundles when nops are involved, so that it is bundling independent. See for instance align.d. However, before we make a decision, I really think that someone should run the gas testsuite first, and check to see whether we have a problem, and if so, how big of a problem it is. There is no sense to trying to write a complicated solution if we only have a simple problem.
With the patch applied, I got ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 opc-b ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 opc-f ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 opc-i ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 opc-m ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 pseudo-ops ./build-ia64-linux/gas/testsuite/gas.log:FAIL: gas/ia64/dv-imply ./build-ia64-linux/gas/testsuite/gas.log:FAIL: gas/ia64/dv-mutex ./build-ia64-linux/gas/testsuite/gas.log:FAIL: gas/ia64/dv-safe ./build-ia64-linux/gas/testsuite/gas.log:FAIL: gas/ia64/dv-srlz ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 tls ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 ldxmov-1 ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 pcrel ./build-ia64-linux/gas/testsuite/gas.log:FAIL: ia64 operand-or ./build-ia64-linux/ld/ld.log:FAIL: TLS -fpic -shared ./build-ia64-linux/ld/ld.log:FAIL: TLS -fpic and -fno-pic exec I think we should add -mcpu=[itanium1|itanium2] and make itanium2 the default.
Subject: Re: gas should avoid F-unit NOPs (and B-unit probably, too) On Fri, 2005-03-25 at 09:18, hjl at lucon dot org wrote: > I think we should add -mcpu=[itanium1|itanium2] and make itanium2 the default. I'm OK with that. Long term, I think writing testcases the way Jan did is a better solution, but we've got 2k nops in just the opc-*.d files alone, and fixing them all seems like too much work for now when we are just trying to get David's patch in.
Created attachment 447 [details] A patch to tune for itanium1 and itanium2 This patch adds "-mtune=[itanium1|itanium2]". I will update NEWS and doc if it is OK.
Subject: Re: gas should avoid F-unit NOPs (and B-unit probably, too) On Mon, 2005-03-28 at 13:03, hjl at lucon dot org wrote: > This patch adds "-mtune=[itanium1|itanium2]". I will update > NEWS and doc if it is OK. I think the extra_goodness change is a little confusing. You are reusing the parameter "slot" for the return value, even though the return value has nothing to do with slot numbers. I had to read the code twice before I figured out why you were changing the slot numbers. I would prefer a new local variable named "goodness" or "retval" or whatever. And then we can initialize it to a sensible value like 0. Or perhaps just "return 2" instead of "slot = 2" like the original code had. There is also the issue that maybe there should be a "default: abort ();" to make sure the code gets appropriately modified when future processors are added. Otherwise, it all looks OK to me.
Created attachment 448 [details] An updated patch This is an updated patch. I documented the change and modified extra_goodness.
Subject: Re: gas should avoid F-unit NOPs (and B-unit probably, too) On Mon, 2005-03-28 at 13:48, hjl at lucon dot org wrote: > This is an updated patch. I documented the change and modified > extra_goodness. OK. Thanks.
HJ checked in his patch to fix this.