Created attachment 12322 [details] Code and binaries that display this bug. Hi Everyone, While attempting to help bootstrap a recent version of gcc on HP-UX IA-64 platform (GCC bug 61577 comments 202, 203 and 204), we have hit a PCREL60B reloc related issue. The following tools were used: - HP-UX B.11.31 (IA-64) - GCC 4.7.4 - Binutils 2.32 (but this happens with 2.34 as well) - 92453-07 linker ld HP Itanium(R) B.12.65 IPF/IPF In the attachment binutils-gas-PCREL60B-bug.tgz you will find all the code as well as binaries used to trigger and demonstrate this issue. The source main-brl.s was generated by taking the output of compiling main.c with -save-temps option and changing br.call to brl.call, this source is presented below: ------------------------------ .file "main.c" .pred.safe_across_calls p1-p5,p16-p63 .section .text, "ax", "progbits" .align 16 .global main# .type main#, @function .proc main# main: .prologue 14, 32 .save ar.pfs, r33 alloc r33 = ar.pfs, 0, 4, 0, 0 .vframe r34 mov r34 = r12 .save rp, r32 mov r32 = b0 mov r35 = r1 .body ;; brl.call.sptk.many b0 = hello# mov r1 = r35 mov r14 = r0 ;; mov r8 = r14 mov ar.pfs = r33 mov b0 = r32 .restore sp mov r12 = r34 br.ret.sptk.many b0 ;; .endp main# .global hello# .type hello#, @function .ident "GCC: (GNU) 4.7.4" ------------------------------ When assembling main-brl.s we get the expected PCREL60B relocation but at the wrong offset (0x21 instead of 0x22): ------------------------------ 20: 04 00 00 00 01 00 [MLX] nop.m 0x0 21: PCREL60B hello 26: 00 00 00 00 00 00 brl.call.sptk.many b0=20 <main+0x20> ------------------------------ If we now link this object into the final executable HP's linker clobbers the nop.m at 0x20 due to the wrong reloc offset as shown below: ------------------------------ 4000930: 04 00 00 00 00 00 [MLX] break.m 0x0 4000936: 00 30 00 00 00 00 brl.call.sptk.many b0=4000930 <_end+0xc3ff08f0> 400093c: 08 00 00 d0 ------------------------------ This seems to be the only issue here. To verify I checked the 60 bit immediate that gets encoded and it seems to be correct. I have also manually patched main-brl.o to change the reloc offset from 0x21 to 0x22 and relinked as shown below: ------------------------------ # objdump -rd main-brl-patched.o 20: 04 00 00 00 01 00 [MLX] nop.m 0x0 22: PCREL60B hello 26: 00 00 00 00 00 00 brl.call.sptk.many b0=20 <main+0x20> # objdump -rd hello-brl-patched 4000930: 04 00 00 00 01 00 [MLX] nop.m 0x0 4000936: 00 00 00 00 00 00 brl.call.sptk.many b0=4000990 <hello> 400093c: 68 00 00 d0 ------------------------------ As can be seen above the relocation is now correct and the executable works as expected. Please let me know if further info is required. Thank you! --peter
(In reply to Peter Bisroev from comment #0) Hi Peter, I am afraid that I am totally unfamiliar with the IA64 ABI, so if my questions below do not make sense, then please forgive me: 1. Is it the case that all instances of the PCREL60B reloc are wrong or is it only under certain circumstances ? If it is under certain circumstances, then what are they ? 2. Are other, similar relocs affected ? 3. Does this problem only apply to the HP-UX targeted version of the IA64 assembler, or does it apply to other versions too ? eg ia64-linux, ia64-netbsd, etc. 4. Is there a specification of the PCREL60B reloc available somewhere ? Cheers Nick
For some reason my responses to Nick's questions bounced. Documentation on ia64 relocs is here: https://refspecs.linuxfoundation.org/elf/IA64-SysV-psABI.pdf On 2020-04-08 8:57 a.m., nickc at redhat dot com wrote: > I am afraid that I am totally unfamiliar with the IA64 ABI, so if my > questions below do not make sense, then please forgive me: > > 1. Is it the case that all instances of the PCREL60B reloc are wrong > or is it only under certain circumstances ? If it is under > certain circumstances, then what are they ? I believe the issue applies to all instances. The reloc just isn't tested. > > 2. Are other, similar relocs affected ? I believe the PCREL21B works properly. > > 3. Does this problem only apply to the HP-UX targeted version of the > IA64 assembler, or does it apply to other versions too ? eg > ia64-linux, ia64-netbsd, etc. HP-UX ld handles calls to "weak" functions differently from GNU ld. The big difference is calls to "weak" functions don't go through the PLT. Currently, calls to weak functions use the PCREL21B relocation. When gcc switched to building with g++, this resulted in the gcc build using weak functions. There's no garbage collection in the HP linker (i.e., linkonce support), so some calls ended up exceeding the maximum branch distance of the PCREL21B relocation. This is a particular problem when stage1 is built with -O0. The cc1 and cc1plus binaries are huge. Given the above, the simplest solution seemed to be to add a long call feature using the PCREL60B relocation. However, Peter found that the GNU tools don't appear to handle the relocation correctly.
Hi Nick, Hi John, I apologize for a very delayed response gentlemen. But I completely agree with what John has said in his previous comment. Just to expand on his answer for point (3), from my current understanding, I think in general this issue applies to all IA64 based systems. However because on the other systems as John has mentioned, linkers in use most likely support garbage collection, the PCREL60B issue has not been triggered as it really seems to apply to corner cases when we have fairly large binaries to deal with. Unfortunately due to severe lack of time right now, I did not have a chance to see if I can provide a proof of concept patch for this issue for now. However if you have one that I can test, I will be able to do so fairly quickly as I have access to IA HPUX box right now. Thanks! --peter
Here's the solution I used to fix the PCREL60B offset for HP: --- binutils-2.36/gas/config/tc-ia64.c 2021-01-09 10:47:33.000000000 +0000 +++ binutils-2.36-snake/gas/config/tc-ia64.c 2021-05-17 10:21:40.651307362 +0100 @@ -6892,7 +6892,13 @@ for (j = 0; j < md.slot[curr].num_fixups; ++j) { ifix = md.slot[curr].fixup + j; - fix = fix_new_exp (frag_now, frag_now_fix () - 16 + i, 8, + + unsigned long where = frag_now_fix () - 16 + i; +#ifdef TE_HPUX + if ( ifix->code == BFD_RELOC_IA64_PCREL60B ) where++; +#endif + + fix = fix_new_exp (frag_now, where, 8, &ifix->expr, ifix->is_pcrel, ifix->code); fix->tc_fix_data.opnd = ifix->opnd; fix->fx_file = md.slot[curr].src_file; I've made the change HP specific, as I'm not sure the binutils linker treats the offset the same way, and I've no way to test. I've tested this and it works for brl instructions in all the cases I've seen (including a full bootstrap of gcc and a separate large project build). I know this platform is obsoleted in 2.36, but as it's the only way to get a working modern gcc version, it would be really helpful if this change or something similar could be accepted. --- John
Hi John, Please send change to binutils list with install request. CC Jim Wilson and myself. Jim is the expert on ia64 assembly code. There are some GNU style issues with the change as written. The declaration of "where" should be at the start of the block. There should be no space after "(" or before ")". "where++" should be on following line. Check white space. The Debian ia64 Linux port is still active, so I don't think ia64 should be obsoleted at this time. After the binutils change is accepted, please submit gcc changes to the gcc patches list. Please test change on master if possible. On 2021-05-17 6:07 a.m., jvb at cyberscience dot com wrote: > https://sourceware.org/bugzilla/show_bug.cgi?id=25599 > > John Buddery <jvb at cyberscience dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |jvb at cyberscience dot com > > --- Comment #4 from John Buddery <jvb at cyberscience dot com> --- > Here's the solution I used to fix the PCREL60B offset for HP: > > --- binutils-2.36/gas/config/tc-ia64.c 2021-01-09 10:47:33.000000000 +0000 > +++ binutils-2.36-snake/gas/config/tc-ia64.c 2021-05-17 10:21:40.651307362 > +0100 > @@ -6892,7 +6892,13 @@ > for (j = 0; j < md.slot[curr].num_fixups; ++j) > { > ifix = md.slot[curr].fixup + j; > - fix = fix_new_exp (frag_now, frag_now_fix () - 16 + i, 8, > + > + unsigned long where = frag_now_fix () - 16 + i; > +#ifdef TE_HPUX > + if ( ifix->code == BFD_RELOC_IA64_PCREL60B ) where++; > +#endif > + > + fix = fix_new_exp (frag_now, where, 8, > &ifix->expr, ifix->is_pcrel, ifix->code); > fix->tc_fix_data.opnd = ifix->opnd; > fix->fx_file = md.slot[curr].src_file; >
The master branch has been updated by John David Anglin <danglin@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ee22a1a31d3432bb40b67229f75ae7c2a271e38e commit ee22a1a31d3432bb40b67229f75ae7c2a271e38e Author: John David Anglin <danglin@gcc.gnu.org> Date: Wed May 19 15:27:28 2021 +0000 Fix offset for ia64 PCREL60B relocation on HP-UX gas/ChangeLog: 2021-05-19 John Buddery <jvb@cyberscience.com> PR 25599 * config/tc-ia64.c (emit_one_bundle): Increment fixup offset by one for PCREL60B relocation on HP-UX.
Patch applied to master.
This patch caused a failure in the gas testsuite. ia64-hpux +FAIL: ia64 mlx reloc Investigating how this testsuite failure should be fixed, I took a look at the ia64 abi and found that the bottom two bits of r_offset for relocations applying to instructions encode a slot number. So the comment in the patch really ought to talk about slot numbers rather than "offset" to better explain what is going on. Furthermore, PCREL60B applies to slot 1 and slot 2, so it seems wrong to specify slot 2 for PCREL60B. If that really is needed the comment ought to say something about a workaround for an ia64-hpux linker bug, I think. Note that there are a number of relocations with form instruction-immediate64. Like the form instruction-immediate60 for PCREL60B those also apply to slot 1 and slot 2. It looks like gas emits a slot 1 encoded offset for those. Is that correct for ia64-hpux? Also, if I'm correct in my analysis, what happens when/if the ia64-hpux linker is fixed? I'm inclined to think the gas patch should not have been applied, the bugzilla subject line is wrong, and the reporter should have instead pursued an ia64-hpux linker bug fix.
Thanks very much for the analysis - I agree, this is about slot numbers not offsets and the comment is inaccurate. I too found the HP behaviour odd, when considering the instruction as using slots 1 and 2. Possibly the reasoning is that the actual instruction really occupies slot 2, with slot 1 containing 39 bits of the immediate. So, HP could argue that the instruction slot is 2, even though the relocation starts at slot 1. In any case, the HP tool chain generates the relocation that way. For this reason, HP would effectively be unable to make a linker change without breaking the existing library codebase. It's also vanishingly unlikely they will make further changes as the OS is end of life in 2025. The binutils linker does not support HP, otherwise of course the preferred solution would be to use that So, the only possibility to get a working gas is compatibility with the HP linker, and I would ask you to please consider keeping this change - with an improved comment to say this is a workaround for the HP linker using slot 2 as the target. As far as I can tell, it is only the immediate-60 relocation that has problems on HP. I have not seen issues with any of the immediate-64 relocations, at least not in any of the instructions gcc generates.
I think the test is wrong. The brl instruction is comprised of two instruction slots (L+X). For brl, the imm39 field and a 2-bit Ignored field occupy the L instruction slot. The actual branch instruction is in the second (X) slot. The PCREL relocation needs to be computed relative to this slot, so it makes sense that the relocation would be specified relative to this slot. The psABI says "P" is the address of the bundle containing the instruction. However, the brl instruction was introduced in itanium2 and it uses two instruction slots. I think the other L+X instructions (movl, break.x and nop.x) need checking with the PCREL60B and other relocations involving instruction placement. John, could you look at this? This is failing test: .text .proc foo# foo: .mlx mov r25 = r0 brl.call.sptk.many b0 = bar# .endp foo# I agree comment in the change needs an update. Even if the HP linker is wrong, it's important that we get the brl instruction working on HPUX so that current versions of gcc will build.
I will test movl, break.x and nop.x. They have operands of 64, 62 and 62 bits, with very different layout to brl, so PCREL60 won't apply. One potential candidate is PCREL64 against a movl, I'll see if I can generate and test this and anything else I can find (it will be next week though). I see your point about the test - we could be checking for PCREL relocations for an L+X instruction class, rather than PCREL60 relocations. Although the P calculation is bundle based, the r_offset uses 2 bits for the slot, as stated above. It's the r_offset we're adjusting. The info I've been using is on pages 4-6 to 4-8 of the relocation reference you provided a link for. Specifying a slot value for a 2 slot instruction + immediate is clearly ambiguous without further clarification. HP interprets it one way, other platforms differently. I'd say neither is really right or wrong, it's unfortunate that there are differences but we're stuck with it now.
On 2021-05-21 12:15 p.m., jvb at cyberscience dot com wrote: > Specifying a slot value for a 2 slot instruction + immediate is clearly > ambiguous without further clarification. HP interprets it one way, other > platforms differently. I'd say neither is really right or wrong, it's > unfortunate that there are differences but we're stuck with it now. I agree. On page 3:294 of the architecture, I see that the .mlx template has the X unit instruction in slot 2. However, the IP value used with the immediate is the address of the bundle which contains the current executing instruction (page 1:27). So, the immediate value doesn't depend on slot. But I still think it makes sense to apply the relocation to the X unit instruction as it determines the X3, X4 immediate encoding.
On 2021-05-21 12:15 p.m., jvb at cyberscience dot com wrote: > I will test movl, break.x and nop.x. They have operands of 64, 62 and 62 bits, > with very different layout to brl, so PCREL60 won't apply. One potential > candidate is PCREL64 against a movl, I'll see if I can generate and test this > and anything else I can find (it will be next week though). Could you test the testcase with HP assembler and compare with GAS using objdump -r?
The master branch has been updated by Alan Modra <amodra@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d71893802fe424e3123ced8c6ed158958487f716 commit d71893802fe424e3123ced8c6ed158958487f716 Author: Alan Modra <amodra@gmail.com> Date: Sat May 22 12:59:36 2021 +0930 Re: Fix offset for ia64 PCREL60B relocation on HP-UX PR 25599 * config/tc-ia64.c (emit_one_bundle): Expand comment for HP-UX adjustment. Add assertion. * testsuite/gas/ia64/reloc-mlx.d: Pass when slot 2 specified for PCREL60B.
Thanks for delving into this, as you can see I've updated the comment and made the testcase accept either slot 1 or 2 for PCREL60B. Interestingly, GNU ld seems to accept either slot 1 or 2 for PCREL60B. At least, the gas/testsuite/gas/ia64/reloc-mlx testcase object with r_offset edited seems to produce the same final linked output as the original, for various --defsym bar=0x<hex_number>.
Thanks for adjusting the comment. It seems the person that wrote the GNU ld code was aware of the ambiguity. I suspect there will be a need to implement the brl instruction in gcc for GNU Linux at some point, so its good to know that linking with it works.