Bug 24642 - Gold missing armv8 from may_use_v5t_interworking() results in less efficient v4t stubs
Status: UNCONFIRMED
Alias: None
Product: binutils
Classification: Unclassified
Component: gold
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Cary Coutant
Reported: 2019-06-06 13:38 UTC by Peter Smith
Modified: 2019-06-06 13:38 UTC

Description Peter Smith 2019-06-06 13:38:53 UTC
In arm.cc, the code that permits v5T interworking instructions is missing a case for TAG_CPU_ARCH_V8 and for the V8_M mainline and baseline variants. Because --fix-arm1176 is the default, gold uses v4t stubs for Armv8 CPUs. This is not a correctness problem, but the v4t stubs and the non-use of BLX are likely to have a measurable code-size and performance impact.

This can be worked around by passing --no-fix-arm1176 to gold (for example -Wl,--no-fix-arm1176 through the gcc driver), but it should be simple to fix.

  // Whether we have v5T interworking instructions available.
  bool
  may_use_v5t_interworking() const
  {
    Object_attribute* attr =
      this->get_aeabi_object_attribute(elfcpp::Tag_CPU_arch);
    int arch = attr->int_value();
    if (parameters->options().fix_arm1176())
      return (arch == elfcpp::TAG_CPU_ARCH_V6T2
	      || arch == elfcpp::TAG_CPU_ARCH_V7
	      || arch == elfcpp::TAG_CPU_ARCH_V6_M
	      || arch == elfcpp::TAG_CPU_ARCH_V6S_M
	      || arch == elfcpp::TAG_CPU_ARCH_V7E_M);
    else
      return (arch != elfcpp::TAG_CPU_ARCH_PRE_V4
	      && arch != elfcpp::TAG_CPU_ARCH_V4
	      && arch != elfcpp::TAG_CPU_ARCH_V4T);
  }
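
A possible shape for the fix is sketched below as a standalone model rather than a patch to arm.cc. It assumes the numeric Tag_CPU_arch values from the Arm EABI build attributes; the enum and function names (Cpu_arch, current_check, proposed_check) are hypothetical and are not gold or elfcpp identifiers. The only change it illustrates is adding the v8 variants named in this report to the --fix-arm1176 branch.

```cpp
// Standalone sketch, NOT gold source. Enumerator names are illustrative;
// the numeric values follow the Tag_CPU_arch build attribute encoding.
enum Cpu_arch
{
  ARCH_PRE_V4 = 0,
  ARCH_V4 = 1,
  ARCH_V4T = 2,
  ARCH_V5T = 3,
  ARCH_V5TE = 4,
  ARCH_V5TEJ = 5,
  ARCH_V6 = 6,
  ARCH_V6KZ = 7,
  ARCH_V6T2 = 8,
  ARCH_V6K = 9,
  ARCH_V7 = 10,
  ARCH_V6_M = 11,
  ARCH_V6S_M = 12,
  ARCH_V7E_M = 13,
  ARCH_V8 = 14,
  // 15 (v8-R) is left out because the report does not mention it.
  ARCH_V8M_BASE = 16,
  ARCH_V8M_MAIN = 17
};

// Mirrors the logic quoted above: an allow-list when --fix-arm1176 is
// in effect, otherwise everything newer than v4T.
constexpr bool
current_check(int arch, bool fix_arm1176)
{
  return fix_arm1176
    ? (arch == ARCH_V6T2
       || arch == ARCH_V7
       || arch == ARCH_V6_M
       || arch == ARCH_V6S_M
       || arch == ARCH_V7E_M)
    : (arch != ARCH_PRE_V4
       && arch != ARCH_V4
       && arch != ARCH_V4T);
}

// Assumed fix: extend the allow-list with the v8 variants from the report.
constexpr bool
proposed_check(int arch, bool fix_arm1176)
{
  return fix_arm1176
    ? (current_check(arch, true)
       || arch == ARCH_V8
       || arch == ARCH_V8M_BASE
       || arch == ARCH_V8M_MAIN)
    : current_check(arch, false);
}

// With the default --fix-arm1176, v8 is rejected today and accepted
// by the sketched change.
static_assert(!current_check(ARCH_V8, true), "v8 currently falls back to v4t stubs");
static_assert(proposed_check(ARCH_V8, true), "v8 accepted after the change");
```

In this model the --no-fix-arm1176 branch is unchanged, which matches the observation that the workaround already avoids the v4t stubs.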


A reproducer:
// call.c
extern void func1(void);

void func2(void) {
    func1();
}
arm-linux-gnueabihf-gcc call.c -mthumb -o call.so --shared -fuse-ld=gold -march=armv8-a -fpic

arm-linux-gnueabihf-objdump -d --no-show-raw-insn call.so
000004c0 <func2>:
 4c0:	push	{r7, lr}
 4c2:	add	r7, sp, #0
 4c4:	bl	4cc <func2+0xc>
 4c8:	nop
 4ca:	pop	{r7, pc}
 4cc:	bx	pc
 4ce:	nop			; (mov r8, r8)
 4d0:	stmia	r0!, {}
 4d2:	b.n	14 <_init-0x344>
 4d4:	blx	40f5f0 <_end+0x40d5c3>
 4d8:	mrc2	15, 5, pc, cr4, cr15, {7}

The "bx pc" followed by "nop" at 4cc is the start of the v4t stub that the "bl" at 4c4 branches to.

The stmia and the instructions after it are Arm instructions disassembled as Thumb, because there is no mapping symbol to tell the disassembler that the instruction set state has changed.