This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.

Re: PATCH: Optimize IA64 brl to br

From: "H. J. Lu" <hjl at lucon dot org>
To: Jim Wilson <wilson at specifixinc dot com>
Cc: Richard Henderson <rth at redhat dot com>,Zack Weinberg <zack at codesourcery dot com>, binutils at sources dot redhat dot com
Date: Fri, 30 Jan 2004 11:45:12 -0800
Subject: Re: PATCH: Optimize IA64 brl to br
References: <20040129000632.GA14782@lucon.org> <87isiv7gss.fsf@egil.codesourcery.com> <20040129170735.GA31422@lucon.org> <1075428976.1011.82.camel@leaf.tuliptree.org> <20040130030933.GA12268@redhat.com> <1075443815.1231.99.camel@leaf.tuliptree.org>

On Thu, Jan 29, 2004 at 10:23:35PM -0800, Jim Wilson wrote:
> On Thu, 2004-01-29 at 19:09, Richard Henderson wrote:
> > Is MBB really the best bundle to use for this?  I seem to recall
> > that it forces a split issue.  I'd think MIB or MMB would be better.
> 
> Good point.  I didn't check that.
> 
> See section 3.3.2 "Dispersal Rules" of the Itanium2 Processor Reference
> Manual, in particular see table 3-4 on page 16 which shows that MBB
> always splits issue.  However, the same section points out that MIB,
> MMB, and MFB always split issue unless the B insn is a nop.b or a brp
> (branch predict).  So it does not appear to make any difference.
> 
> Incidentally, table 3-4 has some typos.  I have the June 2002 version
> of the document.  MBB is mentioned twice, both in the row and column
> indices.  The second one is supposed to be MMB in both cases.  Also, the
> MMB and MFB row entries are missing the 1 superscript that is present on
> the MIB row entry, to indicate that dual-issue only occurs if the B is a
> nop.b or brp.
> 
> I don't see anything that would suggest any of these is better than the
> other.

This is the revised patch. Is it OK?


H.J.
-----
2004-01-30  H.J. Lu  <hongjiu.lu@intel.com>

	* elfxx-ia64.c (elfNN_ia64_relax_section): Optimize brl to br
	during the relax finalize pass.

--- bfd/elfxx-ia64.c.brl	2004-01-28 14:54:28.000000000 -0800
+++ bfd/elfxx-ia64.c	2004-01-30 11:37:22.000000000 -0800
@@ -775,13 +775,30 @@ elfNN_ia64_relax_section (abfd, sec, lin
 	case R_IA64_PCREL21BI:
 	case R_IA64_PCREL21M:
 	case R_IA64_PCREL21F:
+	  /* In the finalize pass, all br relaxations are done, so we
+	     can skip them.  */
 	  if (!link_info->need_relax_finalize)
 	    continue;
 	  is_branch = TRUE;
 	  break;
 
+	case R_IA64_PCREL60B:
+	  /* We can't optimize brl to br before the finalize pass since
+	     br relaxations will increase the code size. Defer it to
+	     the finalize pass.  */
+	  if (link_info->need_relax_finalize)
+	    {
+	      sec->need_finalize_relax = 1;
+	      continue;
+	    }
+	  is_branch = TRUE;
+	  break;
+
 	case R_IA64_LTOFF22X:
 	case R_IA64_LDXMOV:
+	  /* We can't relax ldx/mov before the finalize pass since
+	     br relaxations will increase the code size. Defer it to
+	     the finalize pass.  */
 	  if (link_info->need_relax_finalize)
 	    {
 	      sec->need_finalize_relax = 1;
@@ -895,6 +912,51 @@ elfNN_ia64_relax_section (abfd, sec, lin
 	  /* If the branch is in range, no need to do anything.  */
 	  if ((bfd_signed_vma) (symaddr - reladdr) >= -0x1000000
 	      && (bfd_signed_vma) (symaddr - reladdr) <= 0x0FFFFF0)
+	    {
+	      /* If the 60-bit branch is in 21-bit range, optimize it. */
+	      if (r_type == R_IA64_PCREL60B)
+		{
+		  int template;
+		  bfd_byte *hit_addr;
+		  bfd_vma t0, t1, i0, i1, i2;
+
+		  hit_addr = (bfd_byte *) (contents + roff);
+		  hit_addr -= (long) hit_addr & 0x3;
+		  t0 = bfd_get_64 (abfd, hit_addr);
+		  t1 = bfd_get_64 (abfd, hit_addr + 8);
+
+		  /* Keep the instruction in slot 0. */
+		  i0 = (t0 >> 5) & 0x1ffffffffffLL;
+		  /* Use nop.b for slot 1. */
+		  i1 = 0x4000000000LL;
+		  /* For slot 2, turn brl into br by masking out bit
+		     40.  */
+		  i2 = (t1 >> 23) & 0x0ffffffffffLL;
+
+		  /* Turn a MLX bundle into a MBB bundle with the
+		     same stop-bit variety.  */
+		  template = 0x12;
+		  if ((t0 & 0x1fLL) == 5)
+		    template += 1;
+		  t0 = (i1 << 46) | (i0 << 5) | template;
+		  t1 = (i2 << 23) | (i1 >> 18);
+
+		  bfd_put_64 (abfd, t0, hit_addr);
+		  bfd_put_64 (abfd, t1, hit_addr + 8);
+
+		  irel->r_info
+		    = ELFNN_R_INFO (ELFNN_R_SYM (irel->r_info),
+				    R_IA64_PCREL21B);
+
+		  /* If the original relocation offset points to slot
+		     1, change it to slot 2.  */
+		  if ((irel->r_offset & 3) == 1)
+		    irel->r_offset += 1;
+		}
+
+	      continue;
+	    }
+	  else if (r_type == R_IA64_PCREL60B)
 	    continue;
 
 	  /* If the branch and target are in the same section, you've
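
The bundle rewrite in the second hunk can be tried outside BFD.  What
follows is a minimal sketch, not the code in the patch: the function name
mlx_brl_to_mbb_br is hypothetical, and it assumes the caller has already
read the 128-bit bundle as two little-endian 64-bit halves.  The layout it
relies on is the one the patch uses: the template in bits 0-4, then three
41-bit slots starting at bits 5, 46 and 87.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of BFD.  Rewrite an MLX bundle holding
   a brl into an MBB bundle holding the old slot 0, a nop.b and a br.
   *lo and *hi are the low and high 64 bits of the bundle.  */
static void
mlx_brl_to_mbb_br (uint64_t *lo, uint64_t *hi)
{
  const uint64_t slot_mask = 0x1ffffffffffULL;	/* 41 bits per slot */
  uint64_t slot0, nop_b, br, tmpl;

  /* Assumption: the input really is MLX (template 0x04 or 0x05).  */
  assert ((*lo & 0x1e) == 0x04);

  slot0 = (*lo >> 5) & slot_mask;		/* slot 0 is kept as is */
  nop_b = 0x4000000000ULL;			/* opcode 2 in bits 37-40 */
  br = (*hi >> 23) & (slot_mask >> 1);		/* clear bit 40: brl -> br */

  /* MBB is 0x12, or 0x13 when the MLX bundle ended with a stop bit.  */
  tmpl = ((*lo & 0x1f) == 0x05) ? 0x13 : 0x12;

  /* Slot 1 straddles the two halves: its low 18 bits land in *lo
     (all zero for this nop.b), the remaining 23 bits in *hi.  */
  *lo = (nop_b << 46) | (slot0 << 5) | tmpl;
  *hi = (br << 23) | (nop_b >> 18);
}
```

Clearing bit 40 of slot 2 turns major opcode 0xC into 0x4, which is the
"masking out bit 40" step in the patch.  The old L slot needs no separate
cleanup because the nop.b overwrites it.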

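The range test and the relocation offset adjustment from the same hunk
can be sketched the same way.  The names fits_pcrel21b and br_reloc_offset
are hypothetical, and reladdr is assumed to be the address of the bundle
that holds the branch.  The bounds are the ones in the patch: a 21-bit
signed count of 16-byte bundles, so byte displacements from -0x1000000
to 0x0FFFFF0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the target can be reached by a br with
   a PCREL21B relocation.  */
static bool
fits_pcrel21b (uint64_t symaddr, uint64_t reladdr)
{
  /* Assumes the usual two's complement conversion to a signed type.  */
  int64_t disp = (int64_t) (symaddr - reladdr);

  return disp >= -0x1000000 && disp <= 0x0FFFFF0;
}

/* Hypothetical helper: the low two bits of an IA-64 relocation offset
   are the slot number.  The patch moves a slot 1 offset to slot 2,
   where the new br lives, and leaves other offsets alone.  */
static uint64_t
br_reloc_offset (uint64_t r_offset)
{
  assert ((r_offset & 3) != 3);		/* there is no slot 3 */
  return (r_offset & 3) == 1 ? r_offset + 1 : r_offset;
}
```

With these, the condition for the rewrite is that the relocation is
R_IA64_PCREL60B and fits_pcrel21b holds.  A brl whose target is out of
that range is left untouched, as in the patch.
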
