Bug 12049 - Unnecessary relaxation
Summary: Unnecessary relaxation
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gas (show other bugs)
Version: 2.21
: P2 normal
Target Milestone: ---
Assignee: Alan Modra
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-23 16:48 UTC by Rafael Ávila de Espíndola
Modified: 2010-10-25 09:59 UTC (History)
3 users (show)

See Also:
Host: x86_64-linux-gnu
Target: x86_64-linux-gnu
Build: x86_64-linux-gnu
Last reconfirmed:


Attachments
test (71 bytes, application/octet-stream)
2010-10-01 14:57 UTC, Rafael Ávila de Espíndola
Details
test (74 bytes, application/octet-stream)
2010-10-01 14:57 UTC, Rafael Ávila de Espíndola
Details
New test that still fails. (75 bytes, text/plain)
2010-10-22 18:42 UTC, Rafael Ávila de Espíndola
Details
The new testcase. (240 bytes, text/plain)
2010-10-22 19:24 UTC, Rafael Ávila de Espíndola
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Rafael Ávila de Espíndola 2010-09-23 16:48:51 UTC
Gas will relax both jumps in

        .fill 14, 1, 0x90
	jmp	.LBB1_23
        .fill 14, 1, 0x90
	jmp	.LBB1_23
        .fill 18, 1, 0x90
	.align	16, 0x90
        .fill 96, 1, 0x90
.LBB1_23:

But the second relaxation is not necessary. The final offset of the second one
will be 125.
Comment 1 H.J. Lu 2010-10-01 05:30:19 UTC
Please provide a complete testcase.  The final offset
depends on the address of the second jump. I can't tell
what its address is.
Comment 2 Rafael Ávila de Espíndola 2010-10-01 14:56:47 UTC
It is a complete test. I will attach two .s files that show that the problems is
likely in not correctly updating the size estimates. Run:

as --64 test.s -o test.o
as --64 test2.s -o test2.o
objdump  -d test.o > test.dump
objdump  -d test2.o > test2.dump

And compare the dumps.
Note that the addresses align, but in the first case we produce a 5 byte jmp in
the second one we produce a 2 bytes one.
Comment 3 Rafael Ávila de Espíndola 2010-10-01 14:57:12 UTC
Created attachment 5024 [details]
test
Comment 4 Rafael Ávila de Espíndola 2010-10-01 14:57:30 UTC
Created attachment 5025 [details]
test
Comment 5 Rafael Ávila de Espíndola 2010-10-01 15:39:02 UTC
A slightly more interesting testcase:


        .fill 56, 1, 0x90
	jne	.LBB0_43
        .fill 10, 1, 0x90
	jne	.LBB0_43
        .fill 5, 1, 0x90
	.align	16, 0x90
        .fill 118, 1, 0x90
.LBB0_43:


gas is producing two 6 bytes jne. Chang the first one with a .fill 6,1,0x90 and
the second one becomes a 2 bytes jne
Comment 6 H.J. Lu 2010-10-14 13:37:11 UTC
This is due to relaxation between alignment directives and branch
instructions. There is no easy way to fix all cases.  I checked in a
patch to support .d32 suffix to force 32bit displacement. You can do

[hjl@gnu-6 pr12049]$ cat test3.s
        .fill 56, 1, 0x90
	jne.d32 .LBB0_43
        .fill 10, 1, 0x90
	jne .LBB0_43
        .fill 5, 1, 0x90
	.align 16, 0x90
        .fill 118, 1, 0x90
.LBB0_43:
[hjl@gnu-6 pr12049]$ ./as -o test3.o test3.s
[hjl@gnu-6 pr12049]$ objdump -dw test3.o

test3.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <.text>:
   0:	90                   	nop
   1:	90                   	nop
   2:	90                   	nop
   3:	90                   	nop
   4:	90                   	nop
   5:	90                   	nop
   6:	90                   	nop
   7:	90                   	nop
   8:	90                   	nop
   9:	90                   	nop
   a:	90                   	nop
   b:	90                   	nop
   c:	90                   	nop
   d:	90                   	nop
   e:	90                   	nop
   f:	90                   	nop
  10:	90                   	nop
  11:	90                   	nop
  12:	90                   	nop
  13:	90                   	nop
  14:	90                   	nop
  15:	90                   	nop
  16:	90                   	nop
  17:	90                   	nop
  18:	90                   	nop
  19:	90                   	nop
  1a:	90                   	nop
  1b:	90                   	nop
  1c:	90                   	nop
  1d:	90                   	nop
  1e:	90                   	nop
  1f:	90                   	nop
  20:	90                   	nop
  21:	90                   	nop
  22:	90                   	nop
  23:	90                   	nop
  24:	90                   	nop
  25:	90                   	nop
  26:	90                   	nop
  27:	90                   	nop
  28:	90                   	nop
  29:	90                   	nop
  2a:	90                   	nop
  2b:	90                   	nop
  2c:	90                   	nop
  2d:	90                   	nop
  2e:	90                   	nop
  2f:	90                   	nop
  30:	90                   	nop
  31:	90                   	nop
  32:	90                   	nop
  33:	90                   	nop
  34:	90                   	nop
  35:	90                   	nop
  36:	90                   	nop
  37:	90                   	nop
  38:	0f 85 88 00 00 00    	jne    0xc6
  3e:	90                   	nop
  3f:	90                   	nop
  40:	90                   	nop
  41:	90                   	nop
  42:	90                   	nop
  43:	90                   	nop
  44:	90                   	nop
  45:	90                   	nop
  46:	90                   	nop
  47:	90                   	nop
  48:	75 7c                	jne    0xc6
  4a:	90                   	nop
  4b:	90                   	nop
  4c:	90                   	nop
  4d:	90                   	nop
  4e:	90                   	nop
  4f:	90                   	nop
  50:	90                   	nop
  51:	90                   	nop
  52:	90                   	nop
  53:	90                   	nop
  54:	90                   	nop
  55:	90                   	nop
  56:	90                   	nop
  57:	90                   	nop
  58:	90                   	nop
  59:	90                   	nop
  5a:	90                   	nop
  5b:	90                   	nop
  5c:	90                   	nop
  5d:	90                   	nop
  5e:	90                   	nop
  5f:	90                   	nop
  60:	90                   	nop
  61:	90                   	nop
  62:	90                   	nop
  63:	90                   	nop
  64:	90                   	nop
  65:	90                   	nop
  66:	90                   	nop
  67:	90                   	nop
  68:	90                   	nop
  69:	90                   	nop
  6a:	90                   	nop
  6b:	90                   	nop
  6c:	90                   	nop
  6d:	90                   	nop
  6e:	90                   	nop
  6f:	90                   	nop
  70:	90                   	nop
  71:	90                   	nop
  72:	90                   	nop
  73:	90                   	nop
  74:	90                   	nop
  75:	90                   	nop
  76:	90                   	nop
  77:	90                   	nop
  78:	90                   	nop
  79:	90                   	nop
  7a:	90                   	nop
  7b:	90                   	nop
  7c:	90                   	nop
  7d:	90                   	nop
  7e:	90                   	nop
  7f:	90                   	nop
  80:	90                   	nop
  81:	90                   	nop
  82:	90                   	nop
  83:	90                   	nop
  84:	90                   	nop
  85:	90                   	nop
  86:	90                   	nop
  87:	90                   	nop
  88:	90                   	nop
  89:	90                   	nop
  8a:	90                   	nop
  8b:	90                   	nop
  8c:	90                   	nop
  8d:	90                   	nop
  8e:	90                   	nop
  8f:	90                   	nop
  90:	90                   	nop
  91:	90                   	nop
  92:	90                   	nop
  93:	90                   	nop
  94:	90                   	nop
  95:	90                   	nop
  96:	90                   	nop
  97:	90                   	nop
  98:	90                   	nop
  99:	90                   	nop
  9a:	90                   	nop
  9b:	90                   	nop
  9c:	90                   	nop
  9d:	90                   	nop
  9e:	90                   	nop
  9f:	90                   	nop
  a0:	90                   	nop
  a1:	90                   	nop
  a2:	90                   	nop
  a3:	90                   	nop
  a4:	90                   	nop
  a5:	90                   	nop
  a6:	90                   	nop
  a7:	90                   	nop
  a8:	90                   	nop
  a9:	90                   	nop
  aa:	90                   	nop
  ab:	90                   	nop
  ac:	90                   	nop
  ad:	90                   	nop
  ae:	90                   	nop
  af:	90                   	nop
  b0:	90                   	nop
  b1:	90                   	nop
  b2:	90                   	nop
  b3:	90                   	nop
  b4:	90                   	nop
  b5:	90                   	nop
  b6:	90                   	nop
  b7:	90                   	nop
  b8:	90                   	nop
  b9:	90                   	nop
  ba:	90                   	nop
  bb:	90                   	nop
  bc:	90                   	nop
  bd:	90                   	nop
  be:	90                   	nop
  bf:	90                   	nop
  c0:	90                   	nop
  c1:	90                   	nop
  c2:	90                   	nop
  c3:	90                   	nop
  c4:	90                   	nop
  c5:	90                   	nop
[hjl@gnu-6 pr12049]$
Comment 7 H.J. Lu 2010-10-16 22:29:41 UTC
A patch is posted at

http://sourceware.org/ml/binutils/2010-10/msg00253.html
Comment 9 Rafael Ávila de Espíndola 2010-10-22 18:42:02 UTC
The current binutils now gets the two previous tests right (thanks!), but it is still missing some cases. With the new testcase it encodes a je as

0f 84 02 00 00 00

That can be also encoded as

74 06
Comment 10 Rafael Ávila de Espíndola 2010-10-22 18:42:43 UTC
Created attachment 5079 [details]
New test that still fails.
Comment 11 Rafael Ávila de Espíndola 2010-10-22 18:43:41 UTC
Reopening as the new test still fails.
Comment 12 H.J. Lu 2010-10-22 19:13:15 UTC
(In reply to comment #10)
> Created attachment 5079 [details]
> New test that still fails.

This is identical to the testcase in comment 5. Did you
upload the wrong testcase?
Comment 13 Rafael Ávila de Espíndola 2010-10-22 19:24:18 UTC
Created attachment 5080 [details]
The new testcase.

Attached the wrong test previously.
Comment 14 H.J. Lu 2010-10-22 19:33:46 UTC
(In reply to comment #13)
> Created attachment 5080 [details]
> The new testcase.
> 
> Attached the wrong test previously.

Please tell me which branchs should have 8bit displacement.
Comment 15 Rafael Ávila de Espíndola 2010-10-22 19:37:20 UTC
The last "je" is the one that fits in 8 bits:

	je	.LBB1_17
	.align	16, 0x90
.LBB1_17:
Comment 16 H.J. Lu 2010-10-22 19:45:48 UTC
(In reply to comment #15)
> The last "je" is the one that fits in 8 bits:
> 
>     je    .LBB1_17
>     .align    16, 0x90
> .LBB1_17:

My fix at:

http://sourceware.org/ml/binutils/2010-10/msg00281.html

handles this case correctly :-).
Comment 17 Alan Modra 2010-10-23 11:58:22 UTC
Nice testcase!  The comment in my fix below explains how relax_frag went wrong.  I'll apply this later, probably tomorrow sometime.

Index: gas/write.c
===================================================================
RCS file: /cvs/src/src/gas/write.c,v
retrieving revision 1.134
diff -u -p -r1.134 write.c
--- gas/write.c	19 Oct 2010 12:14:23 -0000	1.134
+++ gas/write.c	23 Oct 2010 11:55:01 -0000
@@ -2163,6 +2163,13 @@ relax_frag (segT segment, fragS *fragP, 
 	  if (stretch < 0
 	      || sym_frag->region == fragP->region)
 	    target += stretch;
+	  /* If we get here we know we have a forward branch.  This
+	     relax pass may have stretched previous instructions so
+	     far that omitting STRETCH would make the branch
+	     negative.  Don't allow this in case the negative reach is
+	     large enough to require a larger branch instruction.  */
+	  else if (target < address)
+	    target = fragP->fr_next->fr_address + stretch;
 	}
     }
Comment 18 H.J. Lu 2010-10-25 09:59:51 UTC
Fixed.