This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.
Re: Re: Help needed to track down bug: linking Linux kernel with gold creates unbootable kernel
John Reiser wrote:
Not identical, difference starts at byte 230:
-vmlinux.gold: file format elf64-x86-64
+vmlinux.bfd: file format elf64-x86-64
...
It appears to be a difference in the address chosen for that global (and
other globals later on).
Although the placement of globals chosen from a library need not be
identical, it would be comforting to verify that this is the only reason.
Try changing the link command to remove all *.a (extract and specify each
*.o explicitly). Then there should be no difference.
There still is. The size of the .rodata is different, so probably the
order is still different too.
Does gold perform optimizations that bfd ld doesn't (such as
dropping unneeded globals, or reordering globals so that another global
fills space that would otherwise be wasted on alignment padding)?
This is the command line I used (using /usr/bin/ld vs /usr/local/bin/ld):
/usr/bin/ld --build-id -m elf_x86_64 -o vmlinux.bfd -T
arch/x86/kernel/vmlinux.lds arch/x86/kernel/head_64.o
arch/x86/kernel/head64.o arch/x86/kernel/head.o
arch/x86/kernel/init_task.o init/built-in.o --start-group
usr/built-in.o arch/x86/built-in.o kernel/built-in.o mm/built-in.o
fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o
block/built-in.o as/*.o lib/built-in.o arch/x86/lib/built-in.o
drivers/built-in.o sound/built-in.o firmware/built-in.o
arch/x86/pci/built-in.o arch/x86/power/built-in.o
arch/x86/video/built-in.o net/built-in.o --end-group .tmp_kallsyms2.o
I still have differences:
-ffffffff810000e1: 48 01 2d 08 c0 46 00 add %rbp,0x46c008(%rip) # ffffffff8146c0f0 <trampoline_level4_pgt>
-ffffffff810000e8: 48 01 2d f9 cf 46 00 add %rbp,0x46cff9(%rip) # ffffffff8146d0e8 <trampoline_level4_pgt+0xff8>
+ffffffff810000e1: 48 01 2d 98 74 40 00 add %rbp,0x407498(%rip) # ffffffff81407580 <trampoline_level4_pgt>
+ffffffff810000e8: 48 01 2d 89 84 40 00 add %rbp,0x408489(%rip) # ffffffff81408578 <trampoline_level4_pgt+0xff8>
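As a cross-check on the listing above, here is a minimal Python sketch that decodes the displacement of the first add instruction from each build and recomputes the RIP-relative target. The helper name rip_relative_target is hypothetical, and it assumes the 7-byte encoding shown above (REX.W prefix, opcode, ModRM, then a 4-byte little-endian displacement). If the results match the addresses in objdump's comments, the instruction bytes differ only because trampoline_level4_pgt was placed at a different address.

```python
def rip_relative_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the address referenced by a RIP-relative instruction.

    Assumption: the displacement is the last 4 bytes of a 7-byte
    instruction, as in the "48 01 2d xx xx xx xx" encodings above.
    """
    disp = int.from_bytes(insn_bytes[3:7], "little", signed=True)
    # RIP points at the *next* instruction when the operand is resolved.
    return insn_addr + len(insn_bytes) + disp


gold_target = rip_relative_target(0xffffffff810000e1,
                                  bytes.fromhex("48012d08c04600"))
bfd_target = rip_relative_target(0xffffffff810000e1,
                                 bytes.fromhex("48012d98744000"))

print(hex(gold_target), hex(bfd_target))
```

Both instructions share the same opcode and ModRM bytes; only the displacement changes.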
So I normalized the disassembly like this (gold.s is obtained by objdump -d vmlinux.gold >gold.s):
sed -re 's/(# |0x)[a-z0-9]+/HEX/g' gold.s | colrm 1 47 >gold1.s
I did the same for vmlinux.bfd and diffed the two results.
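The sed step can also be sketched in Python. This is only an illustration: the function name normalize is made up here, and instead of colrm 1 47 it drops the address and raw-byte columns by splitting on tabs, which assumes the usual tab-separated objdump -d layout. Lines without that layout (for example symbol labels) are passed through whole.

```python
import re

# Same pattern as the sed expression: a "# " or "0x" prefix followed by
# lowercase alphanumerics (kept as loose as the original on purpose).
HEX_RE = re.compile(r"(# |0x)[a-z0-9]+")


def normalize(line: str) -> str:
    """Drop the address/bytes columns and replace hex values with HEX."""
    stripped = line.rstrip("\n")
    parts = stripped.split("\t")
    # Assumption: columns are address, raw bytes, instruction text.
    text = "\t".join(parts[2:]) if len(parts) >= 3 else stripped
    return HEX_RE.sub("HEX", text).rstrip()


gold_line = ("ffffffff810000e1:\t48 01 2d 08 c0 46 00 \t"
             "add    %rbp,0x46c008(%rip)        "
             "# ffffffff8146c0f0 <trampoline_level4_pgt>")
bfd_line = ("ffffffff810000e1:\t48 01 2d 98 74 40 00 \t"
            "add    %rbp,0x407498(%rip)        "
            "# ffffffff81407580 <trampoline_level4_pgt>")

print(normalize(gold_line))
print(normalize(gold_line) == normalize(bfd_line))
```

After normalization, lines that differ only in symbol placement compare equal, so whatever remains in the diff is a real difference.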
Then aside from some local symbol name differences:
- cmp HEX(%rip),%edx HEX <.LC3>
+ cmp HEX(%rip),%edx HEX <kallsyms_token_index+HEX>
I have this diff (+ is bfd), which is coming from .notes (why does
objdump think it needs to dump .notes as assembly though?):
+ add $HEX,%al
+ add %al,(%rax)
+ adc $HEX,%al
+ add %al,(%rax)
+ add (%rax),%eax
+ add %al,(%rax)
+ rex.RXB
+ rex.WRX push %rbp
+ add %dh,%bh
+ insb (%dx),%es:(%rdi)
+ jle ffffffff813d1250 <bad_to_user+HEX>
+ and $HEX,%dl
+ (bad)
+ jge ffffffff813d1331 <__start___ex_table+HEX>
+ cs
+ callq ffffffff2bbe8fc1 <__crc___pskb_pull_tail+HEX>
+ stc
+ mov $HEX,%ch
+ (bad)
... there is also a difference in padding:
gold uses 00 00 90 90 (add %al,(%rax); nop; nop), while BFD uses 90 90 90
90 (4 nops).
That is a dispute over interpretation of the linker script:
} :text=0x9090
The original spec was from the days when 2==sizeof(int), so padding was
a 16-bit value, thus 0x9090 was all that mattered. Check the spec
for an update regarding width of padding. In the meantime, try changing
the script to
} :text=0x90909090
which should remove this source of differences.
Yes, that removes the differences from the nops.
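To make the two readings of =0x9090 concrete, here is a minimal Python sketch. It assumes, based only on the bytes observed above, that one linker repeats the literal's own bytes as the fill pattern while the other first widens the value to a 32-bit big-endian word. The function names are hypothetical and nothing here is taken from the ld or gold sources.

```python
def fill_repeating_literal(hex_literal: str, gap: int) -> bytes:
    """Fill `gap` bytes by repeating the bytes written in the literal."""
    digits = hex_literal[2:]          # drop the "0x" prefix
    if len(digits) % 2:
        digits = "0" + digits         # bytes.fromhex needs whole bytes
    pattern = bytes.fromhex(digits)
    return (pattern * (gap // len(pattern) + 1))[:gap]


def fill_as_32bit_word(hex_literal: str, gap: int) -> bytes:
    """Fill `gap` bytes by repeating the value as a 4-byte big-endian word."""
    pattern = int(hex_literal, 16).to_bytes(4, "big")
    return (pattern * (gap // len(pattern) + 1))[:gap]


for literal in ("0x9090", "0x90909090"):
    print(literal,
          fill_repeating_literal(literal, 4).hex(" "),
          fill_as_32bit_word(literal, 4).hex(" "))
```

With the 8-digit literal both readings produce the same bytes, which is why changing the script removes the difference.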
If I read that correctly it means it uses hardware pages with a pagesize
of 2MB for kernel text.
Yes.
Since gold aligns only to 0x1000 perhaps the rodata ends up in the same
hardware page as the .text.
I think these are the relevant align commands from the vmlinux.lds ...
. = ALIGN((1 << 21));
It is a bug that gold does not propagate that alignment constraint
to the .p_align.
If the hw page size is 2MB, then it's not divisible, so it's a bug.
Should I open a bug report, or are there some patches to gold that I
could try?
Definitely open a bug report about ". = ALIGN((1 << 21));"
Opened bug 11490.
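The alignment concern can be checked with a little arithmetic. The Python sketch below is only illustrative: the helper names and the example end-of-text address are made up, and the 2MB size comes from the ALIGN((1 << 21)) in the linker script.

```python
HUGE_PAGE = 1 << 21    # 2MB, from ". = ALIGN((1 << 21));"
SMALL_PAGE = 0x1000    # the alignment gold was observed to use


def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of a power-of-two alignment."""
    return (addr + alignment - 1) & ~(alignment - 1)


def same_page(a: int, b: int, page_size: int) -> bool:
    """True if both addresses fall inside the same page of page_size."""
    return a // page_size == b // page_size


text_end = 0xffffffff813d1234   # hypothetical end of .text
last_text_byte = text_end - 1

rodata_4k = align_up(text_end, SMALL_PAGE)
rodata_2m = align_up(text_end, HUGE_PAGE)

print(hex(rodata_4k), same_page(last_text_byte, rodata_4k, HUGE_PAGE))
print(hex(rodata_2m), same_page(last_text_byte, rodata_2m, HUGE_PAGE))
```

A 2MB-aligned address is always 0x1000-aligned, but not the other way around, so aligning only to 0x1000 can leave .rodata in the same 2MB hardware page as the end of .text.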
I think the .note difference is just due to gold embedding its version:
-Note section [ 2] '.notes' of 60 bytes at offset 0x3d2c58:
+Note section [ 2] '.notes' of 36 bytes at offset 0x5d1c58:
Owner Data size Type
- GNU 8 GNU_GOLD_VERSION
- Linker version: gold 1.9
GNU 20 GNU_BUILD_ID
- Build ID: a865af685f5222cdc17a28ea4e49d58b2185bc05
+ Build ID: 07b53da4e169ad1079080043ad72384fb80d0ea3
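The 60-byte versus 36-byte sizes are consistent with exactly one extra note. A minimal Python sketch of the arithmetic, assuming the standard ELF note layout (three 4-byte header words, then the owner name and the descriptor, each padded to a 4-byte boundary); the helper names are hypothetical:

```python
def align4(n: int) -> int:
    """Round n up to a multiple of 4."""
    return (n + 3) & ~3


def note_size(owner: str, desc_size: int) -> int:
    """Size in bytes of one ELF note entry."""
    header = 3 * 4                       # namesz, descsz, type
    name = align4(len(owner) + 1)        # owner string plus NUL terminator
    desc = align4(desc_size)
    return header + name + desc


build_id_note = note_size("GNU", 20)     # GNU_BUILD_ID, 20-byte hash
gold_version_note = note_size("GNU", 8)  # GNU_GOLD_VERSION, "gold 1.9"

print(build_id_note, build_id_note + gold_version_note)
```

The build ID note alone accounts for the bfd section, and adding the gold version note accounts for the gold section.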
Again, it would be comforting to make a test run with GNU_GOLD_VERSION
omitted, to see if the .text becomes identical (except for Build ID)
with ld.
I did that (by editing gold source and returning from
create_gold_note()), but as I've shown above there are still diffs due
to global addresses...
Best regards,
--Edwin