New .nops directive, to aid Linux alternatives patching?
H.J. Lu
hjl.tools@gmail.com
Mon Feb 12 00:07:00 GMT 2018
On Sun, Feb 11, 2018 at 3:28 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Feb 11, 2018 at 3:05 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Sun, Feb 11, 2018 at 10:58 AM, Andrew Cooper
>> <andrew.cooper3@citrix.com> wrote:
>>> On 11/02/2018 17:19, H.J. Lu wrote:
>>>> On Sun, Feb 11, 2018 at 8:45 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Sun, Feb 11, 2018 at 8:25 AM, Andrew Cooper
>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>> On 11/02/2018 00:59, H.J. Lu wrote:
>>>>>>>>> Please try users/hjl/nop branch:
>>>>>>>>>
>>>>>>>>> https://github.com/hjl-tools/binutils-gdb/tree/users/hjl/nop
>>>>>>>> Oh - thankyou! I was about to ask if there were any pointers to get
>>>>>>>> started hacking on binutils.
>>>>>>>>
>>>>>>>> As for the functionality, there are unfortunately some issues. Given
>>>>>>>> this source:
>>>>>>>>
>>>>>>>> .text
>>>>>>>> single:
>>>>>>>> nop
>>>>>>>>
>>>>>>>> pseudo_1:
>>>>>>>> .nop 1
>>>>>>>>
>>>>>>>> pseudo_8:
>>>>>>>> .nop 8
>>>>>>>>
>>>>>>>> pseudo_8_4:
>>>>>>>> .nop 8, 4
>>>>>>>>
>>>>>>>> pseudo_20:
>>>>>>>> .nop 20
>>>>>>>>
>>>>>>>> I get the following disassembly:
>>>>>>>>
>>>>>>>> 0000000000000000 <single>:
>>>>>>>> 0: 90 nop
>>>>>>>>
>>>>>>>> 0000000000000001 <pseudo_1>:
>>>>>>>> 1: 66 90 xchg %ax,%ax
>>>>>>>>
>>>>>>>> 0000000000000003 <pseudo_8>:
>>>>>>>> 3: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
>>>>>>>> a: 00 00
>>>>>>>>
>>>>>>>> 000000000000000c <pseudo_8_4>:
>>>>>>>> c: 90 nop
>>>>>>>> d: 0f 1f 40 00 nopl 0x0(%rax)
>>>>>>>> 11: 0f 1f 40 00 nopl 0x0(%rax)
>>>>>>>>
>>>>>>>> 0000000000000015 <pseudo_20>:
>>>>>>>> 15: 90 nop
>>>>>>>> 16: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
>>>>>>>> 1d: 00 00 00
>>>>>>>> 20: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
>>>>>>>> 27: 00 00 00
>>>>>>>>
>>>>>>>> The MAX_NOP part looks to be working as intended (including reducing
>>>>>>>> below the default of 10), but there appears to be an off-by-one
>>>>>>>> somewhere, as one too many nops are emitted in the block.
>>>>>>>>
>>>>>>>> Furthermore, attempting to use .nop 30 yields:
>>>>>>>>
>>>>>>>> /tmp/ccI2Eakp.s: Assembler messages:
>>>>>>>> /tmp/ccI2Eakp.s: Fatal error: can't write 145268933551616 bytes to
>>>>>>>> section .text of nops.o: 'Bad value'
>>>>>>> Please try my branch again. It should be fixed.
>>>>>> Thanks. All of that looks to be in order.
>>>>>>
>>>>>> However, when trying to build larger examples, I've started hitting:
>>>>>>
>>>>>> /tmp/ccvxOy2v.s: Assembler messages:
>>>>>> /tmp/ccvxOy2v.s: Internal error in md_convert_frag at
>>>>>> ../../gas/config/tc-i386.c:9510.
>>>>>>
>>>>>> Which is the gas_assert (fragP->fr_var != BFD_RELOC_X86_NOP); you've added.
>>>>>>
>>>>>> It occurs when the calculation of the number of nops to insert evaluates
>>>>>> to 0, and a simple ".nop 0" managed to reproduce the issue. The
>>>>>> calculation evaluating to 0 is a side effect of the existing logic to
>>>>>> evaluate how much, if an, padding is required, and follows this kind of
>>>>>> pattern:
>>>>>>
>>>>> It should be fixed now. I also added 11-byte nop for 64-bit:
>>>>>
>>>>> 67 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%eax,%eax,1)
>>>>>
>>>> I implemented:
>>>>
>>>> .nop SIZE [, MAX_NOP]
>>>>
>>>> where the maximum size is 255 bytes. Should we go with
>>>>
>>>> .nop MAX_SIZE, SIZE [, MAX_NOP]
>>>>
>>>> to support more than 255 bytes?
>>>
>>> If you were to do that, why not simply remove the 255 maximum limit,
>>> rather than having a user pass two identical numbers? That said, I
>>> think the current implementation with 255 is probably fine; My example
>>> of ~45 is pushing it, but I expect that any example trying to use 64 or
>>> more almost certainly has a better way to do the same thing.
>>>
>>> As for your latest branch, I've found one very curious failure which I'm
>>> at a loss to explain. Its all building fine, except for one single
>>> RSB-stuffing alternative in VT-x vmexit handler. The alternative in
>>> question should be 0 +21 nops padding, optionally replaced with 21 bytes
>>> of actual RSB-stuffing, and several identical copies of this alternative
>>> elsewhere appear to be working correctly.
>>>
>>> Using your latest branch, when building using .skip, everything works
>>> correctly, but when building with .nop, the calculation believes that
>>> there are only 3 bytes of padding necessary, and trip the assertion that
>>> the replacement length is not longer than original length.
>>>
>>> At a guess, I'd say that something is suspect with the relocation
>>> calculations, but I have no idea what.
>>>
>>> I haven't managed to miniaturise the repro any further than this:
>>>
>>> Grab
>>> http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/alternatives-v1
>>> which is a branch cleaning up a load of our alternatives handling, and
>>> has support for .nop, and use the following build rune from the root of
>>> the tree:
>>>
>>> (cd xen/arch/x86/hvm/vmx;
>>> PATH=/local/bin/gcc-ret/bin:/local/bin/nops-binutils/bin:$PATH gcc
>>> -D__ASSEMBLY__ -m64 -DBUILD_ID -fno-strict-aliasing -Wall
>>> -Wstrict-prototypes -Wdeclaration-after-statement
>>> -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1
>>> -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror
>>> -Wredundant-decls -Wno-pointer-arith -pipe -g -D__XEN__ -include
>>> ../../../../include/xen/config.h '-D__OBJECT_FILE__="entry.o"'
>>> -Wa,--strip-local-absolute -MMD -MF ./.entry.o.d -I../../../../include
>>> -I../../../../include/asm-x86/mach-generic
>>> -I../../../../include/asm-x86/mach-default -DXEN_IMG_OFFSET=0x200000
>>> '-D__OBJECT_LABEL__=arch$x86$hvm$vmx$entry.o' -msoft-float
>>> -fno-stack-protector -fno-exceptions -Wnested-externs -DHAVE_GAS_VMX
>>> -DHAVE_GAS_SSE4_2 -DHAVE_GAS_EPT -DHAVE_GAS_RDRAND -DHAVE_GAS_FSGSBASE
>>> -DHAVE_GAS_RDSEED -DHAVE_GAS_LONG_NOPS -U__OBJECT_LABEL__
>>> -DHAVE_GAS_QUOTED_SYM '-D__OBJECT_LABEL__=arch/x86/hvm/vmx/entry.o'
>>> -mno-red-zone -fpic -fno-asynchronous-unwind-tables -mno-sse
>>> -mskip-rax-setup -DGCC_HAS_VISIBILITY_ATTRIBUTE
>>> -mindirect-branch=thunk-extern -mindirect-branch-register
>>> -DCONFIG_INDIRECT_THUNK -Wa,-I../../../../include -c entry.S -o entry.o)
>>>
>>> vmx's entry.S is fairly small, and in this example, I happen to be using
>>> one of your repoline branch versions of from "gcc (GCC) 7.2.1
>>> 20171218". Substitute the PATH as appropriate, and the interesting bits
>>> of the ALTERNATIVE implementation are all in
>>> xen/include/asm-x86/alternative-asm.h
>>
>> Is this the error message you saw:
>>
>> gcc -D__ASSEMBLY__ -m64 -DBUILD_ID -fno-strict-aliasing -Wall
>> -Wstrict-prototypes -Wdeclaration-after-statement
>> -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1
>> -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror
>> -Wredundant-decls -Wno-pointer-arith -pipe -g -D__XEN__ -include
>> /export/ssd/git/kernel.org/xen/xen/include/xen/config.h
>> '-D__OBJECT_FILE__="entry.o"' -Wa,--strip-local-absolute -MMD -MF
>> ./.entry.o.d -I/export/ssd/git/kernel.org/xen/xen/include
>> -I/export/ssd/git/kernel.org/xen/xen/include/asm-x86/mach-generic
>> -I/export/ssd/git/kernel.org/xen/xen/include/asm-x86/mach-default
>> -DXEN_IMG_OFFSET=0x200000
>> '-D__OBJECT_LABEL__=arch$x86$hvm$vmx$entry.o' -msoft-float
>> -fno-stack-protector -fno-exceptions -Wnested-externs -DHAVE_GAS_VMX
>> -DHAVE_GAS_SSE4_2 -DHAVE_GAS_EPT -DHAVE_GAS_RDRAND -DHAVE_GAS_FSGSBASE
>> -DHAVE_GAS_RDSEED -DHAVE_GAS_LONG_NOPS -U__OBJECT_LABEL__
>> -DHAVE_GAS_QUOTED_SYM '-D__OBJECT_LABEL__=arch/x86/hvm/vmx/entry.o'
>> -mno-red-zone -fpic -fno-asynchronous-unwind-tables -mno-sse
>> -mskip-rax-setup -DGCC_HAS_VISIBILITY_ATTRIBUTE
>> -mindirect-branch=thunk-extern -mindirect-branch-register
>> -DCONFIG_INDIRECT_THUNK
>> -Wa,-I/export/ssd/git/kernel.org/xen/xen/include -c entry.S
>> entry.S: Assembler messages:
>> entry.S:41: Error: value of 292 too large for field of 1 byte at 1
>>
>>
>
> I need a small testcase to work on assembler. Please double
> check to verify that your change is correct.
>
Does it look a testcase?
.macro mknops nr_bytes
#ifdef NOP
.nop \nr_bytes, 9
#else
.skip \nr_bytes, 9
#endif
.endm
.L_orig_s:
.L_orig_e:
mknops (-(((.L_repl_e1 - .L_repl_s1) - (.L_orig_e - .L_orig_s)) >
0) * ((.L_repl_e1 - .L_repl_s1) - (.L_orig_e - .L_orig_s)))
.L_orig_p:
.byte 0xff + (.L_repl_e1 - .L_repl_s1) - (.L_orig_p - .L_orig_s)
.section .altinstr_replacement, "ax", @progbits
.L_repl_s1:
.L_fill_rsb_loop:
jnz .L_fill_rsb_loop
mov %rax, %rsp
.L_repl_e1:
[hjl@gnu-bdx-1 vmx]$ gcc -c y.S -DNOP
y.S: Assembler messages:
y.S:14: Error: value of 257 too large for field of 1 byte at 3
[hjl@gnu-bdx-1 vmx]$ gcc -c y.S
[hjl@gnu-bdx-1 vmx]$
--
H.J.
More information about the Binutils
mailing list