New .nops directive, to aid Linux alternatives patching?

Andrew Cooper andrew.cooper3@citrix.com
Sun Feb 11 18:58:00 GMT 2018


On 11/02/2018 17:19, H.J. Lu wrote:
> On Sun, Feb 11, 2018 at 8:45 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Sun, Feb 11, 2018 at 8:25 AM, Andrew Cooper
>> <andrew.cooper3@citrix.com> wrote:
>>> On 11/02/2018 00:59, H.J. Lu wrote:
>>>>>> Please try users/hjl/nop branch:
>>>>>>
>>>>>> https://github.com/hjl-tools/binutils-gdb/tree/users/hjl/nop
>>>>> Oh - thankyou!  I was about to ask if there were any pointers to get
>>>>> started hacking on binutils.
>>>>>
>>>>> As for the functionality, there are unfortunately some issues.  Given
>>>>> this source:
>>>>>
>>>>>         .text
>>>>> single:
>>>>>         nop
>>>>>
>>>>> pseudo_1:
>>>>>         .nop 1
>>>>>
>>>>> pseudo_8:
>>>>>         .nop 8
>>>>>
>>>>> pseudo_8_4:
>>>>>         .nop 8, 4
>>>>>
>>>>> pseudo_20:
>>>>>         .nop 20
>>>>>
>>>>> I get the following disassembly:
>>>>>
>>>>> 0000000000000000 <single>:
>>>>>    0:    90                       nop
>>>>>
>>>>> 0000000000000001 <pseudo_1>:
>>>>>    1:    66 90                    xchg   %ax,%ax
>>>>>
>>>>> 0000000000000003 <pseudo_8>:
>>>>>    3:    66 0f 1f 84 00 00 00     nopw   0x0(%rax,%rax,1)
>>>>>    a:    00 00
>>>>>
>>>>> 000000000000000c <pseudo_8_4>:
>>>>>    c:    90                       nop
>>>>>    d:    0f 1f 40 00              nopl   0x0(%rax)
>>>>>   11:    0f 1f 40 00              nopl   0x0(%rax)
>>>>>
>>>>> 0000000000000015 <pseudo_20>:
>>>>>   15:    90                       nop
>>>>>   16:    66 2e 0f 1f 84 00 00     nopw   %cs:0x0(%rax,%rax,1)
>>>>>   1d:    00 00 00
>>>>>   20:    66 2e 0f 1f 84 00 00     nopw   %cs:0x0(%rax,%rax,1)
>>>>>   27:    00 00 00
>>>>>
>>>>> The MAX_NOP part looks to be working as intended (including reducing
>>>>> below the default of 10), but there appears to be an off-by-one
>>>>> somewhere, as one too many nops are emitted in the block.
>>>>>
>>>>> Furthermore, attempting to use .nop 30 yields:
>>>>>
>>>>> /tmp/ccI2Eakp.s: Assembler messages:
>>>>> /tmp/ccI2Eakp.s: Fatal error: can't write 145268933551616 bytes to
>>>>> section .text of nops.o: 'Bad value'
>>>> Please try my branch again.  It should be fixed.
>>> Thanks.  All of that looks to be in order.
>>>
>>> However, when trying to build larger examples, I've started hitting:
>>>
>>> /tmp/ccvxOy2v.s: Assembler messages:
>>> /tmp/ccvxOy2v.s: Internal error in md_convert_frag at
>>> ../../gas/config/tc-i386.c:9510.
>>>
>>> Which is the gas_assert (fragP->fr_var != BFD_RELOC_X86_NOP); you've added.
>>>
>>> It occurs when the calculation of the number of nops to insert evaluates
>>> to 0, and a simple ".nop 0" managed to reproduce the issue.  The
>>> calculation evaluating to 0 is a side effect of the existing logic to
>>> evaluate how much, if an, padding is required, and follows this kind of
>>> pattern:
>>>
>> It should be fixed now.  I also added 11-byte nop for 64-bit:
>>
>> 67 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%eax,%eax,1)
>>
> I implemented:
>
> .nop SIZE [, MAX_NOP]
>
> where the maximum size is 255 bytes.  Should we go with
>
> .nop MAX_SIZE, SIZE [, MAX_NOP]
>
> to support more than 255 bytes?

If you were to do that, why not simply remove the 255 maximum limit,
rather than having a user pass two identical numbers?  That said, I
think the current implementation with 255 is probably fine; My example
of ~45 is pushing it, but I expect that any example trying to use 64 or
more almost certainly has a better way to do the same thing.

As for your latest branch, I've found one very curious failure which I'm
at a loss to explain.  Its all building fine, except for one single
RSB-stuffing alternative in VT-x vmexit handler.  The alternative in
question should be 0 +21 nops padding, optionally replaced with 21 bytes
of actual RSB-stuffing, and several identical copies of this alternative
elsewhere appear to be working correctly.

Using your latest branch, when building using .skip, everything works
correctly, but when building with .nop, the calculation believes that
there are only 3 bytes of padding necessary, and trip the assertion that
the replacement length is not longer than original length.

At a guess, I'd say that something is suspect with the relocation
calculations, but I have no idea what.

I haven't managed to miniaturise the repro any further than this:

Grab
http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/alternatives-v1
which is a branch cleaning up a load of our alternatives handling, and
has support for .nop, and use the following build rune from the root of
the tree:

(cd xen/arch/x86/hvm/vmx;
PATH=/local/bin/gcc-ret/bin:/local/bin/nops-binutils/bin:$PATH gcc
-D__ASSEMBLY__ -m64 -DBUILD_ID -fno-strict-aliasing -Wall
-Wstrict-prototypes -Wdeclaration-after-statement
-Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1
-fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror
-Wredundant-decls -Wno-pointer-arith -pipe -g -D__XEN__ -include
../../../../include/xen/config.h '-D__OBJECT_FILE__="entry.o"'
-Wa,--strip-local-absolute -MMD -MF ./.entry.o.d -I../../../../include
-I../../../../include/asm-x86/mach-generic
-I../../../../include/asm-x86/mach-default -DXEN_IMG_OFFSET=0x200000
'-D__OBJECT_LABEL__=arch$x86$hvm$vmx$entry.o' -msoft-float
-fno-stack-protector -fno-exceptions -Wnested-externs -DHAVE_GAS_VMX
-DHAVE_GAS_SSE4_2 -DHAVE_GAS_EPT -DHAVE_GAS_RDRAND -DHAVE_GAS_FSGSBASE
-DHAVE_GAS_RDSEED -DHAVE_GAS_LONG_NOPS -U__OBJECT_LABEL__
-DHAVE_GAS_QUOTED_SYM '-D__OBJECT_LABEL__=arch/x86/hvm/vmx/entry.o'
-mno-red-zone -fpic -fno-asynchronous-unwind-tables -mno-sse
-mskip-rax-setup -DGCC_HAS_VISIBILITY_ATTRIBUTE
-mindirect-branch=thunk-extern -mindirect-branch-register
-DCONFIG_INDIRECT_THUNK -Wa,-I../../../../include -c entry.S -o entry.o)

vmx's entry.S is fairly small, and in this example, I happen to be using
one of your repoline branch versions of from "gcc (GCC) 7.2.1
20171218".  Substitute the PATH as appropriate, and the interesting bits
of the ALTERNATIVE implementation are all in
xen/include/asm-x86/alternative-asm.h

Sorry I can't debug this further, but if you've got pointers for
debugging it further, I will see what I can do.

~Andrew



More information about the Binutils mailing list