New .nops directive, to aid Linux alternatives patching?

Andrew Cooper andrew.cooper3@citrix.com
Thu Feb 8 19:26:00 GMT 2018


Hello,

I realise this is a little bit niche, but how feasible would it be to
introduce a new .nops directive which takes a size parameter, and
outputs long nops covering the number of specified bytes?

For kernel development, when creating alternative patch points for
boot-time instruction/functionality selection, we commonly end up with
the different alternatives having different lengths.  For patching
safety, the compile-time alternative needs to be extended with nops so
the largest alternative can fit in, if we choose to select it.

At the moment, alignment directives have optimisations to pad with long
nops up to the alignment boundary.  However, the alignment properties
are problematic, especially when trying to patch an individual
instruction or two in a hotpath.

At the moment, automatic size calculations can be performed in the
following way:

/*
 * Define an alternative between two instructions. If @feature is
 * present, early code in apply_alternatives() replaces @oldinstr with
 * @newinstr. ".skip" directive takes care of proper instruction padding
 * in case @newinstr is longer than @oldinstr.
 */
.macro ALTERNATIVE oldinstr, newinstr, feature
140:
        \oldinstr
141:
        .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90
142:

        .pushsection .altinstructions,"a"
        altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f,142b-141b
        .popsection

        .pushsection .altinstr_replacement,"ax"
143:
        \newinstr
144:
        .popsection
.endm

Here the .skip directive adds sufficient bytes of single-byte nop
instructions.  While this is functionally correct, it renders the
disassembly unintelligible (especially for longer alternatives, as we've
seen with the Spectre/SP2 mitigations), and comes with a runtime
performance hit (single-byte nops are deliberately not optimised in
newer pipelines to avoid breaking naive timing loops).
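
To make the padding arithmetic concrete: gas evaluates a true comparison
to -1 and a false one to 0, which is why the .skip expression carries a
leading minus sign.  A minimal C sketch of the same calculation follows;
the names gas_gt and alt_pad_len are purely illustrative and are not part
of any kernel or binutils code.

```c
#include <stddef.h>

/* gas comparison semantics: true is -1, false is 0. */
static long gas_gt(long a, long b)
{
    return (a > b) ? -1 : 0;
}

/*
 * Mirrors the .skip size expression from the macro:
 *   -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b))
 * orig_len stands for (141b-140b), repl_len for (144f-143f).
 */
static long alt_pad_len(long orig_len, long repl_len)
{
    long diff = repl_len - orig_len;

    /* Pads only when the replacement is longer than the original. */
    return -gas_gt(diff, 0) * diff;
}
```

For a 2-byte original and a 5-byte replacement this yields 3 bytes of
padding; when the replacement is no longer than the original it yields 0.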

The runtime perf hit can be addressed late in boot by re-patching at
runtime with long nops.  However, patching comes with a nonzero chance
of tripping over an NMI/MCE and having the interrupt handler encounter
a half-patched instruction, and it would be better to avoid needless
repatching if we possibly can.


Anyway, what I'm trying to say is that having a .nops directive which
could produce an exact number of optimised nops would be very helpful. 
Is it the kind of feature which would be considered useful upstream?
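
For illustration only, the sketch below shows the kind of byte output
such a directive could produce for a given size.  It assumes the
recommended multi-byte nop encodings from the Intel SDM and a simple
greedy split into chunks of at most 9 bytes.  The names long_nops,
MAX_NOP_LEN and fill_nops are hypothetical; how an actual gas
implementation would choose encodings is not specified here.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NOP_LEN 9

/* long_nops[n] holds one n-byte nop instruction (index 0 is unused). */
static const unsigned char long_nops[MAX_NOP_LEN + 1][MAX_NOP_LEN] = {
    {0},
    {0x90},
    {0x66, 0x90},
    {0x0f, 0x1f, 0x00},
    {0x0f, 0x1f, 0x40, 0x00},
    {0x0f, 0x1f, 0x44, 0x00, 0x00},
    {0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00},
    {0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00},
    {0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
};

/*
 * Fill buf[0..len) with nop instructions, always using the longest
 * encoding that still fits.  Returns the number of instructions emitted.
 */
static size_t fill_nops(unsigned char *buf, size_t len)
{
    size_t count = 0;

    while (len > 0) {
        size_t n = len < MAX_NOP_LEN ? len : MAX_NOP_LEN;

        memcpy(buf, long_nops[n], n);
        buf += n;
        len -= n;
        count++;
    }
    return count;
}
```

With this scheme, 11 bytes of padding become two instructions (a 9-byte
nop followed by a 2-byte nop) rather than eleven single-byte ones.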

~Andrew


