[PATCH,V4 10/14] gas: synthesize CFI for hand-written asm

Wed Jan 10 11:26:41 GMT 2024

On 1/10/24 01:44, Jan Beulich wrote:
> On 10.01.2024 07:10, Indu Bhagat wrote:
>> On 1/9/24 01:30, Jan Beulich wrote:
>>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>>>>
>>>>> With APX 16-bit operand size isn't necessarily represented by a 66h
>>>>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>>>>> Therefore both the comment and even more so ...
>>>>>
>>>>>> +   The current users of this API are in the handlers for PUSH, POP
>>>>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>>>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>>>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>>>>> +   instruction has a 16-bit operand.  */
>>>>>> +
>>>>>> +static bool
>>>>>> +ginsn_prefix_66H_p (i386_insn insn)
>>>>>
>>>>> ... the function name would better not allude to just the legacy
>>>>> encoding. Maybe ginsn_opsize_prefix_p()?
>>>>>
>>>>
>>>> Isnt 66H_p more readable and easier to follow because that's what the
>>>> function is currently checking ?  If more scenarios were being handled,
>>>> ginsn_opsize_prefix_p () would fit better.
>>>
>>> Well, as said - with APX you can't get away with just 0x66 prefix checking.
>>> That prefix is simply illegal to use with EVEX-encoded insns.
>>>
>>
>> I am using the following in ginsn_opsize_prefix_p ():
>>
>> !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66
> 
> That addresses one half of my earlier remarks. Note however that elsewhere
> we never check i.prefix[DATA_PREFIX] against being 0x66; we only ever check
> for it being zero or non-zero. I'd like this to remain consistent.
> 
> For EVEX-encoded APX insns this isn't going to be sufficient though. See
> respective code in process_suffix():
> 
> 	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
> 	     needs to be adjusted.  */
> 	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
> 	    {
> 	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
> 	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
> 	    }
> 	  else if (!add_prefix (prefix))
> 	    return 0;
> 
> So you'll need to also check for that combination, plus take care of
> avoiding insns where PREFIX_0X66 alters operation, not operand size
> (ADCX being an example).
> 

But I am now disallowing APX insns for now, altogether, by doing this in 
the x86_ginsn_new:

/* Until it is clear how to handle APX NDD and other new opcodes,   disallow
      them from SCFI.  */
   if (is_apx_rex2_encoding ()
       || (i.tm.opcode_modifier.evex && is_apx_evex_encoding ()))
     {
       as_bad (_("SCFI: unsupported APX op %#x may cause incorrect CFI"),
               opcode);
       return ginsn;
     }

>>>>>> +static ginsnS *
>>>>>> +x86_ginsn_move (const symbolS *insn_end_sym)
>>>>>> +{
>>>>>> +  ginsnS *ginsn;
>>>>>> +  unsigned int dst_reg;
>>>>>> +  unsigned int src_reg;
>>>>>> +  offsetT src_disp = 0;
>>>>>> +  offsetT dst_disp = 0;
>>>>>> +  const reg_entry *dst = NULL;
>>>>>> +  const reg_entry *src = NULL;
>>>>>> +  uint16_t opcode = i.tm.base_opcode;
>>>>>> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
>>>>>> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
>>>>>> +
>>>>>> +  if (opcode == 0x8b || opcode == 0x8a)
>>>>>
>>>>> Above when handling ALU insns you didn't care about byte ops - why do
>>>>> you do so here (by checking for 0x8a, and 0x88 below)?
>>>>
>>>> Because there may be weird code that saves and restores 8-byte registers
>>>> on stack.  For ALU insns, if the destination is REG_SP/REG_FP, we will
>>>> detect them in the unhandled track.
>>>
>>> You talk about 8-byte registers when I asked about byte reg moves. If
>>> there's a MOV targeting %spl or %bpl, is that really any different from,
>>> say, an ADD doing so?
>>>
>>
>> ATM, Yes. SCFI has heuristics implemented, _some_ of which (can be
>> evolved) are opcode specific. E.g.,
>>     - MOV %rsp, %rbp will make SCFI machinery check if this is making the
>> CFA switch to %rbp.  If the target was %bpl, since we track 8-byte
>> registers, it will still trigger the same code path. A bug as I ack below.
>>     - ADD rsp + 0 = rbp will not trigger CFA switch to RBP. Should it ?
>> Perhaps yes? Its on my TODO list (with low priority) to evolve this bit.
> 
> I'm not sure about this. Either way such special cases may want documenting
> explicitly.
> 

OK.  There is also need to document ginsn, SCFI workings with examples 
of unsupported cases etc.  I will invest some time on this.

>>>>>> +    }
>>>>>> +
>>>>>> +  ginsn_set_where (ginsn);
>>>>>> +
>>>>>> +  return ginsn;
>>>>>> +}
>>>>>
>>>>> Throughout this function (and perhaps others as well), how come you
>>>>> don't consider operand size at all? It matters whether results are
>>>>> 64-bit, 32-bit, or 16-bit, after all.
>>>>
>>>> Operation size matters: No, not for all instructions in context of SCFI.
>>>> For instructions using stack (save / restore), the size is important.
>>>> But for other instructions, operation size will not affect SCFI correctness.
>>>
>>> But aren't you wrongly treating an update of %rbp and an update of,
>>> say, %bp the same then? The latter should end up as untracable, aiui.
>>>
>>
>> For ALU ops:
>>     - ADD/SUB reg1, reg2
>>       If reg2 is the same as REG_CFA, then even in 64-bit mode, this
>> causes untraceability. So there is untraceability irrespective of
>> operation size. On the other hand, if uninteresting, it will remain
>> unintersting irrespective of operation size.
>>     - ADD/SUB imm, reg will never trigger untraceability, irrespective of
>> size. There is the element of ignoring the carry bit, but I think the
>> current diagnostics around "asymmetrical save/restore" and the planned
>> "unbalanced stack at return" should provide user with some safety net.
> 
> I don't see how the carry flag would matter here. What does matter is the
> size of the register: Anything not 64-bit will need to trigger
> untraceability imo.
> 

  - For a 16-bit ADD/SUB imm, reg insn: if the 16-bit result had a 
carry, it will be ignored as only 16-bit result is copied to the 
destination address IIUC.  If there is no carry bit, 16-bit ADD/SUB imm, 
reg is semantically the same insn as a 64-bit ADD/SUB imm, reg for the 
same immediate.
  - For 32-bit ADD/SUB imm, reg insn: yes, I stand corrected. It should 
trigger untraceability as upper 32-bits are zeroed out. IMO, this 32-bit 
operation should likely be unintentional by the user; something better 
alerted to the user if we can.

I will come back to this item of tracking operation sizes when I fix the 
MOV issue below.

>>     - Other ALU ops should all trigger untracebility alike, irrespective
>> of operation size.
>> Hence, my statement that ignoring operation size is fine here for SCFI.
> 
> As per above, I disagree.
> 
>> For MOV ops:
>>     - 8-bit/16-bit MOV should trigger untracebility. I ack this as bug to
>> be fixed. Thanks to your careful review. ATM, I plan to deal with this
>> after the series goes in, unless you have strong opinion about this.
> 
> 32-bit MOV should, too. And yes, I'm okay-ish with such being dealt with
> later, as long as the open work item is easily recognizable (and hence
> it'll be easy to check that all such work items are gone at the point
> where the feature is promoted from experimental).
> 

Right, 32-bit MOV should too.

>>>>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>>>>> +   machine instruction.
>>>>>> +
>>>>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>>>>> +   if failure.
>>>>>> +
>>>>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>>>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>>>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>>>>> +   machine instructions that must be translated into the corresponding ginsns
>>>>>> +   to ensure correctness of SCFI:
>>>>>> +     - All instructions affecting the two registers that could potentially
>>>>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>>>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>>>>> +       now.
>>>>>> +     - All change of flow instructions: conditional and unconditional branches,
>>>>>> +       call and return from functions.
>>>>>> +     - All instructions that can potentially be a register save / restore
>>>>>> +       operation.
>>>>>
>>>>> This could do with being more fine grained, as "potentially" is pretty vague,
>>>>> and (as per earlier version review comments) my take on this is a much wider
>>>>> set than yours.
>>>>
>>>> I would like to understand more on this comment, especially the "my take
>>>> on this is a much wider set than yours".  I see its being hinted at in
>>>> different flavors in the current review.
>>>>
>>>> I see some issues pointed out in this review (addressing modes of mov
>>>> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your
>>>> concerns are wider than this.
>>>
>>> I earlier version review I mentioned that even vector or mask registers
>>> could in principle be use to hold preserved GPR values. I seem to recall
>>> that you said you wouldn't want to deal with such. Hence my use of
>>> "wider set": Just to give an example, "kmovq %rbp, %k0" plus later
>>> "kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
>>> register save / restore operation".
>>>
>>
>> Hmm. I will need to understand them on a case to case basis.  For the
>> case of "kmovq %rbp, %k0" / "kmovq %k0, %rbp" how can this be used as
>> save/restore to/from stack ?
> 
> Maybe I'm still not having a clear enough picture of what forms of insns
> you want to fully track. Said insn forms don't access the stack. But they
> could in principle be used to preserve a certain register. Such preserving
> of registers is part of what needs encoding in CFI, isn't it?
> 

The kind of preserving is usually on stack. It can also be in another 
callee-saved register, in theory, but the latter defeats the purpose of 
state saving across calls.

BTW, I had earlier written down some notes about SCFI 
https://sourceware.org/pipermail/binutils/2023-September/129558.html
Some bits are stale already though, but may be it helps.

>>>>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>>>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>>>>> +			      GINSN_DST_REG, dw2_regnum);
>>>>>> +      ginsn_set_where (ginsn);
>>>>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>>>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>>>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>>>>> +				  GINSN_DST_REG, REG_SP, 0);
>>>>>> +      ginsn_set_where (ginsn_next);
>>>>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>>>> +      break;
>>>>>> +
>>>>>> +    case 0xff:
>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>> +	break;
>>>>>> +      /* push from mem.  */
>>>>>> +      if (i.tm.extension_opcode == 6)
>>>>>> +	{
>>>>>> +	  /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>>>>> +	  if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>>>>> +	    stack_opnd_size = 2;
>>>>>> +	  ginsn = ginsn_new_sub (insn_end_sym, false,
>>>>>> +				 GINSN_SRC_REG, REG_SP, 0,
>>>>>> +				 GINSN_SRC_IMM, 0, stack_opnd_size,
>>>>>> +				 GINSN_DST_REG, REG_SP, 0);
>>>>>> +	  ginsn_set_where (ginsn);
>>>>>> +	  /* These instructions have no imm, only indirect access.  */
>>>>>> +	  gas_assert (i.base_reg);
>>>>>> +	  dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>>> +	  ginsn_next = ginsn_new_store (insn_end_sym, false,
>>>>>> +					GINSN_SRC_INDIRECT, dw2_regnum,
>>>>>> +					GINSN_DST_INDIRECT, REG_SP, 0);
>>>>>> +	  ginsn_set_where (ginsn_next);
>>>>>> +	  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>>>> +	}
>>>>>> +      else if (i.tm.extension_opcode == 4)
>>>>>> +	{
>>>>>> +	  /* jmp r/m.  E.g., notrack jmp *%rax.  */
>>>>>> +	  if (i.reg_operands)
>>>>>> +	    {
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>> +	    }
>>>>>> +	  else if (i.mem_operands && i.index_reg)
>>>>>> +	    {
>>>>>> +	      /* jmp    *0x0(,%rax,8).  */
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.index_reg);
>>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>
>>>>> What if both base and index are in use? Like for PUSH/POP, all addressing
>>>>> forms are permitted here and ...
>>>>>
>>>>>> +	    }
>>>>>> +	  else if (i.mem_operands && i.base_reg)
>>>>>> +	    {
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>> +	    }
>>>>>> +	}
>>>>>> +      else if (i.tm.extension_opcode == 2)
>>>>>> +	{
>>>>>> +	  /* 0xFF /2 (call).  */
>>>>>> +	  if (i.reg_operands)
>>>>>> +	    {
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>> +	    }
>>>>>> +	  else if (i.mem_operands && i.base_reg)
>>>>>> +	    {
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>> +	    }
>>>>>
>>>>> ... here.
>>>>
>>>> For the indirect jump and call instructions, the target destination is
>>>> not necessary to be known.  Indirect jumps will cause SCFI to error out
>>>> as control flow cannot be constructed.
>>>>
>>>> Call instructions are assumed to transfer control out of function.
>>>>
>>>> In other words, more information in these cases will not be of use to SCFI.
>>>
>>> But then aren't you already doing too much work here? With what you say,
>>> you don't care about the kind of operand at all, merely the fact it's a
>>> CALL might be interesting. Albeit even there I'd then wonder why - if
>>> the function called isn't of interest, how is CALL different from just
>>> NOP? The only CALL you'd need to be concerned about would be the direct
>>> form with a displacement of 0, as indicated elsewhere. But of course
>>> tricky code might also use variants of that; see e.g. the retpoline code
>>> that was introduced into kernels a couple of years back. Recognizing
>>> and bailing on such tricky code may be desirable, if not necessary.
>>>
>>
>> Tracking operands for CALL instructions does not provide value ATM.  We
>> do not even split a BB if there is a CALL instruction (the assumption is
>> that CALL insn transfers control out of function).
>>
>> You're right that we need to treat some CALLs differently (because of
>> its affect of control flow and SCFI correctness).
>>
>> RE: doing more work. Sure, but the vision for ginsn was to allow a
>> representation where other analyses may be added. The representation of
>> ginsn will need revisions to get there, but keeping an explicit
>> GINSN_TYPE_CALL seems good.
> 
> That's okay, but with as little dead (for now) code as possible. If the
> operand isn't interesting, why deal with the various operand forms right
> now?
> 

I dont have a good answer to this. If I dont add opnds to 
GINSN_TYPE_CALL now, I will have to put some dummy default args (which I 
personally find confusing to see; I already have some in the code 
elsewhere for RegIP/RegIZ). And yes, we have disagreed on this matter 
before too.

>>>>>> +	}
>>>>>> +      break;
>>>>>> +
>>>>>> +    case 0xc2:
>>>>>> +    case 0xc3:
>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>> +	break;
>>>>>> +      /* Near ret.  */
>>>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>>>> +      ginsn_set_where (ginsn);
>>>>>> +      break;
>>>>>
>>>>> No tracking of the stack pointer adjustment?
>>>>
>>>> No stack unwind information for a function is relevant after the
>>>> function has returned.  So, tracking of stack pointer adjustment by
>>>> return is not necessary.
>>>
>>> What information does the "return" insn then carry, beyond it being
>>> an unconditional branch (which you have a different insn for)?
>>>
>>
>> "return" does not carry any more information than just the
>> GINSN_TYPE_RETURN as ginsn->type.
>>
>> So then why support both "return" and an unconditional branch: The
>> intention is to carry the semantic difference between ret and
>> unconditional jump.  Unconditional jumps may be to a label within
>> function, and in those cases, we use it for some validation and BB
>> linking when creating CFG. Return, OTOH, always indicates exit from
>> function.
>>
>> For SCFI purposes, above is the one use.  Future analyses may find other
>> use-cases for an explicit return ginsn.  But IMO, keeping
>> GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.
> 
> Okay. And here you don't bother decoding operands. Hence why I'm
> asking the same to be the case for (e.g.) CALL.
> 

It seems I will need to deal with operands of RETURN insn soon.  For 
implementing "Warn if imbalanced stack at return", we will need this info.

>>>>>> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>>>>>>       /* We are ready to output the insn.  */
>>>>>>       output_insn (last_insn);
>>>>>>     
>>>>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>>>>> +     performed in i386_target_format.  */
>>>>>
>>>>> See my earlier comment - it's yet more restrictive (as in not covering
>>>>> e.g. the Windows ABI, which importantly is also used in EFI).
>>>>
>>>> Not clear, can you please elaborate ?
>>>
>>> Hmm, it's not clear to me what's not clear in my earlier comment. The
>>> set of preserved registers, for example, differs between the SysV and
>>> the Windows ABIs (see x86_scfi_callee_saved_p()). Then again, thinking
>>> of it, talking of anything ABI-ish in assembly code is questionable.
>>> An assembly function calling another assembly function may use an
>>> entirely custom "ABI". You just cannot guess upon preserved registers.
>>>
>>
>> I think the confusion is stemming from my usage of AMD64 colloquially to
>> refer to System V ABI for x86_64. I think "System V AMD64 ABI" is the
>> correct term. I will use that.
>>
>> And yes, GAS can only work out SCFI if there is ABI adherence.  If input
>> asm does not adhere to the supported ABIs, and the user invokes SCFI,
>> then the user is on their own.  GAS will not (rather can not) warn about
>> ABI incompliance.
> 
> I don't recall documentation stating this explicitly. This is a pretty
> important aspect for users to consider, after all.
> 

I added a reference in gas/NEWS for now.

* Experimental support in GAS to synthesize CFI for ABI-conformant,
   hand-written asm using the new command line option 
--scfi=experimental  on x86-64.

I will add a mention of "ABI conformance" in as.texi.