This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH] explicit_bzero final


On Wed, Dec 14, 2016 at 11:58 AM, Jeff Law <law@redhat.com> wrote:
> On 12/13/2016 06:04 PM, Zack Weinberg wrote:
>> On Tue, Dec 13, 2016 at 5:46 PM, Jeff Law <law@redhat.com> wrote:
>>>
>>> It's going to be hard for GCC to recognize the memset_chk+read_memory as
>>> an explicit_bzero.  It'll have to infer it from the code pattern, which is
>>> inherently prone to misses.
>>
>> Maybe I don't understand what you have in mind, but I don't think you
>> have to infer anything from the code pattern.  Just treat
>> __explicit_bzero_fence the same as explicit_bzero itself.  The
>> arguments are the same and it won't be used for anything else.
>>
>> I have no idea how hard this would be to implement in present-day GCC,
>> but the way I imagine explicit_bzero support working is: there's an
>> __attribute__ for variables (let's call it "sensitive") that triggers
>> a forward taint analysis: every data object transitively data or
>> control dependent on the "sensitive" variable also becomes
>> "sensitive".  This has no effect on data objects whose lifetime
>> extends beyond the function in which they were declared, but if they
>> don't, they are erased when their lifetime ends.
>
> The concept of tainting variables and tracking that property seems distinct
> from making sure that the compiler properly handles explicit_bzero.  And
> just to be clear, I see that property as having value.

The thing is, if you do the taint tracking, that's _all_ you need to
do, I think.
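
To make that concrete, here is just a sketch of what user code would look
like under the scheme I'm imagining; "sensitive" is a hypothetical attribute
name, not something GCC implements today (current GCC would merely warn that
it is ignored), and the erasure described in the comments is work the
compiler would do, not the programmer:

```
typedef __SIZE_TYPE__ size_t;

struct k { unsigned long long lo, hi; };

extern struct k get_key (void);
extern void do_encrypt (struct k, char *, const char *, size_t);

void
encrypt (char *out, const char *in, size_t n)
{
  /* Hypothetical attribute: marks `key' as sensitive and kicks off
     the forward taint analysis described above.  */
  struct k key __attribute__ ((sensitive)) = get_key ();
  do_encrypt (key, out, in, n);
  /* No explicit_bzero call here: the compiler would erase every
     register and stack slot tainted by `key' when its lifetime ends.  */
}
```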

>> There's probably some wacky corner cases I haven't thought of, but I
>> don't see why in-glibc expansion of explicit_bzero to memset+fence
>> should be a problem.
>
> My worry is ensuring that the compiler doesn't ever remove these stores as
> dead.

You don't have to worry about the stores generated by memset+fence,
because they will be redundant with the stores generated by the taint
tracker.
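
(As an aside, the reason the fence makes those stores immune to dead-store
elimination is that the compiler has to assume __explicit_bzero_fence may
read the bytes that were just cleared.  Purely as a sketch, and not
necessarily what the actual patch does, the same effect can be had with an
empty asm that takes the pointer and clobbers memory:

```
#include <stddef.h>
#include <string.h>

/* Sketch only.  The empty asm receives the pointer and declares a
   "memory" clobber, so the compiler must assume the asm may read *p;
   the zeroing stores before it therefore cannot be deleted as dead.  */
static inline void
my_bzero_fence (const void *p, size_t n)
{
  __asm__ __volatile__ ("" : : "r" (p), "r" (n) : "memory");
}

static inline void
my_explicit_bzero (void *p, size_t n)
{
  memset (p, 0, n);
  my_bzero_fence (p, n);
}
```

The names my_bzero_fence and my_explicit_bzero are made up for the
illustration.)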

Let's recap the motivating example for the inline, since it's been
a while and Jeff may not have seen it:

```
typedef __SIZE_TYPE__ size_t;

extern void *memset (void *, int, size_t);
extern void explicit_bzero (void *, size_t);
extern void __explicit_bzero_fence (const void *, size_t);

#ifdef INLINE_EXPLICIT_BZERO
extern inline __attribute__ ((gnu_inline)) void
explicit_bzero(void *p, size_t n)
{
  memset (p, 0, n);
  __explicit_bzero_fence (p, n);
}
#endif

// 128 bits of key material
struct k { unsigned long long lo, hi; };

extern struct k get_key ();
extern void do_encrypt (struct k, char *, const char *, size_t);

void
encrypt (char *out, const char *in, size_t n)
{
  struct k key = get_key ();
  do_encrypt (key, out, in, n);
#ifndef NO_CLEAR
  explicit_bzero (&key, sizeof (struct k));
#endif
}
```

If you compile this for x86-64 with gcc 6.2 and `-DNO_CLEAR -O2 -S`, you
will see that the `key` variable lives entirely in registers.  If you
compile it with neither `-DNO_CLEAR` nor `-DINLINE_EXPLICIT_BZERO`,
`key` is spilled to the stack immediately after the call to `get_key`
(this is easier to see if you turn off RTL scheduling).  That spill is
what we want to avoid.  With `-DINLINE_EXPLICIT_BZERO` ... the spill
still happens, disappointingly, but it _could have_ been eliminated,
because those stack slots are not read again before the memset
overwrites them with zeroes.  (LLVM 3.8 does eliminate the spill.)

Anyway, right before lowering to RTL, the IR with `-DINLINE_EXPLICIT_BZERO` is

```
encrypt (char * out, const char * in, size_t n)
{
  struct k key;

  <bb 2>:
  key = get_key ();
  do_encrypt (key, out_3(D), in_4(D), n_5(D));
  memset (&key, 0, 16);
  __explicit_bzero_fence (&key, 16);
  key ={v} {CLOBBER};
  return;
}
```

With a compiler that implemented the taint analysis I am imagining, we
would have instead

```
encrypt (char * out, const char * in, size_t n)
{
  struct k key __attribute__ ((tainted));

  <bb 2>:
  key = get_key ();
  do_encrypt (key, out_3(D), in_4(D), n_5(D));
  memset (&key, 0, 16);
  key = {0, 0};
  USE (key);
  key ={v} {CLOBBER};
  return;
}
```

where "USE (key)" has the semantics of RTL (use ...) - I don't
remember what the tree equivalent is.  No assembly generated, but
preceding stores are preserved.

Now, hopefully the RTL optimizers will figure out that the memset is
unnecessary and remove it, but even if they don't, the codegen is
still correct.

(Alas, looking at the assembly here has made me realize that this is
maybe harder to implement than I thought; it probably can't be done
just by slapping in some writes and USEs at tree level.  It needs to
work out which of the _hard registers_ and _stack slots_ are tainted,
and generate the appropriate clears, _after_ register allocation.  And
it needs to assume that function calls _don't_ clobber call-clobbered
registers, since a callee may return without ever touching them,
leaving key material behind.  Oh my aching head.)

> Understood, WRT LTO today.  I'm not worried about today, I'm worried about
> what happens in the future when the assumptions we're making today about
> what the compiler can and can not do may no longer hold.

The thing is that until substantial work is done on both libc and the
compiler, we can't _allow_ LTO to see into libc.  This is not the only
place where we rely on LTO not seeing through a function-call boundary
for correctness.

zw

