This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] explicit_bzero final


On Tue, Dec 13, 2016 at 5:46 PM, Jeff Law <law@redhat.com> wrote:
> On 12/13/2016 08:53 AM, Zack Weinberg wrote:
>>
>> It's probably good to loop in some GCC people, but off the top of my
>> head, I think it should be sufficient to have it treat a call to
>> '__glibc_read_memory' as triggering the same special behavior as a
>> call to 'explicit_bzero'.  Of course that would mean
>> __glibc_read_memory can't be used for other cases where we need a
>> "those writes are not dead" optimization fence, so maybe I should
>> rename it __explicit_bzero_fence or something like that.
>
> It's going to be hard for GCC to recognize the memset_chk+read_memory as an
> explict_bzero.  It'll have to infer it from the code pattern, which is
> inherently prone to misses.

Maybe I don't understand what you have in mind, but I don't think you
have to infer anything from the code pattern.  Just treat
__explicit_bzero_fence the same as explicit_bzero itself.  The
arguments are the same and it won't be used for anything else.

I have no idea how hard this would be to implement in present-day GCC,
but the way I imagine explicit_bzero support working is: there's an
__attribute__ for variables (let's call it "sensitive") that triggers
a forward taint analysis: every data object transitively data or
control dependent on the "sensitive" variable also becomes
"sensitive".  This has no effect on data objects whose lifetime
extends beyond the function in which they were declared, but if they
don't, they are erased when their lifetime ends.

Then, the effect of the explicit_bzero and __explicit_bzero_fence
builtins just is to tag the variable whose address they receive with
the "sensitive" attribute; the explicit_bzero builtin also behaves as
__builtin_memset(s, 0, n) (this is in case the sensitive variable is
allocated on the heap or otherwise survives -- it still needs to be
erased when the programmer said it should be erased).  This might need
some sort of _backward_ data flow analysis to figure out which
source-code variable corresponds to the SSA name received by the
builtin, I dunno.

There's probably some wacky corner cases I haven't thought of, but I
don't see why in-glibc expansion of explicit_bzero to memset+fence
should be a problem.

>From the glibc side of things, this inline (with or without
fortification) is highly desirable for two reasons.  Most important,
this is how we mitigate a problem with variables whose address is only
taken in a call to explicit_bzero.  With naive code generation, that
variable might live in CPU registers for its entire lifetime, but then
get copied into RAM just so explicit_bzero can erase it again, which
opens a window where the sensitive data is _more_ exposed than if the
programmer hadn't tried to erase it at all.  With the inline, the data
doesn't get copied out of registers.  (It doesn't get _erased_ from
the registers, but that was true either way.)

We also have a nasty interaction between internal PLT bypass and
ifuncs which means that a hypothetical __explicit_bzero_chk is
ridiculously difficult to implement.  I tried that once already and it
did not go well.

> I'd also worry that this style definition may be prone to goofs.  The keys
> being that if LTO were to see inside __glibc_read_memory and realize that
> the reads are dead and removes the reads.

Right now, LTO into glibc will have all kinds of other negative
consequences (Joseph can explain in more detail) and we don't even try
to support it.  I've carefully documented in the source code that, if
we ever do try to support LTO, __glibc_read_memory (/
__explicit_bzero_fence) must remain opaque to the compiler.

zw


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]