This is the mail archive of the mailing list for the glibc project.


Re: Alternative libio vtable hardening approach

On 06/01/2016 02:25 AM, Carlos O'Donell wrote:
On 05/31/2016 11:56 AM, Florian Weimer wrote:
On 05/31/2016 05:09 PM, Carlos O'Donell wrote:

The question I'm asking myself is: Should we be pointer mangling any of
these pointers to make it harder to scan for the table?

I'm not sure what you are trying to address. The table itself should
be in RELRO memory, and its address should be at a fixed offset
relative to the PC at the site of the check, so all that should be
pretty safe.

That's exactly the question I'm asking.

If I understand your reasoning:

(a) The table is in .rodata (verified: the tables present are in .rodata)

(b) The table is always PC-relative addressable from ROP gadgets
    within the library so finding it is easy.
    - Hiding the pointers with mangling doesn't make this harder.

I don't get the “scanning” and “ROP” parts. :)

(c) Mangling degrades performance.

Not just that, it makes compatibility so much harder. (We would have to unmangle vtables if needed.)

I'm only worried about the flag that turns off the check.

I don't see any way to provide backwards compatibility other than to
disable the check with a global.

We could encrypt the value of the global to make it harder to unset?

Mike suggests this too.

Yes, it's a good idea.

What we really want is a set of internal initializers that are run during
static dlopen / dlmopen that allow the library to copy key state into the
other namespace.

We don't need to pass across any actual data, or affect relocation, so
it's easier. malloc does the following to detect whether
it's running across a linking boundary:

#ifdef SHARED
  /* In case this libc copy is in a non-default namespace, never use brk.
     Likewise if dlopened from statically linked program.  */
  Dl_info di;
  struct link_map *l;

  if (_dl_open_hook != NULL
      || (_dl_addr (ptmalloc_init, &di, &l, NULL) != 0
          && l->l_ns != LM_ID_BASE))
    __morecore = __failing_morecore;

Ah, OK, that answers my question. You lookup your own link map and then
determine you're not in the default namespace and reset the variables.
That's a sufficient solution. Thanks.

I also found this, which seems to cover the same thing:

/* Set nonzero if we have to be prepared for more than one libc being
   used in the process.  Safe assumption if initializer never runs.  */
int __libc_multiple_libcs attribute_hidden = 1;

The malloc code doesn't use it, maybe because it came from out-of-tree ptmalloc.

Would csu/init-first.c be the right place to perform the multiple-libc check and set the flag? Does it run for both static and dynamic binaries?

Again, my thoughts were along the lines of some kind of callback
registration process where the old structure pointers are registered
along with a callback.

Based on our experience with the fork handlers, I think explicit calls across subsystems are the way to go. It's the best way to express ordering requirements.

     a4c:       48 8b 83 d8 00 00 00    mov    0xd8(%rbx),%rax
     a53:       48 29 f2                sub    %rsi,%rdx
     a56:       48 81 78 38 00 00 00    cmpq $_IO_file_xsputn_fast,0x38(%rax)
     a5e:       0f 85 e6 11 00 00       jne    1c4a
     a64:       e8 00 00 00 00          callq  _IO_file_xsputn_fast
    1c4a:       e8 00 00 00 00          callq  _IO_file_xsputn_slow
    1c4f:       e9 15 ee ff ff          jmpq   a69

This expansion has much more reasonable size, but only makes sense if
the call site is monomorphic in practice.

I don't follow. The slow path can still use the vtables as required by
legacy binaries and the call site there can be polymorphic? Maybe I've
misunderstood the suggestion.

It's just a matter of performance. The comparison against the expected pointer is worthwhile if, in almost all cases, there is a match and you hit the fast path. If it's just 50/50 or something like that (as it currently would be, because fprintf and sprintf use different xsputn implementations), it does not improve performance. Calling the slow path directly would be preferable.

