This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Protect _dl_profile_fixup data-dependency order [BZ #23690]


On 09/19/2018 05:48 PM, Tulio Magno Quites Machado Filho wrote:
> The field reloc_result->addr is used to indicate if the rest of the
> fields of reloc_result have already been written, creating a
> data-dependency order.
> Reading reloc_result->addr to the variable value requires to complete
> before reading the rest of the fields of reloc_result.
> Likewise, the writes to the other fields of the reloc_result must
> complete before reloc_result-addr is updated.

Good catch. This needs just a little more work and it's done.

When you're ready please post a v2 and TO me for review.

> 2018-09-19  Tulio Magno Quites Machado Filho  <tuliom@linux.ibm.com>
> 
> 	[BZ #23690]
> 	* elf/dl-runtime.c (_dl_profile_fixup): Guarantee memory
> 	modification order when accessing reloc_result->addr.
> 
> Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
> ---
>  elf/dl-runtime.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/elf/dl-runtime.c b/elf/dl-runtime.c
> index 63bbc89776..6518e66fd6 100644
> --- a/elf/dl-runtime.c
> +++ b/elf/dl-runtime.c
> @@ -183,9 +183,16 @@ _dl_profile_fixup (
>    /* This is the address in the array where we store the result of previous
>       relocations.  */
>    struct reloc_result *reloc_result = &l->l_reloc_result[reloc_index];
> -  DL_FIXUP_VALUE_TYPE *resultp = &reloc_result->addr;
>  
> -  DL_FIXUP_VALUE_TYPE value = *resultp;
> +  /* CONCURRENCY NOTES:
> +
> +     The following code uses reloc_result->addr to indicate if it is the first
> +     time this object is being relocated.
> +     Reading/Writing from/to reloc_result->addr must not happen before previous
> +     writes to reloc_result complete as they could end-up with an incomplete
> +     struct.  */

This is not quite accurate. The following code uses DL_FIXUP_VALUE_CODE_ADDR to
access a potential member of addr to indicate it is the first time the object is
being relocated. The "guard" variable in this access is either the address itself
or the ip of a function descriptor (the only two implementations of "code addr").

Also we don't explain what other data is being accessed? What non-function-locale
data is being updated as part of the guard variable check?

In the case of the function descriptor it's easy to know that the fdesc's gp is 
going to be written to, and without a acquire we won't see that write, or the new 
ip value, so we might have two threads update the same thing, in theory a benign 
data race which we should fix.

In theory elf_ifunc_invoke() is called and that could write to arbitrary memory
with the IFUNC resolver, whose values should be seen by any future thread via
the new load acquire you are putting in place. You might be able to run the IFUNC
resolver twice which could be bad?


> +  DL_FIXUP_VALUE_TYPE *resultp = &reloc_result->addr;
> +  DL_FIXUP_VALUE_TYPE value = atomic_load_acquire(resultp);

You are potentially requiring an atomic load of a structure whose size can be
an arbitrary size depending on machine and ABI design (64-bit fore 32-bit hppa,
and 128-bit for 64-bit ia64). These architectures might not have such wide
atomic operations, and ...

>    if (DL_FIXUP_VALUE_CODE_ADDR (value) == 0)

... they only actually look at the value returned by DL_FIXUP_VALUE_CODE_ADDR(value).

The "guard" in this case is the IP value in the address or function descriptor
(only two implementations we have in glibc).

You need an atomic load acquire of the guard (ip), but it's hidden behind macros.

>      {
>        /* This is the first time we have to relocate this object.  */
> @@ -346,7 +353,10 @@ _dl_profile_fixup (
>  
>        /* Store the result for later runs.  */
>        if (__glibc_likely (! GLRO(dl_bind_not)))
> -	*resultp = value;
> +	/* Guarantee all previous writes complete before
> +	   resultp (reloc_result->addr) is updated.  See CONCURRENCY NOTES
> +	   earlier  */
> +	atomic_store_release(resultp, value);

Likewise you need an atomic store release of the ip value in the function
descriptor to avoid needing arbitrarily large atomic load/stores.

>      }
>  
>    /* By default we do not call the pltexit function.  */
> 

What I'd like to see:

- Refactor macros to get access to the "code addr" (ip of the fdesc or regular
  addr), and do the atomic loads and stores against that. Or feel free to
  delegate this to more macros at the machine level e.g.
  DL_FIXUP_ATOMIC_LOAD_ACQUIRE_REALLY_LONG_NAME_VALUE_CODE_ADDR (value),
  but I feel this looses out on the fact that we have generic atomic macros
  and so could use something like DL_FIXUP_VALUE_ADDR_CODE which returns
  the address of the code addr to load, and then you operate atomically on
  that to do the load.

- Fix ./sysdeps/hppa/dl-machine.h (elf_machine_fixup_plt) to do an atomic
  store release to update ip. Likewise for ia64. Both should reference the
  concurrency notes in elf/dl-runtime.c.

- Can you write a test for this? A test which starts several threads, barriers
  them all, then starts them, and lets them try to resolve (test is compiled
  lazy binding forced) the plt entries, then repeat for a while, each iteration
  with a different function that hasn't been bound yet (maybe synthetically
  generate a few thousand functions). If we ever see this fail we'll know we
  did something wrong.

And also feel free to tell me I'm wrong :-)

-- 
Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]