[PATCH] x86-64: Restore LD_PREFER_MAP_32BIT_EXEC support [BZ #28656]

H.J. Lu hjl.tools@gmail.com
Mon Aug 8 17:02:08 GMT 2022


On Mon, Aug 8, 2022 at 6:29 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu:
>
> > On Tue, Aug 2, 2022 at 1:00 AM Florian Weimer <fweimer@redhat.com> wrote:
> >>
> >> * H. J. Lu via Libc-alpha:
> >>
> >> > Crossing 2GB boundaries with indirect calls and jumps can use more
> >> > branch prediction resources on several Intel CPUs.  There is visible
> >> > performance improvement on workloads with many PLT calls when executable
> >> > and shared libraries are mmapped below 2GB.  Add the Prefer_MAP_32BIT_EXEC
> >> > bit so that mmap will try to map executable or denywrite pages with
> >> > MAP_32BIT first.
> >> >
> >> > NB: Prefer_MAP_32BIT_EXEC reduces the bits available for address space
> >> > layout randomization (ASLR).  It is always disabled for SUID programs
> >> > and can only be enabled by setting the environment variable
> >> > LD_PREFER_MAP_32BIT_EXEC.
> >>
> >> If the performance benefits are significant, this should be handled at
> >> the kernel level.  Only the kernel can put the main program, ld.so and
> >> the vDSO into the same 2GB window (presumably with the main program at
> >> the top, so that the heap can grow almost indefinitely).
> >
> > ld.so and vDSO aren't performance sensitive.  But we need to handle PIE.
>
> I don't think this is necessarily true.  It depends on execution
> profile.

True.

> clock_gettime in the vDSO could certainly matter to some workloads.
>
> >> For mapping shared objects, we can give the kernel a hint that they will
> >> eventually contain an executable mapping.  If the kernel could reuse
> >> MAP_DENYWRITE for that, no glibc changes would be needed after all.
> >>
> >> Doing this in glibc is only a very partial solution, so I'd
> >> appreciate it if it could be fixed properly in the kernel.
> >>
> >
> > There is no easy way for the kernel to selectively mmap PIE with MAP_32BIT.
> > Can ld.so re-exec PIE with "ld.so PIE" so that ld.so can mmap PIE with
> > MAP_32BIT?
>
> In theory, yes, but that still leaves the vDSO issue.  The kernel could
> cover that as well.

Kernel changes may not be easy.  Glibc changes can cover most of the
performance issues.  However, "ld.so PIE" may be difficult to debug.
Is it possible for ld.so to unmap the PIE and map it again with MAP_32BIT?

> Regarding the performance issue, does everything have to be in the first
> 2 GiB or 4 GiB, or is it sufficient if everything is in the same
> +/- 2 GiB window?

This doesn't apply since the issue is with indirect calls and jumps.
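
For reference, the "try MAP_32BIT first" behaviour described in the patch
summary can be sketched roughly as below.  This is only an illustration,
not the glibc code: the helper name mmap_prefer_32bit is hypothetical, and
the sketch assumes that only hint-less PROT_EXEC mappings are retried (it
does not model the denywrite case or the LD_PREFER_MAP_32BIT_EXEC check).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not a glibc function.  For an executable mapping
   with no address hint, first ask the kernel for an address in the low
   2GB via MAP_32BIT.  If that fails (for example because the low 2GB
   is exhausted), fall back to an ordinary mmap.  */
static void *
mmap_prefer_32bit (void *addr, size_t len, int prot, int flags,
                   int fd, off_t offset)
{
  assert (len != 0);
#ifdef MAP_32BIT
  /* Assumption: only retry when the caller gave no address hint and
     the pages are executable.  */
  if (addr == NULL && (prot & PROT_EXEC) != 0)
    {
      void *ret = mmap (NULL, len, prot, flags | MAP_32BIT, fd, offset);
      if (ret != MAP_FAILED)
        return ret;
    }
#endif
  /* Fallback: let the kernel place the mapping anywhere.  */
  return mmap (addr, len, prot, flags, fd, offset);
}
```

A caller would use it exactly like mmap, e.g. with PROT_READ | PROT_EXEC
and MAP_PRIVATE for a library segment.  Since the kernel restricts
MAP_32BIT mappings to the first 2GB, fewer address bits are left for
ASLR, which is the trade-off noted in the patch summary.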

-- 
H.J.

