This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Re: [PATCH] Two level comdat priorities in gold


On Tue, Jul 21, 2015 at 12:15 PM, Cary Coutant <ccoutant@gmail.com> wrote:
> I'd prefer to avoid something like two-level comdats, which forces the
> linker into the two-pass approach used for --gc-sections and --icf.
> However, if you do need something like that, I think using a group
> flag in the GRP_MASKOS bit range to identify "weak" comdat groups
> would be preferable to the .gnu.comdat.low section, and would address
> the issue with ld -r. It also ought to be possible to implement such a
> concept without going to two passes; perhaps by maintaining a list of
> weak comdat groups.
>
>>>> While it is possible to construct test cases for this problem using C
>>>> inline functions, in practice the problem is going to arise in C++.
>>>> In C++ it's similar to the problem solved by using ABI tags.  This
>>>> suggests to me that we should have a compiler option allowing an ABI
>>>> tag to be specified for all weak definitions.  As far as I can see
>>>> that would address the entire problem, with no confusion about -r, and
>>>> permitting optimized functions to call optimized versions of the vague
>>>> linkage definitions.
>
> This still might have a problem with virtual functions, if the
> compiler needs to emit a vtable. In that case, the vtable will point
> to the "accidentally optimized" -mxxx function, and if that's the copy
> of the vtable we end up with, everyone is going to call it. You could
> consider extending the ABI tag to the vtable itself, essentially
> creating a new class, but that would still be a problem if we
> construct an instance of the class in the optimized code and return
> that instance to non-optimized code.
>
> It seems to me that the only safe way to do this is to make sure that
> generated templates and out-of-line inlines are generated with
> optimization suitable for the entire program. The downside of not
> calling avx-optimized template functions from avx-optimized code
> doesn't seem that bad -- if a call is performance sensitive enough, it
> should be inlined, in which case the optimizations could apply.

This seems like an awesome idea to me.  For example, if I added an
option "-mavx-safe", a comdat-safe counterpart of -mavx that does not
apply AVX codegen to comdats, I would be done.  That would give us
what we wanted from comdat priorities.


>
> If you could simply disallow virtual functions, either the
> localization approach or the ABI tag approach should work -- they have
> essentially the same effect, except that with ABI tags you could avoid
> the code bloat.

I agree that ABI tags and localization have essentially the same
effect, but there is one important difference.  With localization, I
was only going to make the comdat functions local, not the data.  With
ABI tags, I cannot do this.  Example:

#include <stdio.h>

__attribute__((abi_tag("avx")))
inline int foo ()
{
  static int a;
  a++;
  printf (" Function called %d times\n", a);
  return a;
}


The abi_tag also gets applied to the variable a, which will break a
lot of existing code (there would be multiple copies of a), since this
use of a static variable is pretty common.  With localization,
however, I would only localize the function, not the comdat group
containing the variable a.  So while there is more than one copy of
foo, there is only one copy of a, and the code still behaves as
expected.
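
To make the difference concrete, here is a rough single-file model of
the two behaviours.  It is only a sketch: the namespaces stand in for
separately compiled translation units, and the names (localized,
tagged, generic_tu, avx_tu, shared_counter) are made up for
illustration.  It does not use abi_tag or -fno-weak-functions; it only
mimics the number of copies of foo and of a that each approach is
expected to leave in the final link.

```cpp
#include <cassert>

// Localization model: every "translation unit" keeps its own local
// (static) copy of foo, but the counter stays in one shared place,
// as the comdat group holding the static variable would.
namespace localized {
inline int &shared_counter() {
  static int a = 0;  // a single copy for the whole program
  return a;
}
namespace generic_tu {
static int foo() { return ++shared_counter(); }
}
namespace avx_tu {
static int foo() { return ++shared_counter(); }
}
}  // namespace localized

// ABI tag model: the tag effectively renames foo together with its
// static local, so the tagged and untagged versions each get their
// own counter.
namespace tagged {
namespace generic_tu {
inline int foo() { static int a = 0; return ++a; }
}
namespace avx_tu {
inline int foo() { static int a = 0; return ++a; }
}
}  // namespace tagged
```

In the localized model, calling generic_tu::foo and then avx_tu::foo
advances the same counter; in the tagged model each copy of foo counts
on its own, which is the behaviour change described above.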

I have a patch pending review that does this in GCC, via the option
-fno-weak-functions:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01174.html

>
> You could also just add a compiler option to suppress generated
> template functions and out-of-line inlines completely, then cross your
> fingers and hope that the needed functions will be available in some
> other object. If you're in control of the libraries that contain these
> avx-optimized functions, that may not be as dangerous as it sounds --
> just add a normally-optimized .o that instantiates all the routines
> needed by the avx-optimized objects.
>
> You mentioned pointer equality, but if that's really an issue, the
> only way I can see to solve that is to make sure you don't apply the
> -mxxx options to the template functions.
>
> -cary

