This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: [PATCH] Two level comdat priorities in gold


On Tue, Jul 21, 2015 at 12:15 PM, Cary Coutant <ccoutant@gmail.com> wrote:
> I'd prefer to avoid something like two-level comdats, which forces the
> linker into the two-pass approach used for --gc-sections and --icf.
> However, if you do need something like that, I think using a group
> flag in the GRP_MASKOS bit range to identify "weak" comdat groups
> would be preferable to the .gnu.comdat.low section, and would address
> the issue with ld -r. It also ought to be possible to implement such a
> concept without going to two passes; perhaps by maintaining a list of
> weak comdat groups.

Let me quickly summarize the different ideas suggested:

1) Use abi_tag attribute on the COMDAT candidates with specialized instructions

As Cary pointed out, this in effect localizes the comdat
instances to that particular module.

Issues with this:
- abi_tag also gets applied to any static variables defined inside a
function with the abi_tag.  This means multiple instances of the
variable exist, leading to undesired behavior.  We really only want to
localize the COMDAT function, not the data.
- LLVM does not support abi_tag and, if I heard right, does not plan
to support it as robustly as GCC does.
- See 3), as this also suffers from the problems with localization.
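
To make the static-variable issue concrete, here is a minimal single-file sketch.  It is only an illustration under assumptions: next_id and next_id_avx are hypothetical names, and the second function stands in for the copy of the same inline function that a -mavx module would emit with an ABI tag (in a real build both would share one source name).  Because the tag becomes part of the function's mangled name, the static local inside it is a separate object as well.

```cpp
// abi_tag is a GCC attribute (Clang also accepts it); other compilers get a no-op.
#if defined(__GNUC__) || defined(__clang__)
#define ABI_TAG_AVX __attribute__((abi_tag("avx")))
#else
#define ABI_TAG_AVX
#endif

// Stand-in for the copy of the inline function built without -mavx.
inline int next_id() {
  static int counter = 0;  // meant to be one counter for the whole program
  return ++counter;
}

// Stand-in for the copy built with -mavx and tagged.  Hypothetical name:
// renamed here only so that both copies fit in one translation unit.
ABI_TAG_AVX inline int next_id_avx() {
  static int counter = 0;  // a second, independent counter
  return ++counter;
}
```

Callers that reach the tagged copy start counting from 1 again instead of continuing the shared sequence, which is the undesired behavior described above.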

2) Cary's idea of simply not applying the extra -mxxx flags on
COMDATS, the out of line inlines and the instantiated template bodies.

Unfortunately, this idea does not work either, for code like this:

#ifdef __AVX__
inline void foo() {
  /* body written with AVX intrinsics */
}
...
#else
inline void foo() {
  /* body written with non-AVX intrinsics */
}
...
#endif

This means that once we use -mavx, the AVX comdat bodies have already
been selected by the preprocessor, and we cannot undo this in the
compiler.  Example header:
https://bitbucket.org/eigen/eigen/src/6ed647a644b8e3924800f0916a4ce4addf9e7739/Eigen/Core?at=default

This idea will not work in general.

3) Localizing just the COMDAT functions to that specific module where
the extra -mxxx flags are used.

This is similar to abi_tag as mentioned above, but it only affects
code, not data.  However, this may still not work if the address of a
comdat function is taken and used in some manner.  Virtual comdat
functions are still exposed to the problem via vtables.

I have a patch pending for this here:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01174.html
It implements -fno-weak-comdat-functions along the lines of -fno-weak.
This option solves my problem, but the lack of safety makes me nervous
about recommending it.

4) COMDAT priorities

Given the above, I believe this is the safest solution to this
problem.  I had said it is enough to have a binary priority, but Ian
did point out that we may need a multi-level priority.  For our
problem, a multi-level priority may be useful if there is no default
priority comdat candidate available and we are left to choose among
comdat candidates specialized for different architecture variants.
This can, however, be solved by forcing default comdat candidates using
-fkeep-inline-functions.
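
As a rough sketch of how a binary priority could be resolved in one pass, assuming a flag in the GRP_MASKOS range: GRP_COMDAT and GRP_MASKOS below are the standard ELF values, while GRP_COMDAT_LOW, ComdatGroup and resolve_comdats are hypothetical names made up for illustration.  This is not gold's code, only a model of the selection rule.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Standard ELF section group flag values.
constexpr std::uint32_t GRP_COMDAT = 0x1;
constexpr std::uint32_t GRP_MASKOS = 0x0ff00000;
// Hypothetical "low priority" flag; not defined by any ABI today.
constexpr std::uint32_t GRP_COMDAT_LOW = 0x00100000;
static_assert((GRP_COMDAT_LOW & GRP_MASKOS) == GRP_COMDAT_LOW,
              "the flag must lie inside the OS-specific range");

struct ComdatGroup {
  std::string signature;  // group signature symbol
  std::string object;     // object file that supplied this candidate
  std::uint32_t flags;    // group flags word
};

inline bool is_low_priority(const ComdatGroup& g) {
  return (g.flags & GRP_COMDAT_LOW) != 0;
}

// One pass over the candidates in link order.  The first candidate is
// kept, except that a low-priority one is held only provisionally and
// is displaced by the first normal-priority candidate seen later.
inline std::unordered_map<std::string, ComdatGroup>
resolve_comdats(const std::vector<ComdatGroup>& link_order) {
  std::unordered_map<std::string, ComdatGroup> kept;
  for (const ComdatGroup& g : link_order) {
    assert((g.flags & GRP_COMDAT) != 0 && "only COMDAT groups are deduplicated");
    auto it = kept.find(g.signature);
    if (it == kept.end()) {
      kept.emplace(g.signature, g);
    } else if (is_low_priority(it->second) && !is_low_priority(g)) {
      it->second = g;
    }
    // Otherwise the existing choice stands (first wins among equals).
  }
  return kept;
}
```

In a real linker, displacing a provisionally kept group would also mean discarding sections and symbols already recorded for it, which is presumably where a list of weak comdat groups would come in.  A multi-level priority would replace the boolean test with a comparison of priority values.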

What are your thoughts on this?  Should I go ahead with comdat
priorities as Cary suggested, using a group flag in the GRP_MASKOS
bit range?

Thanks
Sri

>
>>>> While it is possible to construct test cases for this problem using C
>>>> inline functions, in practice the problem is going to arise in C++.
>>>> In C++ it's similar to the problem solved by using ABI tags.  This
>>>> suggests to me that we should have a compiler option allowing an ABI
>>>> tag to be specified for all weak definitions.  As far as I can see
>>>> that would address the entire problem, with no confusion about -r, and
>>>> permitting optimized functions to call optimized versions of the vague
>>>> linkage definitions.
>
> This still might have a problem with virtual functions, if the
> compiler needs to emit a vtable. In that case, the vtable will point
> to the "accidentally optimized" -mxxx function, and if that's the copy
> of the vtable we end up with, everyone is going to call it. You could
> consider extending the ABI tag to the vtable itself, essentially
> creating a new class, but that would still be a problem if we
> construct an instance of the class in the optimized code and return
> that instance to non-optimized code.
>
> It seems to me that the only safe way to do this is to make sure that
> generated templates and out-of-line inlines are generated with
> optimization suitable for the entire program. The downside of not
> calling avx-optimized template functions from avx-optimized code
> doesn't seem that bad -- if a call is performance sensitive enough, it
> should be inlined, in which case the optimizations could apply.
>
> If you could simply disallow virtual functions, either the
> localization approach or the ABI tag approach should work -- they have
> essentially the same effect, except that with ABI tags you could avoid
> the code bloat.
>
> You could also just add a compiler option to suppress generated
> template functions and out-of-line inlines completely, then cross your
> fingers and hope that the needed functions will be available in some
> other object. If you're in control of the libraries that contain these
> avx-optimized functions, that may not be as dangerous as it sounds --
> just add a normally-optimized .o that instantiates all the routines
> needed by the avx-optimized objects.
>
> You mentioned pointer equality, but if that's really an issue, the
> only way I can see to solve that is to make sure you don't apply the
> -mxxx options to the template functions.
>
> -cary

