[PATCH] x86: Don't remove empty x86 properties

Jim Dehnert dehnert@gmail.com
Wed Dec 12 10:38:00 GMT 2018

On Fri, Dec 7, 2018 at 6:45 AM H.J. Lu <hjl.tools@gmail.com> wrote:

> On Thu, Dec 6, 2018 at 10:22 PM Jim Dehnert <dehnert@gmail.com> wrote:
> >
> > Something's been bothering me about much of this discussion, which is
> probably related to Cary's question about intended usage.  We had some
> features analogous to these, for example concerning floating point
> behavior, in the SGI compilers and tools 20+ years ago.  One of the things
> we realized in the process was that there are at least two different
> classes of usage with different desired properties.
> >
> > The first class involves what this discussion is focused on -- the
> ability to catch invalid mixtures of assumptions, e.g., code that assumes
> or requires something (like your X86 features) vs. other code or hardware
> that doesn't satisfy those requirements.  When one catches such a mismatch
> (in the linker or loader, say), one could reject the program, or it might
> be preferable to let it pass and just use the information for better error
> messages if a failure occurs at runtime.  The latter decision might make
> sense if an actual failure is unlikely, but the decision might vary for
> different features, or different tools, or different vendors.
> >
> > For such usage, it is not necessarily critical that the property
> information be available for every component object.  For example, it may
> be known that old objects created before the property became available
> didn't ever require it.  It may also be desirable to just accept objects
> from toolchains that don't (yet) support the property bits, and accept that
> they might fail more dramatically.
> >
> > The second class involves things like cross-object optimization
> decisions, where some global attribute must apply to make the optimization
> valid.  In this case, missing information might produce an incorrect
> optimization.  An example might be a linker decision to include software FP
> support for a simple processor depending on whether any object uses FP.
> >
> > The point of bringing up this distinction is that the same property
> information might be used by different tools or toolchains, or in different
> contexts, in different ways in terms of whether a property is required or
> not, and in terms of whether accurate information is required from all
> component objects.  It's not a good idea to make assumptions when defining
> the property bit treatment about which case applies -- you should just
> carefully define what the bit means, how it's combined when linking objects
> (AND vs. OR), and, importantly, how a downstream consumer knows whether or
> not the information is complete.  In particular, a decision to discard data
> if it's missing from some objects, or if the operation applied yields a
> particular result, can make it impossible for a downstream consumer to tell
> the difference between missing information and combinations yielding 0, and
> it removes the ability of a consumer to do something intelligent for the
> first case.
> >
> > The above are general observations.  There's a more subtle issue in this
> particular proposal that's related (as I understand it).  Because you're
> packing property bits into larger units, and will presumably be adding
> additional properties in the future, you'll want to be able to tell which
> bits are valid in a particular object (e.g., was it created before or after
> the new property was added?).  Otherwise, bits might be zero either because
> the property isn't true, or because it wasn't supported by the tool that
> created the object.  You can't really deal with this using version
> information, because there's no good way to combine versions without
> discarding information.  I'd suggest including a "complete" bit for each
> property bit, meaning that all components of the object provided valid
> values for the property.  Non-complete property bits could still contain
> useful data, but a consumer could not assume it was a valid combination of
> all the component objects.  (This implies that the complete bits are all
> AND bits - the output value should be an AND of all the input values when
> combining objects.)  Note that the separate complete bits remove the need
> for the OR_AND construct, which was trying to combine two separate concepts
> in a single bit.
>
> Thanks Jim.  Your description pretty much captures my intention.
> OR_AND provides a way to mark an object with complete info.  If the
> bit is 1, the feature exists.  If the bit is 0, the feature does not
> exist.  A "complete" bit is an interesting idea.  How should it be
> mapped to NEEDED and USED properties?

Well, I look at it from the opposite direction -- I'd prefer to define
mechanism first and then map features to the mechanism as far as possible.
In this case (feature presence), the mechanism needed is just combination
in a linker, and there are two combination operations you need, AND and
OR.  In each case, the output bit is the AND (or OR) of the input bits.
Group the bits so each storage unit (word or dword probably) is uniformly
AND or OR, and the linker can just do the whole unit at once.
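
A minimal sketch of that whole-unit combination, in Python purely for
illustration.  The 32-bit unit width and the names combine_or and
combine_and are assumptions of the sketch, not part of the proposal:

```python
from functools import reduce

UNIT_MASK = 0xFFFFFFFF  # assumed 32-bit storage unit


def combine_or(units):
    """OR one uniformly-OR storage unit across all input objects."""
    return reduce(lambda acc, u: acc | u, units, 0) & UNIT_MASK


def combine_and(units):
    """AND one uniformly-AND storage unit across all input objects."""
    return reduce(lambda acc, u: acc & u, units, UNIT_MASK) & UNIT_MASK


# Three hypothetical input objects, one unit each.
inputs = [0b0011, 0b0110, 0b0010]
print(bin(combine_or(inputs)))   # 0b111
print(bin(combine_and(inputs)))  # 0b10
```

Because every bit in a unit uses the same operation, the linker never has
to look at individual bits or know what any of them mean.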

Given that mechanism, both NEEDED and USED properties are simply OR bits --
if any object needs or uses a property, the program does.  (Sort of, more
on that in a bit.)  All "known" bits (switching to Cary's term -- I hadn't
noticed he already had one) are AND bits -- if any input object's value
isn't known, the output's isn't either.
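
To make the pairing concrete, here is a sketch of a linker merging one OR
unit (USED or NEEDED) together with its AND unit of known bits.  The Props
type, the link function, and the convention that an object with no
property note is read as all-zero known bits are hypothetical choices made
for illustration:

```python
from dataclasses import dataclass

UNIT_MASK = 0xFFFFFFFF  # assumed 32-bit storage unit


@dataclass(frozen=True)
class Props:
    used: int = 0   # OR bits: 1 = this object uses the feature
    known: int = 0  # AND bits: 1 = the producer set the matching used bit


def link(objects):
    """Merge per-object units: OR the property bits, AND the known bits."""
    if not objects:
        return Props()  # nothing linked, so nothing is known
    used, known = 0, UNIT_MASK
    for obj in objects:
        used |= obj.used
        known &= obj.known
    return Props(used & UNIT_MASK, known & UNIT_MASK)


new_a = Props(used=0b10, known=0b11)  # reports both bits, uses feature 1
new_b = Props(used=0b01, known=0b01)  # producer only knows about bit 0
legacy = Props()                      # no property note at all

print(link([new_a, new_b]))          # Props(used=3, known=1)
print(link([new_a, new_b, legacy]))  # Props(used=3, known=0)
```

In the second link the used bits survive, but the cleared known bits tell
a consumer not to treat them as a complete picture of the program.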

Given these definitions, the linker (or any other object-combining tool) has
a simple, well-defined task.  The trickiness comes in defining how
producers are intended to set each bit (the primary property bits, not so
much the known bits).

> I have
> [hjl@gnu-cfl-2 ld-plugin]$ readelf -n /bin/ld | head
> Displaying notes found in: .note.gnu.property
>   Owner                 Data size Description
>   GNU                  0x00000020 NT_GNU_PROPERTY_TYPE_0
>       Properties: x86 ISA used: CMOV, SSE, SSE2
>       x86 feature used: x86, XMM
> So /bin/ld only uses x86 and XMM with CMOV, SSE, SSE2 and nothing else.
> This info is generated by
> as:
>   -mx86-used-note=[no|yes] (default: yes)
>                           generate x86 used ISA and feature properties
> and linker automatically.   How can I achieve the same with a "complete"
> bit?

In principle, whenever you have turned on generation of the property bit,
you set the associated known bit.  The case that your OR_AND construct turns
into a one bit is then represented by known=1 and property=1 in the output.
But you can also distinguish among the other three cases, all of which get
mapped to zero by your mechanism (or to absent, except that, as I noted
before, you can't reasonably do absent in the middle of a word of such bits).
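
Spelled out per bit, the four output combinations look like this.  The
labels and function names are chosen only for the sketch, and or_and
models the collapse described above as (property AND known), which is an
assumption about how OR_AND behaves rather than its actual definition:

```python
def or_and(prop, known):
    """Model of the OR_AND result: one only when the bit is set and complete."""
    return prop & known


def classify(prop, known):
    """Name each (property, known) output pair for a downstream consumer."""
    if known:
        return "present" if prop else "absent in every object"
    return "present, but incomplete" if prop else "unknown"


for prop in (1, 0):
    for known in (1, 0):
        print(prop, known, or_and(prop, known), classify(prop, known))
```

Only the first row yields a one under the OR_AND model; the remaining
three rows all collapse to zero there, while the two-bit form keeps them
apart.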

Why does this matter?  If one looks at something like your NEEDED bits and
thinks about the contrast with the USED bits...  You commented that NEEDED
must be set by the programmer, but USED could be set by the assembler.  I'm
guessing that you were referring to something like the fact that
instructions requiring a feature could be inside tests that avoid it if the
feature isn't present.  (If that's it, a compiler could do that just as
well as the programmer in most cases.)  But there's a deeper issue.  The
relevant test might not be in the same object file, so code in an object
that clearly seems to warrant NEEDED for a feature might never actually
execute without it, because of guards elsewhere.  In fact, it might not
execute at all because it is dead code.  Ultimately, it's not clear to me
that NEEDED tells you
anything new beyond USED unless you simply define it to mean a programmer
assertion that the program shouldn't be loaded if the feature isn't
present, in which case you want OR rather than OR_AND.

My guess is that you're going to want to do some experimentation to decide
what the loader response should be to NEEDED bits (reject the program,
issue a warning and run it, or even just run it and let the runtime use the
bits to guide better runtime error messages).  As I said in the earlier
message, you might also end up with other tools that want to use the
information in different ways (like analysis to figure out whether the
feature is really needed).  In any case, I'd expect that you'll be dealing
with objects with unknown values for a long time, and a definition that
maps all such combinations to zero property bits with OR_AND won't be

Bottom line:  provide both bits (property and known) to avoid backing
yourself into a corner where later developments mean you care about a
different combination.

Hope that's helpful,

> --
> H.J.

             Jim Dehnert
