Summary: | Export basic metadata about ABI compatibility | ||
---|---|---|---|
Product: | glibc | Reporter: | Nathaniel J. Smith <njs> |
Component: | libc | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED NOTABUG | ||
Severity: | enhancement | CC: | carlos, drepper.fsp, fweimer, jsm-csl, zack+srcbugz |
Priority: | P2 | Flags: | fweimer: security- |
Version: | unspecified | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: |
Description
Nathaniel J. Smith
2019-06-05 06:18:48 UTC
On Wed, 5 Jun 2019, njs at pobox dot com wrote:
> Using gnu_get_libc_version() like this is a bit awkward, especially because in
> practice some vendors like to stick random strings in there. (We have an
> empirically-derived regex, basically just matching "2\.[0-9]+" and then
> ignoring anything after the first non-numeric digit.)
The right place for vendor-specific information is in PKGVERSION (configure --with-pkgversion=<something>), which affects the banner you get when you run libc.so.6, but not the result of gnu_get_libc_version.
> After discussing it with him, I'd like to propose that glibc add a new
> function:
>
> void gnu_get_libc_abi_levels(int *build_abi, int *min_runtime_abi, int
> *max_runtime_abi)
>
> Basically this is a function that returns 3 integers. build_abi specifies the
> "ABI level" of binaries built against this glibc. min_runtime_abi and
> max_runtime_abi specify the supported "ABI levels" of binaries run against this
> glibc.
I'm not convinced ABI levels are a defined concept like that, at least not as integers. (We have symbol versions, and an ordering relation between them. The particular set of symbol versions may depend on the architecture. Up to 2.3.x there were sometimes new symbol versions in point releases. Although we don't currently do point releases, and haven't had new symbol versions in them for a very long time, it's not obvious to me that there will never be a case in future for doing point releases and adding symbol versions in them. Say, if some security issue shows up an API design issue and it's concluded to be important to add a new API quickly including into older versions.)
The min_runtime_abi concept is questionable. We removed the --enable-oldest-abi option years ago as bitrotten (bug 6652).
Any suggested slightly-incompatible changes would *not* remove GLIBC_2.0 symbols in general; they might very selectively remove certain compatibility features that quite likely could not be associated with a symbol version at all, and would not have anything we could define in advance as an ABI level that might later be removed (there wouldn't be a total ordering between such compatibility features, either, which rather prevents defining their presence or absence by such a minimum ABI level).
The nearest thing we have to a minimum ABI level is the minimum symbol version - but any change of that is *completely* incompatible (replaces symbol versions for every symbol at the old version, so would indicate a new SONAME or dynamic linker name).
What *is* clearly defined is the __GLIBC__ and __GLIBC_MINOR__ integer values, so there could be a C API to provide those (if such an API is useful). The comparison should not treat 3.0 as being incompatible with 2.x, just as being later (as any slightly-incompatible changes without change of SONAME would be such that very few binaries would be likely to be affected, and any (unlikely) change of SONAME would mean a program built for a different SONAME of glibc simply wouldn't run).

> The right place for vendor-specific information is in PKGVERSION
> (configure --with-pkgversion=<something>), which affects the banner you
> get when you run libc.so.6, but not the result of gnu_get_libc_version.
That's nice to know, but it turns out not all vendors got the message, and some of their users use Python... https://github.com/pypa/pip/issues/3588
> I'm not convinced ABI levels are a defined concept like that, at least not as integers.
I see what you mean. Symbol versions are great, and definitely give more fine grained information. But dumping full symbol version information into every package's metadata doesn't work very well. The idea of an "ABI level" is to provide a shorthand name for some common collections of symbol versions.
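For illustration, the lenient version-string parsing described in this exchange (take the leading "major.minor" and ignore whatever a vendor appended) might look roughly like the sketch below. The helper name parse_glibc_version is hypothetical and is not a glibc API; the string it receives is assumed to be something like the return value of gnu_get_libc_version().

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (not part of glibc): extract "major.minor" from a
   version string, ignoring any suffix after the minor number.  */
static bool
parse_glibc_version (const char *s, int *major, int *minor)
{
  char *end;
  long maj, min;

  assert (major != NULL && minor != NULL);
  /* The string must start with a digit.  */
  if (s == NULL || !isdigit ((unsigned char) s[0]))
    return false;
  maj = strtol (s, &end, 10);
  /* A dot followed by at least one digit must come next.  */
  if (*end != '.' || !isdigit ((unsigned char) end[1]))
    return false;
  min = strtol (end + 1, &end, 10);
  if (maj > INT_MAX || min > INT_MAX)
    return false;
  /* Anything after the minor number (".1", "-vendor", ...) is ignored.  */
  *major = (int) maj;
  *minor = (int) min;
  return true;
}
```

With this sketch, "2.17" and a made-up vendor-suffixed string such as "2.20-2014.11" yield (2, 17) and (2, 20) respectively, while a string with no leading "major.minor" is rejected.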
And pragmatically speaking, there are millions of systems relying on __GLIBC_MINOR__ to act as an ABI level right now, so they're defined in that sense :-). Fortunately, I don't think we need them to do as much as symbol versions do, so it can work.
Let's imagine a hypothetical 2.40 that turns out to have urgent bugs, so we end up with a 2.40.1 release that includes some @GLIBC_2.41 symbols. The easy case is where you also rush out 2.41 with the exact same symbols as 2.40.1. In this case, it just means that binaries built against 2.40.1 require ABI level 41, and 2.40.1 supports ABI levels [0, 41].
The trickier case is where the final 2.41 includes other new symbols that didn't make it into 2.40.1. In that case, I guess the best the metadata could do is say that binaries built against 2.40.1 have ABI level 41, and that 2.40.1 supports ABI levels [0, 40]. Which looks really odd, because it seems to suggest that if you build against 2.40.1, then you might not be able to run against 2.40.1. But if you obey what it's telling you, you will end up with a working configuration; it just isn't quite precise enough to tell you about some other configurations that would also work. This would be totally fine for us – we're OK with occasionally not installing a binary that would have worked, so long as we avoid installing binaries that don't work. So if we end up only installing 2.40.1 binaries on systems that have 2.41, that's no big deal.
Also, when building binaries we have a lot of control over the environment: we get to choose the distro, compiler, we have tooling that audits symbol versions, etc. So we'd probably just not use 2.40.1 for builds. Where we really need reliable automated rules is in the installer, since that runs on random end users' systems that we don't control, including systems that haven't been released yet.
So our core requirement is: some metadata we can query at runtime, that gives a conservative estimate of what binaries will work with the current glibc. We wouldn't actually use the build_abi runtime query API; the crucial thing would be for the glibc devs to have chosen a build_abi value for each release, so that later on the runtime_abi has a way to talk about that release.
> The min_runtime_abi concept is questionable
This would be different from --enable-oldest-abi, because it wouldn't be configurable at build time; it'd just be taking whatever decision you'd made about abi compatibility, and making it more visible to the rest of us. And it's different from changing the minimum symbol version, because it doesn't refer to symbols, it refers to binaries built against those old versions. So bumping the min_runtime_abi to, say, 3, would just mean "if a binary was built against glibc 2.2 or earlier, we no longer guarantee that it works".
I don't know if that's the best approach – really what we want is the glibc devs to make a commitment about how to tell which glibc versions are compatible with which other glibc versions, that's precise enough to encode in software. What exactly that commitment might look like is a policy question for glibc; my proposal was just a guess based on what Zack told me :-).
An API to fetch __GLIBC__ and __GLIBC_MINOR__ at runtime would be somewhat useful, because it would let us stop parsing strings, but it doesn't really touch on the "commitment" part. Your statement here in the tracker about what a hypothetical future glibc 3.0 would mean is definitely helpful, but I do wonder a little whether the glibc devs of 2029 will feel themselves bound to match a one-off comment in an issue from 2019. And maybe if you had a way to tell software like ours *which* binaries were broken by 3.0, instead of having to handwave and say "probably not many, don't worry about it", then that would make it easier to ship 3.0?
I don't have a strong conclusion here; these are just things I'm thinking about.

On Fri, 7 Jun 2019, njs at pobox dot com wrote:
> Let's imagine a hypothetical 2.40 that turns out to have urgent bugs, so we end
> up with a 2.40.1 release that includes some @GLIBC_2.41 symbols.
In that case I expect we'd use @GLIBC_2.40.1 instead.
> This would be different from --enable-oldest-abi, because it wouldn't be
> configurable at build time; it'd just be taking whatever decision you'd made
> about abi compatibility, and making it more visible to the rest of us. And it's
> different from changing the minimum symbol version, because it doesn't refer to
> symbols, it refers to binaries built against those old versions. So bumping the
> min_runtime_abi to, say, 3, would just mean "if a binary was built against
> glibc 2.2 or earlier, we no longer guarantee that it works".
But that's simply not how slightly-incompatible changes work in practice. It's not generally removing features that might only be used by binaries built with a given glibc version. It's removing features that might only be used by C++ binaries built with GCC 2.95 or earlier, for example (independent of the glibc version they were built with). Or removing features that might be used by binaries built with any glibc version, but are sufficiently obscure we think that is unlikely to be relevant in practice - take the recent discussion of the copy_file_range emulation for older kernels, for example; that would be removing a feature "copy_file_range sometimes works without ENOSYS on older kernels". Because there is no total ordering for such features and no relation in general to particular old glibc versions, a minimum ABI can't really be defined in a way that could usefully change to reflect such slightly-incompatible changes.

Fair enough. Here's another idea that occurred to me, that I'll throw out here.
glibc could provide an API that lets you explicitly query whether the current glibc thinks it can run a binary built against a given version:
    bool
    gnu_get_libc_can_run (int build_major, int build_minor)
    {
      return (build_major < __GLIBC__
              || (build_major == __GLIBC__ && build_minor <= __GLIBC_MINOR__));
    }
If we had this we wouldn't even need a way to query __GLIBC__ and __GLIBC_MINOR__, because we're only querying them so we can implement this ourselves :-). Basically it would let us get rid of all this logic: https://github.com/pypa/pip/blob/5776ddd05896162e283737d7fcdf8f5a63a97bbc/src/pip/_internal/utils/glibc.py#L40-L62
It would also give the glibc devs full control over expressing whatever compatibility guidelines they want to commit to. (You'll notice that the code I linked to assumes that 3.x and 2.x are incompatible, which you're saying is wrong, so I guess we have a poor track record at reading the glibc devs' minds!)

On Tue, 11 Jun 2019, njs at pobox dot com wrote:
> Here's another idea that occurred to me, that I'll throw out here. glibc could
> provide an API that lets you explicitly query whether the current glibc thinks
> it can run a binary built against a given version:
>
> bool
> gnu_get_libc_can_run (int build_major, int build_minor)
> {
>   return (build_major < __GLIBC__
>           || (build_major == __GLIBC__ && build_minor <= __GLIBC_MINOR__));
> }
Note that such logic is only valid on the assumption that both versions are using the same SONAME, and the same one of the ABIs listed at <https://sourceware.org/glibc/wiki/ABIList>. (Some ABI incompatibilities are checked for by glibc dynamic linker code, but not all.) For example, on Arm it would happily report being able to run binaries built with glibc 2.0, but any Arm binaries built with a version before 2.4 would be using the old ABI instead of EABI, and so certainly not able to run with current glibc (and for a while, both ABIs were supported before old-ABI support was removed).
Or on x86 it would claim support for glibc 1.x binaries, which aren't compatible with 2.x (different SONAME, different dynamic linker, etc.).
I'm not clear on the context in which you'd be calling such a function. Would it already be guaranteed that a case of non-matching SONAME or non-matching ABI either never reached this code, or is not something it needs to care about?

Regarding SONAMEs: those are implied by the glibc version number, right? So I guess they should be handled by this function. We don't have a separate SONAME check, and would rather not add one. You're right though that I was sloppy about handling glibc 1.x. So I guess a better version would be:
    bool
    gnu_get_libc_can_run (int build_major, int build_minor)
    {
      if (build_major < 2) {
        /* glibc 1.x had a totally different ABI */
        return false;
      } else {
        /* for glibc 2.x or later, the rule is simply
         * (build_major, build_minor) <= (runtime_major, runtime_minor)
         * where <= is tuple comparison. */
        return (build_major < __GLIBC__
                || (build_major == __GLIBC__ && build_minor <= __GLIBC_MINOR__));
      }
    }
Regarding low-level ABI differences (architecture, calling convention, etc.): for the Python packaging case, our metadata has a platform ABI tag that we check separately, so we can assume that that's already been handled. That said, so far we've only supported x86 and x86-64, so there are probably some exciting surprises waiting for us as we start supporting architectures like ARM. Maybe we'll discover that glibc could do something to help here (maybe a gnu_get_libc_supported_abis, or something like that?). But I think we can treat that as an independent discussion.

On Sun, 16 Jun 2019, njs at pobox dot com wrote:
> Regarding SONAMEs: those are implied by the glibc version number, right?
They're implied by the glibc ABI (from the list at <https://sourceware.org/glibc/wiki/ABIList>). Different glibc versions support different sets of ABIs.
(Given that glibc 1.x used a disjoint set of ABIs and SONAMEs, it's thus questionable whether the function does need to handle it or not.)

Ok, I guess I meant that *given a platform ABI*, the version number implies the soname?
I don't really care that much about glibc 1.x honestly. Nobody is shipping glibc 1.x-based binaries or systems, and it doesn't affect my use cases at all. So you can handle it however you like.
I'm not sure I understand what you're trying to figure out here.

On Mon, 17 Jun 2019, njs at pobox dot com wrote:
> Ok, I guess I meant that *given a platform ABI*, the version number implies the
> soname?
A platform ABI implies the SONAME. You don't need the version number.
> I'm not sure I understand what you're trying to figure out here.
Whether a simple version number comparison would actually address your problem (with all comparisons of the platform ABI - including all distinguishing of the different ABI variants listed at <https://sourceware.org/glibc/wiki/ABIList> - being the responsibility of something else, not the function in glibc).

Right now our platform ABI tags on Linux are literally just the two strings "x86" and "x86_64". I guess glibc 1.x did support "x86", maybe with a slightly different calling convention? But that's not really important anymore.
The interesting cases will be as we add more ARM support. Since we get to choose what tags we use, we can make our tags as fine-grained as necessary. So I'm guessing that yes, we can arrange things so that we do one ABI check using those tags, combine that with a version check using this function, and together that will take care of everything. But I'm definitely not an expert on the fine details of ARM ABIs. Do you foresee any problems if we split up responsibilities like that?

On Mon, 17 Jun 2019, njs at pobox dot com wrote:
> Right now our platform ABI tags on Linux are literally just the two strings
> "x86" and "x86_64". I guess glibc 1.x did support "x86", maybe with a slightly
> different calling convention? But that's not really important anymore.
The glibc notion of different ABIs effectively treats that as being a
different platform (and, likewise, Arm old-ABI and EABI as different
platforms, MIPS classic-NaN and NaN2008 as different platforms, etc.).
Note, incidentally, that the choice between BE8 and BE32 for Arm
big-endian is a choice made when the static linker is run - code being
built for Arm big-endian can't tell when compiled which version of
big-endian will be chosen at link time.
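
The split of responsibilities discussed in this exchange (one check on a platform ABI tag, one check on the version number) can be sketched as below. This is only an illustration under stated assumptions: struct glibc_target, version_can_run and binary_can_run are hypothetical names, the runtime version is passed in as parameters rather than read from __GLIBC__ and __GLIBC_MINOR__, and the tag strings are whatever a packaging tool chooses. For the tag check to be sufficient, tags would need to be fine-grained enough to separate every glibc ABI variant (for example Arm old-ABI versus EABI).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical description of a binary, or of the running system.  */
struct glibc_target
{
  const char *platform_tag;  /* chosen by the packaging tool, e.g. "x86_64" */
  int major;
  int minor;
};

/* Version-only rule from the thread: (build) <= (runtime) as tuples,
   with glibc 1.x rejected outright.  */
static bool
version_can_run (int build_major, int build_minor,
                 int run_major, int run_minor)
{
  if (build_major < 2)
    return false;
  return build_major < run_major
         || (build_major == run_major && build_minor <= run_minor);
}

/* The tag comparison covers the platform ABI; the version comparison
   covers the glibc release.  Both must pass.  */
static bool
binary_can_run (const struct glibc_target *binary,
                const struct glibc_target *host)
{
  assert (binary != NULL && host != NULL);
  return strcmp (binary->platform_tag, host->platform_tag) == 0
         && version_can_run (binary->major, binary->minor,
                             host->major, host->minor);
}
```

Note that the version rule treats a 3.0 host as simply later than 2.x, not as incompatible, in line with the earlier comment about how a hypothetical 3.0 should compare.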
OK, so it sounds like the answer is yes, a version-number-only function like this would be helpful, and we can take care of ABI differences separately. (The big thing about version numbers is that they change all the time, and we don't want to have to patch and deploy new packaging tools every time glibc makes a release. But if we have to patch and deploy new packaging tools to enable support for a new microarchitecture/calling convention/etc., that's fine.)

I have reviewed this issue and it is my opinion that Joseph has answered all of Nathan's questions.
When it comes to the mythical "glibc 3.0", this is more than likely based upon Florian Weimer's GNU Tools Cauldron talk "glibc 3.0":
https://gcc.gnu.org/wiki/cauldron2017#glibc30
https://slideslive.com/38902629/glibc-30
The talk was about creating a lively discussion about features that might be deprecated. Please review the talk if you have questions about where the community might go.
The kinds of deprecation that we're talking about for glibc 3.0 should not impact any future Python application built today.
I'm marking this issue RESOLVED/NOTABUG since the existing mechanisms that the Python community is using for Wheels should continue to work in the future as discussed in PEP-600 (https://www.python.org/dev/peps/pep-0600/).
If a future glibc drops an old symbol it could mean that manylinux_2_5 built binary wheels could be incompatible with future modern distributions, and that is something that a compatibility resolver would need to know and just install manylinux_X_Y where X and Y are closer to a modern distribution. This hasn't happened yet, but when it does I'd expect what we deprecate is unused or untestable and so doesn't impact the existing wheels.
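
As a rough illustration of the manylinux_X_Y mechanism mentioned above: under PEP 600 a wheel tagged manylinux_X_Y is expected to work on systems with glibc X.Y or later, so an installer can compare the X and Y in the tag against the runtime glibc version. The helper below is a hypothetical sketch, not code from pip or glibc; it ignores the architecture suffix of the tag, which is assumed to be checked elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: accept a "manylinux_X_Y" or "manylinux_X_Y_ARCH"
   tag if glibc X.Y is not newer than the runtime glibc
   run_major.run_minor.  The ARCH part is not examined here.  */
static bool
manylinux_tag_ok (const char *tag, int run_major, int run_minor)
{
  int x, y;

  assert (tag != NULL);
  /* Tags that do not start with "manylinux_X_Y" are rejected.  */
  if (sscanf (tag, "manylinux_%d_%d", &x, &y) != 2)
    return false;
  return x < run_major || (x == run_major && y <= run_minor);
}
```

A resolver built along these lines would pick, among the wheels that pass this check, one whose X and Y are closest to the running system; if an old tag ever stopped being safe, that knowledge would have to be added to the resolver separately, as the closing comment notes.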