Kernel 2.6.26 changed its time functionality from being microsecond based (microkernel) to being nanosecond based (nanokernel). As a result, changes are needed in sysdeps/unix/sysv/linux/sys/timex.h and in sysdeps/unix/sysv/linux/ntp_gettime.c to make glibc compatible with these newer kernel versions. If this is not done, ntp will not function correctly. Please see http://sourceware.org/ml/libc-alpha/2008-03/msg00076.html for a set of patches based on glibc 2.7 that correct this problem.

With the current code base, ntp builds for a microkernel, and its functions that set the clock's rate will be off by 3 orders of magnitude, since the kernel is passing and expecting nanosecond values. This results in an unstable clock. It is most noticeable with a high quality reference clock such as a GPS: with a properly functioning clock it should be possible to get near-microsecond offsets, but with the current glibc code it is more typical to see offsets of around 100 microseconds, with the clock slewing between positive and negative offsets and sometimes being off by as much as 500 microseconds.

I have been using the patches from the above link for several months without issue, and they are widely used by users of the LinuxPPS kernel patch set. The LinuxPPS patch set is now appearing in the daily mm kernel snapshots, and it is highly likely that kernel 2.6.29 will ship with the LinuxPPS patches. This will allow more users to attach reference clocks to their machines, and the issues with glibc, ntp and these newer kernels will become painfully apparent when those users do not get the clock accuracy they were expecting.
The patch changes the ABI. The size of ntptimeval cannot be changed. Just imagine what happens if an old application is used with such a new glibc. The assignment to ->tai in ntp_gettime would corrupt memory. You have to create a different data structure and a new version of the ntp_gettime function.
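To make the ABI concern concrete, here is a minimal sketch. The two structures are simplified, hypothetical stand-ins (not the actual glibc definitions): one for the layout an old binary was compiled against and one for a layout that grows by a tai field. The helper names are made up for illustration.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for the layout an old application was built with. */
struct old_ntptimeval {
    struct timeval time;
    long maxerror;
    long esterror;
};

/* Hypothetical stand-in for a layout extended with a tai field. */
struct new_ntptimeval {
    struct timeval time;
    long maxerror;
    long esterror;
    long tai;
};

/* Bytes a library using the new layout would touch beyond an
   old-sized buffer supplied by the caller. */
size_t ntptimeval_overrun_bytes(void)
{
    return sizeof(struct new_ntptimeval) - sizeof(struct old_ntptimeval);
}

/* Returns 1 if a store to ->tai lands entirely outside the old structure. */
int tai_write_is_out_of_bounds(void)
{
    return offsetof(struct new_ntptimeval, tai) >= sizeof(struct old_ntptimeval);
}
```

An old application allocates only sizeof(struct old_ntptimeval) bytes, so a store to ->tai by a newer library would write past the end of that buffer. This is why a separate structure and a new symbol version are requested instead of growing the existing one.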
I know it changes the ABI. In any case the situation with current kernels, glibc and ntp is clearly broken and should be fixed. The patch in question was submitted to the glibc email list almost a year ago, and the kernel changes that the patch was designed to address were released as a stable kernel about 5 months ago. The current timex.h header is based on Linux version 2.2 and is now out of sync with recent kernels.

The patch is not mine, and I was under the impression that it was submitted by a glibc developer, but I could be wrong. In any case the person who submitted the patch to the mailing list asked how the ABI issue should be addressed, and it appears that no one replied to him with any suggestions.

Also, I did a search on my system to see what was using ntp_gettime and only found two things - ntptime and libc. So this call is not very widely used. However, I can understand your concern over maintaining ABI compatibility, as an old ntp installation would be broken by a glibc with this patch applied.

Prior to discovering this patch, I and other LinuxPPS users had been using a modified version of timex.h that had the new *NANO* declarations added but that did not change the data structure or use ->tai, and this seemed to work OK. So it might be possible to use only the part of the patch that affects timex.h, since this would avoid the ABI issue but would still fix the NANO related issues that ntp would otherwise have.
Please do not mix two things:

- the kernel now exposes nanoseconds instead of microseconds. That's a
  kernel ABI break. It is announced via a STA_NANO flag in timex.status,
  but still, old applications are broken when started under kernels >=
  2.6.26. That's really a concern as it's not even easy to notice while
  it can irritate users (unstable ntp time).
- the kernel now exposes a new tai field. That's not a kernel ABI break
  as it just takes a reserved room. To expose it to applications we
  however need to change the userland ABI.

I'd really much rather see a kernel fix for the first issue: the kernel
should report nanoseconds _only_ if userland requests it. And the case
of a new application running with an old kernel _has_ to be taken care
of as well.

As for the second issue, see Ulrich's comment: just define a new
version. See for instance the sched_setaffinity() function that has
changed its ABI (and API too actually).
(In reply to comment #4)
> Please do not mix two things:
>
> - the kernel now exposes nanoseconds instead of microseconds. That's a
>   kernel ABI break. It is announced via a STA_NANO flag in timex.status,
>   but still, old applications are broken when started under kernels >=
>   2.6.26. That's really a concern as it's not even easy to notice while
>   it can irritate users (unstable ntp time).

I'm not sure this is true. The kernel internally multiplies microseconds up to nanoseconds if the STA_NANO bit is not set. So old applications should behave properly.

> - the kernel now exposes a new tai field. That's not a kernel ABI break
>   as it just takes a reserved room. To expose it to applications we
>   however need to change the userland ABI.

This is my understanding as well.

> I'd really much rather see a kernel fix for the first issue: the kernel
> should report nanoseconds _only_ if userland requests it. And the case
> of a new application running with an old kernel _has_ to be taken care
> of as well.

Please bring this up on lkml and CC me if you have evidence of problems here. I'll be happy to look at it.

> As for the second issue, see Ulrich's comment: just define a new
> version. See for instance the sched_setaffinity() function that has
> changed its ABI (and API too actually).

Do we know if anyone is still working on this? Roman's patch was seemingly ignored with no feedback. Additionally, he's not been around much recently, so I'm not sure if he will be following up with fixes.
Subject: Re: glibc time functionality broken with kernel 2.6.26 and later

johnstul at us dot ibm dot com wrote on Thu 07 May 2009 21:28:25 -0000:
> > - the kernel now exposes nanoseconds instead of microseconds. That's a
> >   kernel ABI break. It is announced via a STA_NANO flag in timex.status,
> >   but still, old applications are broken when started under kernels >=
> >   2.6.26. That's really a concern as it's not even easy to notice while
> >   it can irritate users (unstable ntp time).
>
> I'm not sure this is true. The kernel internally multiplies microseconds up to
> nanoseconds if the STA_NANO bit is not set. So old applications should behave
> properly.

Again, there are two issues:
- What the kernel takes as parameter. As you say, there is no problem
  indeed: if the application hasn't set the STA_NANO flag, the kernel
  converts properly.
- What the kernel returns. Nanosecond values are advertised by the
  kernel through the STA_NANO flag. But old applications didn't even
  know that flag, and thus cannot know that these are nanosecond
  values.

> > As for the second issue, see Ulrich's comment: just define a new
> > version. See for instance the sched_setaffinity() function that has
> > changed its ABI (and API too actually).
>
> Do we know if anyone is still working this? Roman's patch was seemingly ignored
> with no feedback.

There was: "define a new version to avoid breaking the ABI".

Samuel
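The "what the kernel returns" case can be sketched as follows. The helper names timex_offset_to_ns and legacy_offset_to_ns are hypothetical and only illustrate the unit handling discussed above; the fallback define assumes the value 0x2000 that the Linux kernel headers use for STA_NANO.

```c
#include <sys/timex.h>

/* Fallback for headers that predate the flag; 0x2000 is the value
   used by the Linux kernel headers. */
#ifndef STA_NANO
#define STA_NANO 0x2000
#endif

/* Flag-aware reader: converts a returned offset to nanoseconds,
   using the status word to decide which unit the value is in. */
long long timex_offset_to_ns(int status, long offset)
{
    if (status & STA_NANO)
        return (long long)offset;          /* already nanoseconds */
    return (long long)offset * 1000LL;     /* microseconds -> nanoseconds */
}

/* Reader that does not know the flag: always assumes microseconds. */
long long legacy_offset_to_ns(long offset)
{
    return (long long)offset * 1000LL;
}
```

If STA_NANO is set in the status word, the flag-unaware reader is wrong by a factor of 1000, which matches the three orders of magnitude mentioned in the original report. In a real program the status and offset would typically come from a read-only adjtimex() call with modes set to 0.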
(In reply to comment #5)
> (In reply to comment #4)
> > Please do not mix two things:
> >
> > - the kernel now exposes nanoseconds instead of microseconds. That's a
> >   kernel ABI break. It is announced via a STA_NANO flag in timex.status,
> >   but still, old applications are broken when started under kernels >=
> >   2.6.26. That's really a concern as it's not even easy to notice while
> >   it can irritate users (unstable ntp time).
>
> I'm not sure this is true. The kernel internally multiplies microseconds up to
> nanoseconds if the STA_NANO bit is not set. So old applications should behave
> properly.

The issue is that when ntp builds, it looks in sys/timex.h (i.e. the glibc timex.h), and if it does not see STA_NANO and friends it builds as a microsecond-only app, on the assumption that it is going to be running against a microkernel. When this happens it never asks the kernel whether it is a nanokernel, and this results in things not working correctly.

What LinuxPPS users noticed was that combining a non-nano-aware ntp with the newer nanokernels resulted in timekeeping that was, by our standards and expectations, unstable: we would see the offsets slewing through a +-500 microsecond range. This was almost two orders of magnitude more than we were seeing before the switch to the nanokernel. There was a long string of emails on the linuxpps email list about this, and it almost goes without saying that we were not happy campers.

We were clueless about the cause, and it took a while for us to figure out that the missing STA_NANO & friends declarations in sys/timex.h were a big part of what we were seeing. We discovered this because some of our users also use FreeBSD, and they were able to point out that for some reason ntp was not detecting the nanokernel like it should. As soon as we added the STA_NANO & friends declarations to sys/timex.h and rebuilt ntp, these issues were greatly improved and offsets were reduced by an order of magnitude.

The offsets were still higher than before the nanokernel, so there were other issues as well, but this was a very big one. I think John's new convergence kernel patch may be the other piece of this puzzle.

So I don't think the problem is that the kernel is giving out the time in a different format (in fact it is not); rather, it has something to do with how ntp interacts with the kernel to make frequency adjustments to the clock (that's my guess anyway). It clearly does something different in microsecond mode than it does in nanosecond mode. I should add that this appears to be a Linux-only problem, and ntp is not having these issues on any other OS with a nanokernel.
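The build-time behaviour described in this comment can be illustrated with a small sketch. This is not ntp's actual source; BUILD_KNOWS_NANO and build_treats_values_as_nano are hypothetical names showing how a program that keys off the presence of STA_NANO in sys/timex.h ends up compiled in one mode or the other.

```c
#include <sys/timex.h>

/* Decided once, at compile time, from what the installed header declares. */
#ifdef STA_NANO
#define BUILD_KNOWS_NANO 1
#else
#define BUILD_KNOWS_NANO 0
#endif

/* Returns 1 if this build would treat kernel values as nanoseconds
   for the given status word, 0 if it would treat them as microseconds. */
int build_treats_values_as_nano(int status)
{
#ifdef STA_NANO
    return (status & STA_NANO) != 0;
#else
    (void)status;   /* header lacks the flag: this build can never check it */
    return 0;
#endif
}
```

With a header that lacks STA_NANO, the second branch is compiled in and the program never checks for a nanokernel at run time, whatever kernel it later runs on. Adding the declarations to the header and rebuilding is what switches the first branch on.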
(In reply to comment #6)
> Again, there are two issues:
> - What the kernel takes as parameter. As you say, there is no problem
>   indeed, if the application hasn't set the STA_NANO flag, the kernel
>   converts properly.
> - What the kernel returns. nanoseconds values are advertised by the
>   kernel through the STA_NANO flag. But old applications didn't even
>   know that flag, and thus can not know that these are nanosecond
>   values.

So the concern is only with running old applications after a new application has set the STA_NANO flag?
I have added the tai field and various macros to the header on April 21st. These changes match the kernel changes. If this is not correct bring it up with the kernel people.
Thank you for getting these changes in.