9690 – glibc time functionality broken with kernel 2.6.26 and later

Status:	RESOLVED FIXED

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	libc
Version:	2.8

Importance:	P2 normal
Target Milestone:	---
Assignee:	Ulrich Drepper

URL:
Keywords:

Depends on:
Blocks:

Reported:	2008-12-28 20:51 UTC by Hal V. Engel
Modified:	2014-07-02 07:39 UTC
CC List:	5 users

See Also:
Host:
Target:
Build:
Last reconfirmed:

Flags:	fweimer: security-

Description Hal V. Engel 2008-12-28 20:51:19 UTC

Kernel 2.6.26 changed its time functionality from being microsecond based
(microkernel) to being nanosecond based (nanokernel).  As a result, changes
need to be made to sysdeps/unix/sysv/linux/sys/timex.h and also to
sysdeps/unix/sysv/linux/ntp_gettime.c to make glibc compatible with these newer
kernel versions.  If this is not done, ntp will not function correctly.  Please
see http://sourceware.org/ml/libc-alpha/2008-03/msg00076.html for a set of
patches based on glibc 2.7 that will correct this problem.

With the current code base ntp builds for a microkernel, and its functions to
set the clock's rate will be off by 3 orders of magnitude since the kernel is
passing and expecting to get nanosecond values.  This results in the clock being
unstable.  This is most noticeable when using a high quality reference clock
like a GPS, where it should be possible to get near microsecond offsets with a
properly functioning clock, but with the current glibc code it is more typical to
see around 100 microsecond offsets, with the clock slewing between positive and
negative offsets and sometimes being off by as much as 500 microseconds.
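The factor of 1000 described above can be sketched in C.  This is only an
illustration under the assumption that a timex offset is in nanoseconds when
STA_NANO is set in the status word and in microseconds otherwise; the helper
names offset_to_ns and offset_misread_as_us are hypothetical and are not part
of glibc or ntp.

```c
#include <sys/timex.h>

/* Older glibc headers do not define STA_NANO; 0x2000 is the kernel's value. */
#ifndef STA_NANO
#define STA_NANO 0x2000
#endif

/* Hypothetical helper: normalize a timex offset to nanoseconds.
   With STA_NANO set the value is already in nanoseconds,
   otherwise it is in microseconds and must be scaled up. */
long offset_to_ns(long offset, int status)
{
    return (status & STA_NANO) ? offset : offset * 1000L;
}

/* Hypothetical helper: what a microsecond-only client computes when it
   ignores the status word and always assumes microseconds. */
long offset_misread_as_us(long offset)
{
    return offset * 1000L;
}
```

For a nanosecond value of 250, offset_to_ns(250, STA_NANO) gives 250, while
a client that ignores the flag arrives at 250000, three orders of magnitude
too large.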

I have been using the patches from the above link for several months without
issue and these patches are being widely used by users of the LinuxPPS kernel
patch set.  The LinuxPPS patch set is now appearing in the daily mm kernel
snapshots and it is highly likely that kernel 2.6.29 will be shipped with the
LinuxPPS patches.  This will allow more users to attach reference clocks to
their machines and the issues with glibc and ntp and these newer kernels will
become painfully apparent when these users do not get the clock accuracy that
they were expecting.

Comment 1 Ulrich Drepper 2009-02-07 05:13:20 UTC

The patch changes the ABI.  The size of ntptimeval cannot be changed.  Just
imagine what happens if an old application is used with such a new glibc.  The
assignment to ->tai in ntp_gettime would corrupt memory.

You have to create a different data structure and a new version of the
ntp_gettime function.
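
A minimal sketch of the ABI concern, assuming the old structure holds only
time, maxerror and esterror and the patched one appends a long tai.  The
struct names ntv_old and ntv_new and the helper are hypothetical stand-ins,
not the real glibc definitions.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for the layout old binaries were compiled against. */
struct ntv_old {
    struct timeval time;
    long maxerror;
    long esterror;
};

/* Hypothetical stand-in for the patched layout with the extra field. */
struct ntv_new {
    struct timeval time;
    long maxerror;
    long esterror;
    long tai;
};

/* Returns nonzero if storing ->tai into a buffer that the caller sized
   for the old layout would write past the end of that buffer. */
int tai_write_overflows_old(void)
{
    return offsetof(struct ntv_new, tai) + sizeof(long)
           > sizeof(struct ntv_old);
}
```

An old application allocates sizeof(struct ntv_old) bytes, so a library
function that stores to ->tai in that buffer writes beyond it.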

Comment 2 Hal V. Engel 2009-02-07 06:31:18 UTC

I know it changes the ABI.  In any case the situation with current kernels,
glibc and ntp is clearly broken and should be fixed.

The patch in question was submitted to the glibc email list almost a year ago
and the kernel changes that the patch was designed to address were released as a
stable kernel about 5 months ago.  The current timex.h header is based on linux
version 2.2 and is now out of sync with recent kernels.  

The patch is not mine and I was under the impression that it was submitted by a
glibc developer but I could be wrong.  In any case the person who submitted the
patch to the mailing list asked about how the ABI issue should be addressed and
it appears that no one replied to him with any suggestions about how to handle
this issue.

Also I did a search on my system to see what was using ntp_gettime and only
found two things - ntptime and libc.  So this call is not very widely used. 
However I can understand your concern over maintaining ABI compatibility as an
old ntp installation would be broken by a glibc with this patch applied.  

Prior to discovering this patch I and other LinuxPPS users had been using a
modified version of timex.h that had the new *NANO* declarations added but that
did not change the data structure or use ->tai and this seemed to work OK.   So
it might be possible to only use the part of the patch that affects timex.h
since this would avoid the ABI issue but would fix the NANO related issues that
ntp would otherwise have.

Comment 3 Samuel Thibault 2009-04-26 23:45:03 UTC

Please do not mix two things:

- the kernel now exposes nanoseconds instead of microseconds.  That's a
kernel ABI break.  It is announced via a STA_NANO flag in timex.status,
but still, old applications are broken when started under kernels >=
2.6.26.  That's really a concern as it's not even easy to notice while
it can irritate users (unstable ntp time).
- the kernel now exposes a new tai field.  That's not a kernel ABI break
as it just takes a reserved room.  To expose it to applications we
however need to change the userland ABI.

I'd really much rather see a kernel fix for the first issue: the kernel
should report nanoseconds _only_ if userland requests it.  And the case
of a new application running with an old kernel _has_ to be taken care
of as well.

As for the second issue, see Ulrich's comment: just define a new
version. See for instance the sched_setaffinity() function that has
changed its ABI (and API too actually).

Comment 4 Samuel Thibault 2009-04-26 23:52:45 UTC

Subject: Re:  glibc time functionality broken with kernel 2.6.26 and later

Grah, I just hate these stupid web interfaces.  Hopefully this time it
doesn't thrash my layout.

Please do not mix two things:

- the kernel now exposes nanoseconds instead of microseconds.  That's a
kernel ABI break.  It is announced via a STA_NANO flag in timex.status,
but still, old applications are broken when started under kernels >=
2.6.26.  That's really a concern as it's not even easy to notice while
it can irritate users (unstable ntp time).
- the kernel now exposes a new tai field.  That's not a kernel ABI break
as it just takes a reserved room.  To expose it to applications we
however need to change the userland ABI.

I'd really much rather see a kernel fix for the first issue: the kernel
should report nanoseconds _only_ if userland requests it.  And the case
of a new application running with an old kernel _has_ to be taken care
of as well.

As for the second issue, see Ulrich's comment: just define a new
version. See for instance the sched_setaffinity() function that has
changed its ABI (and API too actually).

Comment 5 John Stultz 2009-05-07 21:28:23 UTC

(In reply to comment #4)
> Please do not mix two things:
> 
> - the kernel now exposes nanoseconds instead of microseconds.  That's a
> kernel ABI break.  It is announced via a STA_NANO flag in timex.status,
> but still, old applications are broken when started under kernels >=
> 2.6.26.  That's really a concern as it's not even easy to notice while
> it can irritate users (unstable ntp time).

I'm not sure this is true. The kernel internally multiplies microseconds up to
nanoseconds if the STA_NANO bit is not set. So old applications should behave
properly.

> - the kernel now exposes a new tai field.  That's not a kernel ABI break
> as it just takes a reserved room.  To expose it to applications we
> however need to change the userland ABI.

This is my understanding as well.

> I'd really much rather see a kernel fix for the first issue: the kernel
> should report nanoseconds _only_ if userland requests it.  And the case
> of a new application running with an old kernel _has_ to be taken care
> of as well.

Please bring this up on lkml and CC me if you have evidence of problems here.
I'll be happy to look at it.

> As for the second issue, see Ulrich's comment: just define a new
> version. See for instance the sched_setaffinity() function that has
> changed its ABI (and API too actually).

Do we know if anyone is still working this? Roman's patch was seemingly ignored
with no feedback. Additionally he's not been around much recently, so I'm not
sure if he will be following up with fixes.

Comment 6 Samuel Thibault 2009-05-07 22:59:27 UTC

Subject: Re:  glibc time functionality broken with kernel 2.6.26 and later

johnstul at us dot ibm dot com, le Thu 07 May 2009 21:28:25 -0000, a écrit :
> > - the kernel now exposes nanoseconds instead of microseconds.  That's a
> > kernel ABI break.  It is announced via a STA_NANO flag in timex.status,
> > but still, old applications are broken when started under kernels >=
> > 2.6.26.  That's really a concern as it's not even easy to notice while
> > it can irritate users (unstable ntp time).
> 
> I'm not sure this is true. The kernel internally multiplies microseconds up to
> nanoseconds if the STA_NANO bit is not set. So old applications should behave
> properly.

Again, there are two issues:
- What the kernel takes as parameter. As you say, there is no problem
  indeed, if the application hasn't set the STA_NANO flag, the kernel
  converts properly.
- What the kernel returns. nanoseconds values are advertised by the
  kernel through the STA_NANO flag. But old applications didn't even
  know that flag, and thus can not know that these are nanosecond
  values.
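
A minimal sketch of how an application that does know the flag could ask
the kernel which unit it is reporting, using a read-only adjtimex() call.
The function name kernel_reports_nano is hypothetical, and the fallback
definition of STA_NANO is only needed with headers that predate it.

```c
#include <string.h>
#include <sys/timex.h>

/* Older glibc headers do not define STA_NANO; 0x2000 is the kernel's value. */
#ifndef STA_NANO
#define STA_NANO 0x2000
#endif

/* Hypothetical helper: returns 1 if the kernel currently reports
   nanosecond values, 0 if it reports microseconds, -1 on error. */
int kernel_reports_nano(void)
{
    struct timex tx;

    memset(&tx, 0, sizeof tx);   /* modes == 0: query only, change nothing */
    if (adjtimex(&tx) == -1)
        return -1;
    return (tx.status & STA_NANO) ? 1 : 0;
}
```

An application built without knowledge of STA_NANO has no equivalent of
this check, which is the second issue described above.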

> > As for the second issue, see Ulrich's comment: just define a new
> > version. See for instance the sched_setaffinity() function that has
> > changed its ABI (and API too actually).
> 
> Do we know if anyone is still working this? Roman's patch was seemingly ignored
> with no feedback.

There was: "define a new version to avoid breaking the ABI".

Samuel

Comment 7 Hal V. Engel 2009-05-07 23:10:40 UTC

(In reply to comment #5)
> (In reply to comment #4)
> > Please do not mix two things:
> > 
> > - the kernel now exposes nanoseconds instead of microseconds.  That's a
> > kernel ABI break.  It is announced via a STA_NANO flag in timex.status,
> > but still, old applications are broken when started under kernels >=
> > 2.6.26.  That's really a concern as it's not even easy to notice while
> > it can irritate users (unstable ntp time).
> 
> I'm not sure this is true. The kernel internally multiplies microseconds up to
> nanoseconds if the STA_NANO bit is not set. So old applications should behave
> properly.

The issue is that when ntp builds it looks in sys/timex.h (i.e. the glibc
timex.h) and if it does not see STA_NANO and friends it builds as a microsecond
only app with the assumption that it is going to be running against a
microkernel.  When this happens it never asks the kernel if it is a nanokernel
and this results in things not working correctly.
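
The build-time decision described above can be sketched roughly as follows.
This is an assumption about the general shape of such a check, not ntp's
actual build logic; the function name timex_build_mode is hypothetical.

```c
#include <sys/timex.h>

/* Hypothetical helper: reports which mode this translation unit was
   compiled for, based only on what sys/timex.h declares at build time. */
const char *timex_build_mode(void)
{
#ifdef STA_NANO
    return "nanosecond-aware";   /* header declares STA_NANO */
#else
    return "microsecond-only";   /* header predates STA_NANO */
#endif
}
```

Because the choice is made by the preprocessor, a binary built against an
old header stays microsecond-only no matter which kernel it later runs on.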

What LinuxPPS users noticed was that using the combination of a non-nano aware
ntp with the newer nanokernels resulted in time keeping that was, by our
standards and expectations, unstable and we would see the offsets slewing
through a +-500 microsecond range.  This was almost two orders of magnitude more
than we were seeing before the switch to the nanokernel.  There was a long
string of emails on the linuxpps email list about this and it almost goes
without saying that we were not happy campers.  We were clueless about what the
cause was and it took a while for us to figure out that the missing STA_NANO &
friends stuff in sys/timex.h was a big part of what we were seeing.  We
discovered this because we have some users who also use FreeBSD and they were
able to point out that for some reason ntp was not detecting the nanokernel like
it should. 

As soon as we added the STA_NANO & friends declarations to sys/timex.h and
rebuilt ntp these issues were greatly improved and offsets were reduced by an
order of magnitude.  The offsets were still higher than before the nanokernel so
there were other issues as well but this was a very big one.  I think John's new
convergence kernel patch may be the other piece of this puzzle.   So I don't
think it is that the kernel is giving out the time in a different format (in
fact it is not) but rather it has something to do with how ntp interacts with
the kernel to make frequency adjustments to the clock (that's my guess anyway).
It clearly does something different if it is in microsecond mode than it does in
nanosecond mode.


I should add that this appears to be a Linux only problem and ntp is not having
these issues on any other OS with a nanokernel.

Comment 8 John Stultz 2009-05-07 23:12:34 UTC

(In reply to comment #6)
> Again, there are two issues:
> - What the kernel takes as parameter. As you say, there is no problem
>   indeed, if the application hasn't set the STA_NANO flag, the kernel
>   converts properly.
> - What the kernel returns. nanoseconds values are advertised by the
>   kernel through the STA_NANO flag. But old applications didn't even
>   know that flag, and thus can not know that these are nanosecond
>   values.

So the concern is only with running old applications after a new application has
set the STA_NANO flag?

Comment 9 Ulrich Drepper 2009-05-08 21:06:50 UTC

I have added the tai field and various macros to the header on April 21st. 
These changes match the kernel changes.  If this is not correct bring it up with
the kernel people.

Comment 10 Hal V. Engel 2009-05-08 22:14:55 UTC

Thank you for getting these changes in.