DRAFT: Y2038 Proofness Design
History
Revision 24 was reviewed on the libc-alpha mailing list as 1st draft: https://www.sourceware.org/ml/libc-alpha/2015-10/msg00893.html
Revision 55 was reviewed on the libc-alpha mailing list as 2nd draft: https://www.sourceware.org/ml/libc-alpha/2016-01/msg00832.html
Revision 63 was reviewed on the libc-alpha mailing list as 3rd draft: https://www.sourceware.org/ml/libc-alpha/2016-06/msg00243.html
Revision 83 was reviewed on the libc-alpha mailing list as 4th draft: https://www.sourceware.org/ml/libc-alpha/2016-06/msg00824.html
Revision 115 was reviewed on the libc-alpha mailing list as 5th draft: https://www.sourceware.org/ml/libc-alpha/2017-02/msg00399.html
Revision 146 was posted on the libc-alpha mailing list as 6th draft: https://www.sourceware.org/ml/libc-alpha/2017-04/msg00419.html -- no comments were made
Scope
The intent of this page is to serve as a central point for describing the Y2038 'proofness' design.
Y2038 'proofness' means that application calls to glibc-provided functions should never return wrong results when UTC times outside -2^31..2^31-1 seconds from the Unix Epoch are involved.
This document is only about Y2038 (and Y1901) and is not about any other time boundaries such as Y2106 (unsigned 32-bit Epoch-based times), Y2036 (unsigned 32-bit 1900-based RFC 868 times) or Y9999 (four-digit years).
For example, asctime_r might in practice not be Y9999-proof (for instance, it might overrun a too-small output buffer). However, this document is not about Y9999; it is about Y2038, for which asctime_r is safe since it does not handle Epoch-based 32-bit signed values.
Useful Definitions
32-bit time denotes a representation of time as a 32-bit signed quantity of seconds relative to the Unix Epoch of 1970-01-01 00:00:00 UTC.
64-bit time denotes a representation of time as a 64-bit signed quantity of seconds relative to the Unix Epoch of 1970-01-01 00:00:00 UTC.
Y2038 denotes the problem where a 32-bit time value rolls over beyond +2147483647 into negative values.
- This will happen to the system clock one second after 2038-01-19 03:14:07 UTC (possibly throwing it back to 1901-12-13 20:45:52 UTC).
- This will also happen if, on 2038-01-16 12:00:00 UTC, the command at now + 5 days is executed.
- etc.
Y2038-incompatible means 'unable to hold (for types) or handle (for APIs) values representing post-Y2038 dates'.
Y2038-compatible means 'able to hold (for types) or handle (for APIs) values representing post-Y2038 dates'.
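The rollover described in the Y2038 definition can be reproduced in a few lines of C. This is a minimal sketch, not glibc code: next_second_32 is a hypothetical helper that advances a 32-bit signed seconds counter by one. The addition is done in unsigned arithmetic because signed overflow is undefined in C.

```c
#include <stdint.h>

/* Hypothetical helper (not a glibc function): returns the value a 32-bit
   signed seconds counter holds one second after t.  The addition is done
   on uint32_t, where wraparound is well defined; converting the result
   back to int32_t is implementation-defined, and wraps on GCC and Clang. */
static int32_t next_second_32(int32_t t)
{
    return (int32_t)((uint32_t)t + 1u);
}
```

On such compilers, next_second_32(INT32_MAX) yields INT32_MIN: the counter jumps from 2038-01-19 03:14:07 UTC to 1901-12-13 20:45:52 UTC.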
Goals
- Provide Y2038-compatible 64-bit time support to 32-bit systems.
- Keep providing legacy Y2038-incompatible 32-bit time support to 32-bit systems.
Constraints
There are a number of constraints which dictate the direction of the design. They are either definite or debatable. Debatable design preclusions should be finalized before this design document leaves DRAFT.
Definite
64-bit time would be supported as a replacement of 32-bit time; user code (including libraries) would require either 32-bit or 64-bit time support, with 32-bit remaining the default for now.
- Existing 32-bit API *symbols* would remain unmodified so that existing binary code would link properly. New 64-bit symbols would be named after their corresponding 32-bit symbol plus "64",
e.g. the 64-bit implementation of clock_gettime would be named clock_gettime64.
To ensure the same source code would compile properly for either 32- or 64-bit time, when defaulting to 64-bit time support, the 32-bit API and type names would be #define'ed to their 64-bit counterparts,
e.g. #define clock_gettime clock_gettime64.
Any newly introduced function MUST be declared in the implementation, not user, namespace (see https://sourceware.org/bugzilla/show_bug.cgi?id=14106).
Any newly introduced struct MUST have a name (see https://sourceware.org/bugzilla/show_bug.cgi?id=15766).
Debatable
- 64-bit time should only be supported if 64-bit file offsets are. This sounds like something totally unrelated, but the underlying idea is to not multiply the possible combinations.
- Fixing 32-bit time support is not a goal; 32-bit time functions which return wrong results now will keep returning wrong results after the introduction of 64-bit time support.
Make sure no change affects 64-bit platforms where time_t is already 64-bit (see http://www.sourceware.org/ml/libc-alpha/2015-08/msg00038.html).
- No security fixes to existing 32-bit time code in this feature development:
It can be argued that overflowing 32-bit time could be used as an attack vector against 32-bit applications. Such applications running on 64-bit kernels could be improved to detect overflow conditions. For example, the stat64 kernel syscall could be enhanced to return EOVERFLOW when time conversion overflows (Bug 1419736: 32-bit stat returns wrong st_mtime if file timestamp does not fit in 32 bits).
- Fixes need not be all-or-nothing and each API could be taken on a case-by-case basis and fixed as best as possible given the interfaces. The previous example of stat64 fixes in the Linux kernel can be used to fix 32-bit stat when called while running on a 64-bit kernel. Running 32-bit on 64-bit kernel is quickly becoming the most common form of running 32-bit applications (ignoring IoT applications which will need 64-bit time).
- However, security fixes in general should not be part of a feature development, but rather should be fast-tracked outside of any long-term project.
But of course, any security flaw uncovered during implementation of Y2038 support will be raised at once.
Analysis
This section details the problems with Y2038 and glibc.
API
The glibc API provides many types and functions which involve time. Among these:
some are already Y2038-compatible, and do not need to be modified in order to handle dates beyond Y2038. An example of such an already Y2038-compatible API is char *asctime(const struct tm *tm).
some others are Y2038-incompatible: they need to be modified in order to handle dates beyond Y2038. An example of such a Y2038-incompatible API is time_t mktime(struct tm *tm).
- some others yet do not need modifications, but their internal implementation may need it; see below.
This section describes the Y2038-incompatible glibc APIs, which need changes in order to become Y2038-compatible.
API types
These are types whose values include a date and which cannot properly denote dates past Y2038.
The essential example is time_t, which is a 32-bit signed integer count of seconds since the Epoch; its maximum value is Y2038.
By extension, any struct containing at least one time_t is not Y2038-compatible.
non-Y2038-compatible types |
Types containing an Epoch-based value |
time_t |
Types using time_t |
struct lastlog |
struct msqid_ds |
struct semid_ds |
struct timeb -- obsolete |
struct timespec |
struct timeval |
struct utimbuf |
Types using struct timespec |
struct itimerspec |
struct stat |
Types using struct timeval |
struct clnt_ops |
struct elf_prstatus |
struct itimerval |
struct ntptimeval |
struct rusage |
struct timex |
struct utmp |
struct utmpx |
API functions
API functions which do not involve any Y2038-incompatible type are Y2038-compatible, whereas API functions which involve at least one Y2038-incompatible type are Y2038-incompatible.
This also applies to API functions which involve a pointer to a Y2038-incompatible type.
Below is the list of Y2038-incompatible API functions, based on the list of Y2038-incompatible types.
Y2038-incompatible APIs |
APIs using time_t |
char * ctime (const time_t *) |
char * ctime_r (const time_t *, char *) |
double difftime (time_t, time_t) |
struct tm * gmtime (const time_t *) |
struct tm * gmtime_r (const time_t *, struct tm *) |
struct tm * localtime (const time_t *) |
struct tm * localtime_r (const time_t *, struct tm *) |
time_t mktime (struct tm *) |
int stime (const time_t *) |
time_t timegm (struct tm *) |
time_t timelocal (struct tm *) |
time_t time (time_t *) |
APIs using struct lastlog |
(none) |
APIs using struct msqid_ds |
int msgctl(int, int, struct msqid_ds *) |
APIs using struct semid_ds |
(none) |
APIs using struct timeb |
int ftime(struct timeb *) -- obsolete |
APIs using struct timespec |
int aio_suspend (const struct aiocb * const *, int, const struct timespec *) |
int clock_getres (clockid_t, struct timespec *) |
int clock_gettime (clockid_t, struct timespec *) |
int clock_nanosleep (clockid_t, int, const struct timespec *, struct timespec *) |
int clock_settime (clockid_t, const struct timespec *) |
int futimens (int, const struct timespec *) |
ssize_t mq_timedreceive (mqd_t, char *, int, unsigned int *, const struct timespec *) |
int mq_timedsend (mqd_t, const char *, int, unsigned int, const struct timespec *) |
int nanosleep (const struct timespec *, struct timespec *) |
int pselect (int, fd_set *, fd_set *, fd_set *, const struct timespec *, const sigset_t *) |
int pthread_cond_timedwait (pthread_cond_t *, pthread_mutex_t *, const struct timespec *) |
int pthread_mutex_timedlock (pthread_mutex_t *, const struct timespec *) |
int pthread_rwlock_timedrdlock (pthread_rwlock_t *, const struct timespec *) |
int pthread_rwlock_timedwrlock (pthread_rwlock_t *, const struct timespec *) |
int sched_rr_get_interval (pid_t, struct timespec *) |
int sem_timedwait (sem_t *, const struct timespec *) |
int sigtimedwait (const sigset_t *, siginfo_t *, const struct timespec *) |
int timespec_get (struct timespec *, int) |
int utimensat (int, const char *, const struct timespec *, int) |
APIs using struct timeval |
int adjtime (const struct timeval *, struct timeval *) |
enum clnt_stat pmap_rmtcall (struct sockaddr_in *, const u_long, const u_long, const u_long, xdrproc_t, caddr_t, xdrproc_t, caddr_t, struct timeval, u_long *) |
CLIENT * clntudp_bufcreate (struct sockaddr_in *, u_long, u_long, struct timeval, int *, u_int, u_int) |
CLIENT * clntudp_create (struct sockaddr_in *, u_long, u_long, struct timeval, int *) |
int futimes (int, const struct timeval *) |
int gettimeofday (struct timeval *, __timezone_ptr_t) |
int lutimes (const char *, const struct timeval *) |
int select (int, fd_set *, fd_set *, fd_set *, struct timeval *) |
int settimeofday (const struct timeval *, const struct timezone *) |
int utimes (const char *, const struct timeval *) |
APIs using struct utimbuf |
int utime(const char *, const struct utimbuf *) |
APIs using struct itimerspec |
int timerfd_gettime(int, struct itimerspec *) |
int timerfd_settime(int, int, const struct itimerspec *, struct itimerspec *) |
int timer_gettime(timer_t, struct itimerspec *) |
int timer_settime(timer_t, int, const struct itimerspec *, struct itimerspec *) |
APIs using struct stat |
int fstatat(int, const char *, struct stat *, int) |
int fstat(int, struct stat *) |
int lstat(const char *, struct stat *) |
int stat(const char *, struct stat *) |
APIs using struct clnt_ops |
(none) |
APIs using struct elf_prstatus |
(none) |
APIs using struct itimerval |
int setitimer (int which, const struct itimerval *new, struct itimerval *old) |
int getitimer (int which, struct itimerval *) |
APIs using struct ntptimeval |
int ntp_gettime(struct ntptimeval *) |
int ntp_gettimex(struct ntptimeval *) |
APIs using struct rusage |
int getrusage(int who, struct rusage *) |
pid_t wait3(int *, int, struct rusage *) -- obsolete |
pid_t wait4(pid_t, int *, int, struct rusage *) -- obsolete |
APIs using struct timex |
int adjtimex (struct timex *timex) |
int ntp_adjtime (struct timex *) |
APIs using struct utmp |
int getutent_r(struct utmp *, struct utmp **) |
struct utmp * getutent() |
int getutid_r(const struct utmp *, struct utmp *, struct utmp **) |
struct utmp * getutid(const struct utmp *) |
int getutline_r(const struct utmp *, struct utmp *, struct utmp **) |
struct utmp * getutline(const struct utmp *) |
void login(const struct utmp *) |
struct utmp * pututline(const struct utmp *) |
void updwtmp(const char *, const struct utmp *) |
APIs using struct utmpx |
struct utmpx * getutxent() |
struct utmpx * getutxid(const struct utmpx *) |
struct utmpx * getutxline(const struct utmpx *) |
struct utmpx * pututxline(const struct utmpx *) |
API socket options
Socket timestamping
Like ioctl(), setsockopt()/getsockopt() have a few interfaces that pass time data:
SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING enable socket timestamping, which allows socket messages to be augmented with timestamps through cmsgs. There are two aspects involved:
The setsockopt() syscall, where timestamping options are modified for a given socket, and
The recvmsg() / sendmsg() syscalls through which actual time-related ancillary data (cmsg_data) is retrieved or provided with a given packet.
Regarding the setsockopt() syscall, SO_TIMESTAMP and SO_TIMESTAMPNS use a boolean argument, and SO_TIMESTAMPING uses an integer bitfield, none of which are sensitive to time bit-size.
On the other hand, regarding the recvmsg() / sendmsg() cmsg_data:
SO_TIMESTAMP uses a struct timeval;
SO_TIMESTAMPNS uses a struct timespec;
SO_TIMESTAMPING uses a struct scm_timestamping which contains three struct timespec fields.
The Linux kernel decides whether to use 32- or 64-bit-time variants of struct timeval, struct timespec and struct scm_timestamping based on the actual value of SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING.
To prevent incompatibilities with existing 32-bit-time code, the existing socket options would remain in place and new options SO_TIMESTAMP64, SO_TIMESTAMPNS64 and SO_TIMESTAMPING64 would be added which would use Y2038-compatible timestamps.
The GLIBC API would use either the old Y2038-incompatible set of socket options or the new Y2038-compatible set depending on __USE_TIME_BITS64, and provide the matching 'struct timeval' and 'struct timespec' so that application source code works with both 32-bit and 64-bit time.
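The selection between the old and new option sets could be sketched as follows. All names and numbers here are assumptions made for illustration: DEMO_USE_TIME_BITS64 stands in for __USE_TIME_BITS64, and the option values are placeholders, not the numbers used by any kernel.

```c
/* Placeholder option numbers, for illustration only: they are not the
   values used by any kernel. */
#define DEMO_SO_TIMESTAMP32 1001  /* existing, Y2038-incompatible option */
#define DEMO_SO_TIMESTAMP64 1002  /* proposed, Y2038-compatible option */

/* DEMO_USE_TIME_BITS64 stands in for __USE_TIME_BITS64. */
#ifdef DEMO_USE_TIME_BITS64
# define DEMO_SO_TIMESTAMP DEMO_SO_TIMESTAMP64
#else
# define DEMO_SO_TIMESTAMP DEMO_SO_TIMESTAMP32
#endif

/* Application code uses one name; the header picks the matching number. */
static int demo_timestamp_option(void)
{
    return DEMO_SO_TIMESTAMP;
}
```

Because the option number differs, the kernel can tell which struct layout the caller expects in the ancillary data without inspecting any size.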
Socket time-outs
SO_RCVTIMEO and SO_SNDTIMEO are socket options for controlling socket time-outs. They are passed along with a pointer to a struct timeval and another argument specifying the struct's length.
This length could be used to determine whether a Y2038-incompatible or a Y2038-compatible struct timeval was passed, but there is no way to distinguish a struct timeval by size between a true 64-bit kernel and a 32-bit kernel with 64-bit time support. To avoid risks, these options would be handled the same way as SO_TIMESTAMP*, using new numbers for the flags.
NOTE: at the moment, GLIBC defines absolutely no SO_TIMESTAMP* macros. To introduce time-size-insensitive versions discussed above, these would have to be introduced in bits/sockets.h.
IOCTLs
Some Linux IOCTLs may be Y2038-incompatible, or use types defined by glibc that do not match the kernel internal types. Known important cases are:
An ioctl command number is defined by the kernel using the _IOR/_IOW/_IOWR macros with a structure whose size changes based on glibc's time_t.
- The kernel can handle these transparently by implementing handlers for both command numbers with the correct structure format. glibc can keep a single implementation for both 32-bit and 64-bit time cases.
The ABI changes based on the glibc time_t size, but the command number does not change.
In such cases, a new command number should be defined when __USE_TIME_BITS64 is set, and the situation becomes similar to the previous case.
An ioctl command passes time information in a structure that is not based on time_t but another integer type that does not get changed.
- [AA - which IOCTLs would that be?]
The kernel header files will provide both a new structure layout and command number when __USE_TIME_BITS64 is set.
- [AA - which IOCTLs would that be?]
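The first case above relies on the argument size being encoded in the command number. A minimal sketch, assuming a Linux system where <sys/ioctl.h> exposes _IOR and _IOC_SIZE; the magic character 'y', the command number 1 and both structures are hypothetical and do not correspond to any real driver:

```c
#include <stdint.h>
#include <sys/ioctl.h>   /* on Linux, provides _IOR and _IOC_SIZE */

/* Hypothetical ioctl argument layouts for 32-bit and 64-bit time. */
struct demo_arg_time32 { int32_t sec; int32_t nsec; };
struct demo_arg_time64 { int64_t sec; int64_t nsec; };

/* Hypothetical commands: same magic ('y') and number (1), but the
   argument size is encoded in the command, so the two values differ. */
#define DEMO_GET_TIME32 _IOR('y', 1, struct demo_arg_time32)
#define DEMO_GET_TIME64 _IOR('y', 1, struct demo_arg_time64)

/* Returns the argument size encoded in an ioctl command number. */
static unsigned int demo_cmd_size(unsigned long cmd)
{
    return _IOC_SIZE(cmd);
}
```

Since the two command values differ, a kernel handler can dispatch on the command alone and pick the matching structure format.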
GLIBC itself uses some ioctls (which can be found by running git grep 'ioctl *(' -- "*.c" | sed -e 's/^.*ioctl *([^,]*, *\([^,)]*\)[,)].*$/\1/' on the GLIBC source repository, minus some false positives). None seem to be time-sensitive.
Time-related but Y2038-compatible GLIBC API types and functions
There are parts of the GLIBC API which are time-related but are Y2038-compatible. This section lists them along with notes explaining why they need no fix for Y2038.
Time-related but Y2038-compatible API types | Notes |
struct rpc_timeval | Has a 32-bit but unsigned epoch-based field (1) |
struct tm | Does not have any epoch-related field |
struct tms | Contains ticks, not epoch-based seconds |
struct timezone | Contains a time zone offset in minutes |
(1) Hence formally Y2038-compatible, but Y2036-incompatible.
Time-related but Y2038-compatible API functions |
APIs using struct rpc_timeval |
int rtime(struct sockaddr_in *, struct rpc_timeval *, struct rpc_timeval *) |
APIs using struct tm |
char * asctime (const struct tm *) |
char * asctime_r (const struct tm *, char *) |
size_t strftime (char *, size_t, const char *, const struct tm *) |
size_t wcsftime (wchar_t *, size_t, const wchar_t *, const struct tm *) |
char * strptime (const char *, const char *, struct tm *) |
struct tm * getdate (const char *) |
int getdate_r (const char *, struct tm *) |
APIs using struct sysinfo |
int sysinfo (struct sysinfo *) |
APIs using struct tms |
clock_t times(struct tms *) |
Others |
unsigned int alarm (unsigned int) -- Strict Posix limits argument to UINT_MAX |
unsigned int sleep (unsigned int) -- Strict Posix limits argument to UINT_MAX |
clock_t clock(void) |
implementation
A strong constraint is support for older, already built, binaries, which means the current implementations of 32-bit time Y2038-sensitive APIs must remain unchanged.
However, these implementations will be incompatible with 64-bit time.
Support for 64-bit time will have to be provided by new implementations which will use new 64-bit time types.
Design
This section describes the design proposition for making GLIBC Y2038-compatible.
API design
In order to avoid duplicating APIs for 32-bit and 64-bit time, glibc will provide either one but not both for a given application; the application code will have to choose between 32-bit or 64-bit time support, and the same set of symbols (e.g. time_t or clock_gettime) will be provided in both cases.
The following is proposed:
User code defines _TIME_BITS=64 to get 64-bit time support instead of the legacy 32-bit time.
If glibc sees _TIME_BITS=64, then it defines __USE_TIME_BITS64 to indicate that time support is 64-bit rather than 32-bit.
This allows user code to check, by verifying whether __USE_TIME_BITS64 is defined once glibc headers are #included, that they are using a glibc which actually supports 64-bit time (or claims to).
On 64-bit systems, only 64-bit time is supported, __USE_TIME_BITS64 is defined regardless of _TIME_BITS, and the same symbols are used as they were in 32-bit time.
On 32-bit systems, if __USE_TIME_BITS64 is defined, time support is provided for 64-bit time; otherwise, it is provided for 32-bit time.
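The check described in the list above could look like the following sketch. demo_time_support is a hypothetical helper. __USE_TIME_BITS64 is the macro named in this design; whether a given glibc release defines it, and on which targets, is an assumption to verify against that release, which is why the sketch also falls back on sizeof(time_t).

```c
#include <time.h>

/* Hypothetical helper: reports which time support this translation unit
   was compiled with.  __USE_TIME_BITS64 is the macro proposed in this
   design; the sizeof(time_t) test covers systems whose native time_t is
   already 64-bit. */
static const char *demo_time_support(void)
{
#ifdef __USE_TIME_BITS64
    return "64-bit time (__USE_TIME_BITS64 defined)";
#else
    return sizeof(time_t) >= 8 ? "64-bit time (native time_t)"
                               : "32-bit time";
#endif
}
```

Under this proposal, user code would request 64-bit time by compiling with -D_TIME_BITS=64 and could then call such a helper to confirm what it actually got.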
For instance, providing 64-bit time support for the time() function would result in the following:
on a 64-bit system, glibc would provide an already Y2038-compatible time_t and a Y2038-compatible time_t time (time_t *result)
on a 32-bit system with 32-bit time support glibc would continue providing the existing Y2038-incompatible 32-bit time_t and time_t time (time_t *result)
on a 32-bit system with 64-bit time support glibc would provide a Y2038-compatible 64-bit time type __time64_t and a Y2038-compatible __time64_t __time64 (__time64_t *result) (1)
and the header file would define time_t as __time64_t and redirect time to __time64.
In other words, which of the 32-bit or 64-bit time implementations of an API gets called will depend on __USE_TIME_BITS64: if __USE_TIME_BITS64 is undefined, then GLIBC API header files provide the existing 32-bit time-related types and function names, whereas if __USE_TIME_BITS64 is defined, then the header files define 64-bit time types and redirect the APIs to the new implementations.
(1) BUT time() and stime() might still be unable to handle time_t values past Y2038.
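The header-level redirection for time() could be sketched as below. Every demo_ name is hypothetical: demo_time32_t and demo_time64_t stand in for the legacy time_t and __time64_t, DEMO_USE_TIME_BITS64 stands in for __USE_TIME_BITS64, and the clock is a fixed value so that the example is deterministic. The sketch uses plain #define, as in the Constraints section; it is not a statement of how glibc headers actually perform the redirection.

```c
#include <stdint.h>

typedef int32_t demo_time32_t;   /* stands in for the legacy time_t */
typedef int64_t demo_time64_t;   /* stands in for __time64_t */

/* Fixed clock value so the example is deterministic:
   1000000000 is 2001-09-09 01:46:40 UTC. */
static const demo_time64_t demo_now = 1000000000;

/* New 64-bit-time implementation. */
static demo_time64_t demo_time64(demo_time64_t *result)
{
    if (result)
        *result = demo_now;
    return demo_now;
}

/* Legacy 32-bit-time entry point, kept for existing binaries. */
static demo_time32_t demo_time32(demo_time32_t *result)
{
    demo_time32_t t = (demo_time32_t)demo_time64(0);  /* in range here */
    if (result)
        *result = t;
    return t;
}

/* What the public header would do: one set of names, two possible targets. */
#ifdef DEMO_USE_TIME_BITS64
typedef demo_time64_t demo_time_t;
# define demo_time demo_time64
#else
typedef demo_time32_t demo_time_t;
# define demo_time demo_time32
#endif
```

Application code only ever writes demo_time_t and demo_time(); recompiling with the macro defined switches both the type and the function without any source change.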
API types
This section lists examples of Y2038-compatible types with their rationale.
time_t
time_t is a 32-bit signed Epoch-related count of seconds, and is Y2038-incompatible because Y2038 is the maximum positive value of time_t.
Y2038-compatible APIs will need a type which can hold Epoch-related counts of seconds beyond Y2038; the most sensible choice for this type is 64-bit signed integer, because:
- 64-bit Epoch-based time in seconds is already what 64-bit systems use, and
- all valid 32-bit Epoch-based time values are also valid once sign-extended to 64-bit and have the same semantics.
Note that Posix currently does not mandate any specific time representation for handling the Y2038 problem, and while it considers a 64-bit time_t approach, it also considers just mandating that a future date be representable, leaving implementers free to choose the details of said representation.
struct timespec
struct timespec contains two fields, a time_t called tv_sec and a long called tv_nsec
For 64-bit time, the time_t should be replaced by a __time64_t in order for post-Y2038 dates to be represented.
On the other hand, on 64-bit systems, both fields are 64 bits (which makes sense as a long is 64 bit on these systems), and it would make sense to use the same physical layout for 32-bit systems handling 64-bit time.
The Posix option(s)
The long type of tv_nsec is mandated by Posix. Also, even for 64-bit time, the number of nanoseconds within a second does not need more than 32 bits of representation, and could remain unchanged. Therefore, there are arguments for keeping tv_nsec a long even for 64-bit time.
Care must be taken that much user code which initializes struct timespec variables assumes the tv_sec and tv_nsec fields are declared respectively first and second, e.g.
struct timespec one_second_and_a_half = { 1, 500000000 };
This means that in the API, any intermediate padding should be ignored as far as initializers go, which can be achieved with anonymous bitfields; in the implementation, the same padding will need a name in order to be accessible for extension from 32 to 64 bits or reduction from 64 to 32 bits. Therefore a possible implementation would be:
For little-endian systems (where padding would not occur between the fields), for both API and implementation:
struct __timespec64 { __time64_t tv_sec; long tv_nsec; long padding; /* or an anonymous bitfield */ };
For big-endian systems (where padding would occur between the fields) :
in API:
struct __timespec64 { __time64_t tv_sec; int : 32; /* anonymous bitfield required for '= { tv_sec,tv_nsec}' initializers to work */ long tv_nsec; };
in implementation:
struct __timespec64 { __time64_t tv_sec; int padding: 32; long tv_nsec; };
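The effect of the anonymous bitfield on positional initializers can be checked with a small sketch. The struct name is hypothetical, and int32_t is used in place of long so that the layout matches a 32-bit system regardless of the host; the sketch also assumes a 32-bit int.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical big-endian API layout.  int32_t stands in for the 32-bit
   long of a 32-bit system so the sketch behaves the same on any host. */
struct demo_timespec64_be {
    int64_t tv_sec;
    int : 32;          /* unnamed bit-field: skipped by initializers */
    int32_t tv_nsec;
};

/* Uses the two-value initializer that existing user code relies on. */
static struct demo_timespec64_be demo_one_second_and_a_half(void)
{
    struct demo_timespec64_be t = { 1, 500000000 };
    return t;
}
```

The second initializer value lands in tv_nsec, not in the padding, because unnamed members do not take part in initialization in C.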
Similar to time_t, the existing struct timespec would remain in use in the implementation, and the public struct timespec would be either the implementation's struct timespec or an alias of the struct __timespec64 depending on _TIME_BITS==64.
non-zero paddings
Existing (and probably future) application code will not ensure that padding is zero in the struct timespec values it uses. If such a struct timespec is passed to, or handled by, the kernel as a true 64-bit struct timespec (for instance because the application code and GLIBC run on top of a 64-bit kernel), then there is a risk that the 64-bit tv_nsec field contains an illegal value even though its 32 lower bits were properly set.
Ensuring that the padding in a struct timespec is zero must be performed by the kernel or GLIBC.
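One way GLIBC could neutralize non-zero padding is to copy the fields into a separate kernel-side structure before the syscall. This is only a sketch under assumed layouts: both struct names are hypothetical, and the user-side struct models the little-endian case with an explicit padding member.

```c
#include <stdint.h>

/* Hypothetical little-endian user-side layout: a 32-bit tv_nsec followed
   by padding that the application may leave uninitialized. */
struct demo_user_timespec64 {
    int64_t tv_sec;
    int32_t tv_nsec;
    int32_t padding;
};

/* Hypothetical kernel-side layout with a true 64-bit tv_nsec. */
struct demo_kernel_timespec64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Builds the kernel view from the user view; the padding is never read. */
static struct demo_kernel_timespec64
demo_to_kernel(const struct demo_user_timespec64 *u)
{
    struct demo_kernel_timespec64 k;
    k.tv_sec = u->tv_sec;
    k.tv_nsec = u->tv_nsec;   /* sign-extends the 32-bit value */
    return k;
}
```

Whatever garbage the application left in the padding, the kernel then sees a valid 64-bit tv_nsec.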
The non-Posix option
On the other hand, since on 64-bit systems, both fields are already 64 bits, it would make sense to use the same physical layout for 32-bit systems handling 64-bit time, and have an int64_t tv_nsec: this would ease exchanging struct timespec values between GLIBC and the kernel. The definition would then be:
struct timespec64 { __time64_t tv_sec; int64_t tv_nsec; };
struct timeval
struct timeval contains two fields, a time_t called tv_sec and a suseconds_t called tv_usec
The time_t should be replaced by a __time64_t in order for post-Y2038 dates to be represented.
The suseconds_t type of tv_usec is mandated by Posix to be able to hold numbers in the range [-1, 1000000], but the actual size is left to implementers.
As there is no mandated size for suseconds_t, and as on 64-bit systems it is 64 bits, it would make sense to use 64 bits on 32-bit systems handling 64-bit time.
Therefore a possible implementation would be
struct __timeval64 { __time64_t tv_sec; int64_t tv_usec; };
for both little- and big-endian systems.
However, for the sake of maximum performance, tv_usec could be defined as a signed 32-bit field with a signed 32-bit padding, similar to struct timespec, in order to allow operations involving tv_usec to use 32-bit arithmetic rather than the slower 64-bit arithmetic; but then, glibc would have to sign-extend tv_usec into its padding in every struct timeval received from application code.
Similar to time_t and struct timespec, the existing struct timeval would be used directly in the implementation, or made an alias of struct __timeval64 depending on _TIME_BITS==64.
GLIBC types vs kernel types
Some type names, such as struct timespec, are declared in both GLIBC and the kernel.
This poses a problem when, for instance, GLIBC must implement an API receiving a GLIBC 'struct timespec' by passing a kernel 'struct timespec' to a kernel syscall; the same implementation cannot use the two different types with the same name.
The solution chosen is to redefine the kernel types in an internal GLIBC header, similar to what is done for struct stat in sysdeps/unix/sysv/linux*/kernel_stat.h.
For instance, corresponding to the GLIBC struct timespec, there would be one or several sysdeps/unix/sysv/linux*/time_spec.h defining struct kernel_time_spec and struct kernel_time_spec64.
API functions
This section lists examples of Y2038-compatible APIs with their rationale.
difftime()
double difftime(time_t time1, time_t time0) takes two date/time values and returns their difference.
From an API standpoint, it is Y2038-incompatible because :
dates beyond Y2038 cannot be passed to it through time1 or time0;
if time1 and time0 could hold 64-bit time, then their difference might not be able to fit in a double
There is therefore a need for a 64-bit counterpart to difftime. Argument types are obvious:
double difftime64(__time64_t time1, __time64_t time0)
However, the double result type might not hold enough fraction bits for a 64-bit difference, which could cause imprecision of the result. In order to keep the function as precise as possible, its result would have to be extended:
long double difftime64(__time64_t time1, __time64_t time0)
But then, application code using this API would have to be written differently for 32-bit and 64-bit time cases.
Here, we will assume that difftime is used to compute time differences which will never be bigger than what a double can hold in its fraction (which, for a 64-bit GCC, is empirically about 15 decimal digits, which allows encoding about 999 999 999 999 999 seconds, over 31 million years).
We will thus choose this definition:
double difftime64(__time64_t time1, __time64_t time0)
The existing difftime implementation would be turned into a wrapper around difftime64, and later, when adding _TIME_BITS support to the API, difftime would either be made a public symbol or be mapped to difftime64, depending on _TIME_BITS==64.
Thus, since time_t would also be either time_t or __time64_t depending on _TIME_BITS==64, the API's difftime (time_t, time_t) would actually be implemented as difftime32 (time_t, time_t) when _TIME_BITS is not equal to 64, or as difftime64 (__time64_t, __time64_t) when _TIME_BITS==64.
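The wrapper arrangement for difftime could be sketched as follows. The demo_ functions are hypothetical stand-ins for difftime64 and difftime32, with int64_t and int32_t in place of __time64_t and the legacy time_t. The subtraction is deliberately simplistic and is not glibc's actual algorithm.

```c
#include <stdint.h>

/* Hypothetical stand-in for difftime64.  Each operand is converted to
   double before subtracting; each conversion is exact while the value
   fits in double's 53-bit significand. */
static double demo_difftime64(int64_t time1, int64_t time0)
{
    return (double)time1 - (double)time0;
}

/* Hypothetical stand-in for the legacy entry point: a thin wrapper that
   widens its arguments and defers to the 64-bit implementation. */
static double demo_difftime32(int32_t time1, int32_t time0)
{
    return demo_difftime64(time1, time0);
}
```

Any difference between two 32-bit times is below 2^33 in magnitude, so the wrapper returns exact results over the whole legacy range.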
clock_gettime()
As far as the API is concerned, clock_gettime(clockid_t, struct timespec *) would be duplicated the same way as difftime, with a 32-bit-callable clock_gettime32(clockid_t, struct timespec32 *) and a 64-bit-callable clock_gettime64(clockid_t, struct timespec64 *), plus a clock_gettime symbol defined as either clock_gettime32 or clock_gettime64.
Also, as GLIBC should work on 32-bit-time as well as 64-bit-time kernels, both clock_gettime32 and clock_gettime64 would have to determine whether the kernel is Y2038-compatible and thus provides 64-bit clock_gettime/clock_settime syscalls, and try these first; only if they failed with ENOSYS would GLIBC try the 32-bit syscalls.
The implementations would also have to handle the cases where:
A 32-bit client wants to get the time but the kernel is Y2038-compatible and the current date is post-Y2038 and thus cannot fit in a 32-bit time_t: in this case, clock_gettime should return -1 and set errno to EOVERFLOW.
A 64-bit client wants to set the time to a date post-Y2038 but the kernel is not Y2038-compatible: in this case, clock_settime should return -1 and set errno to EOVERFLOW.
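The fallback and overflow rules for the "get" direction can be sketched as below. This is not glibc code: the kernel is modelled by hypothetical function pointers returning 0 or a negated errno value, the three demo_sys* functions are fake kernels included only to exercise the logic, and all struct and function names are assumptions. The clock_settime case is not covered.

```c
#include <errno.h>
#include <stdint.h>

struct demo_timespec32 { int32_t tv_sec; int32_t tv_nsec; };
struct demo_timespec64 { int64_t tv_sec; int64_t tv_nsec; };

/* Hypothetical stand-ins for the kernel entry points: each returns 0 on
   success or a negated errno value, like a raw syscall. */
typedef int (*demo_sys64_fn)(struct demo_timespec64 *);
typedef int (*demo_sys32_fn)(struct demo_timespec32 *);

/* Fake kernels used to exercise the logic. */
static int demo_sys64_post_y2038(struct demo_timespec64 *ts)
{
    ts->tv_sec = 4102444800LL;  /* 2100-01-01 00:00:00 UTC */
    ts->tv_nsec = 0;
    return 0;
}

static int demo_sys64_missing(struct demo_timespec64 *ts)
{
    (void)ts;
    return -ENOSYS;             /* kernel without 64-bit time syscalls */
}

static int demo_sys32_pre_y2038(struct demo_timespec32 *ts)
{
    ts->tv_sec = 1000000000;    /* 2001-09-09 01:46:40 UTC */
    ts->tv_nsec = 0;
    return 0;
}

/* 64-bit time: try the 64-bit syscall first, fall back only on ENOSYS. */
static int demo_clock_gettime64(demo_sys64_fn sys64, demo_sys32_fn sys32,
                                struct demo_timespec64 *out)
{
    int r = sys64(out);
    if (r == -ENOSYS) {
        struct demo_timespec32 t32;
        r = sys32(&t32);
        if (r == 0) {
            out->tv_sec = t32.tv_sec;
            out->tv_nsec = t32.tv_nsec;
        }
    }
    if (r < 0) {
        errno = -r;
        return -1;
    }
    return 0;
}

/* 32-bit time: wrap the 64-bit path and report EOVERFLOW when needed. */
static int demo_clock_gettime32(demo_sys64_fn sys64, demo_sys32_fn sys32,
                                struct demo_timespec32 *out)
{
    struct demo_timespec64 t64;
    if (demo_clock_gettime64(sys64, sys32, &t64) != 0)
        return -1;
    if (t64.tv_sec < INT32_MIN || t64.tv_sec > INT32_MAX) {
        errno = EOVERFLOW;
        return -1;
    }
    out->tv_sec = (int32_t)t64.tv_sec;
    out->tv_nsec = (int32_t)t64.tv_nsec;
    return 0;
}
```

With a Y2038-compatible fake kernel reporting a date in 2100, the 64-bit path succeeds while the 32-bit path fails with EOVERFLOW; with the 64-bit syscall missing, both paths fall back to the 32-bit syscall.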
implementation design
**UPDATE** following the first non-RFC submission in June 2018, it has been indicated that some of the Y2038 patches should also end up in Gnulib, the process being that patches are submitted to GLIBC but reviewed in light of their future integration in both GLIBC and Gnulib.
Therefore the following sections may be meant for Gnulib as well as GLIBC.
Since GLIBC must implement 64-bit time support in addition to the existing 32-bit time support, implementation should consist in the addition of:
- new implementation types
- new implementation functions
However, to avoid maintaining two sets of implementations, when introducing a 64-bit-time implementation, the corresponding 32-bit-time implementation would be made a wrapper around the 64-bit-time one.
For architectures where time_t is already 64-bit, the 'new' implementation would also be added, and used, by making __time64_t an alias of time_t (ditto for other types), and the 'old' wrapper would be useless; therefore it would be placed under conditional compilation.
new implementation types
__time_t / __time64_t
Internally, GLIBC uses __time_t as its 'native size time_t'. For 32-bit GLIBC builds, it is 32 bits; for 64-bit builds, it is 64 bits. We therefore need a type which is 64 bits in 32-bit builds. This type is declared as __time64_t.
Note: Gnulib does not define time_t, and relies entirely on time_t being provided by the underlying system and libraries (possibly GLIBC).
struct __timespec64
Similarly to __time_t, struct timespec needs a 64-bit counterpart. This counterpart is named struct __timespec64, and has two variants, based on the architecture endianness.
However, contrary to __time64_t, struct __timespec64 has different public and private definitions, so that the padding (needed to meet both Posix and Linux constraints) is only accessible in the private definition.
struct __itimerspec64
The struct itimerspec 64-bit counterpart is named struct __itimerspec64.
new implementation functions
This section lists some of the Y2038-compatible API function implementations in GLIBC.
It is intended as an example, and the implementations described hereafter are chosen as typical cases.
kernel-independent implementations - difftime()
To avoid introducing any bug in the Y2038-proof form of difftime, the original code does not make any assumptions on the nature of __time64_t.
Note must be made however that the return value of __difftime64() is a double, which *may* not be sufficient to hold all 64-bit time difference values. However, changing this return value might result in failure to properly compile existing application source code.
Additional internal changes
Some of GLIBC's implementation uses time_t internally, and may thus be Y2038-incompatible and require changes in order to become Y2038-compatible.
The following table lists functions which may have to be modified to be Y2038-compatible.
Note: this table was generated automatically by looking for functions which contain Y2038-incompatible type names; there may be false positives. These will be weeded out while going through the table to fix the actually Y2038-incompatible functions.
Y2038-incompatible internal functions |
addgetnetgrentX(struct database_dyn *, int, int *, const char *, uid_t, struct hashentry *, struct datahead *, struct dataset **) |
addgrbyX(struct database_dyn *, int, request_header *, union keytype, const char *, uid_t, struct hashentry *, struct datahead *) |
addhstaiX(struct database_dyn *, int, request_header *, void *, uid_t, struct hashentry *const, struct datahead *) |
addhstbyX(struct database_dyn *, int, request_header *, void *, uid_t, struct hashentry *, struct datahead *) |
addinitgroupsX(struct database_dyn *, int, request_header *, void *, uid_t, struct hashentry *const, struct datahead *) |
addinnetgrX(struct database_dyn *, int, int *, char *, uid_t, struct hashentry *, struct datahead *) |
addpwbyX(struct database_dyn *, int, request_header *, union keytype, const char *, uid_t, struct hashentry *, struct datahead *) |
addservbyX(struct database_dyn *, int, request_header *, char *, uid_t, struct hashentry *, struct datahead *) |
bigtime_test(int) |
cache_addgr(struct database_dyn *, int, request_header *, const void *, struct group *, uid_t, struct hashentry *const, struct datahead *, int) |
cache_addhst(struct database_dyn *, int, request_header *, const void *, struct hostent *, uid_t, struct hashentry *const, struct datahead *, int, int32_t) |
cache_addpw(struct database_dyn *, int, request_header *, const void *, struct passwd *, uid_t, struct hashentry *const, struct datahead *, int) |
cache_addserv(struct database_dyn *, int, request_header *, const void *, struct servent *, uid_t, struct hashentry *const, struct datahead *, int) |
ctime_r(const time_t *, char *) |
dbg_log(const char *, ...) |
do_notfound(struct database_dyn *, int, int *, const char *, struct dataset **, ssize_t *, time_t *, char **) |
do_prepare(int, char **) |
do_test() |
evConsTime(struct timespec *, time_t, long) |
first_shoot(const_nis_name, directory_obj *) |
ftime(struct timeb *) |
__getdate_r(const char *, struct tm *) |
__get_nprocs() |
getopt(int, char *const *, const char *) |
__gettimeofday(struct timeval *, struct timezone *) |
__gmtime_r(const time_t *, struct tm *) |
guess_time_tm(long_int, long_int, int, int, int, const time_t *, const struct tm *) |
help_filter(int, const char *, void *) |
hunt(timezone_t, char *, time_t, time_t) |
libc_freeres_ptr(time_t *) |
__localtime_r(const time_t *, struct tm *) |
main(int, char **) |
main_loop_poll() |
mktime_internal(struct tm *, struct tm *(*)(const time_t *, struct tm *), time_t *) |
mktime_test1(time_t) |
mktime_test(time_t) |
my_gmtime_r(time_t *, struct tm *) |
my_localtime_r(const time_t *, struct tm *) |
nscd_run_prune(void *) |
__offtime(const time_t *, long, struct tm *) |
quantize_timeval(struct timeval *) |
ranged_convert(struct tm *(*)(const time_t *, struct tm *), time_t *, struct tm *) |
restart() |
restart_p(time_t) |
show(timezone_t, char *, time_t, int) |
__sleep(unsigned int) |
spring_forward_gap() |
start_threads() |
__strptime_internal(const char *, const char *, struct tm *, void *) |
subtract(time_t, time_t) |
tformat() |
thread_burn_any_cpu(void *) |
time_t_add_ok(time_t, time_t) |
time_t_avg(time_t, time_t) |
time_t_int_add_ok(time_t, int) |
__tzfile_compute(time_t, int, long *, int *, struct tm *) |
__tzfile_read(const char *, int, char **) |
utime(const char *, const struct utimbuf *) |
verify_persistent_db(void *, struct database_pers_head *, int) |
xcalloc(int, int) |
ydhms_diff(long_int, long_int, int, int, int, int, int, int, int, int) |
yeartot(intmax_t) |
zdump_localtime_rz(timezone_t, time_t *, struct tm *) |
Implementation
Based on Arnd Bergmann's Linux Y2038 patches, a GLIBC Y2038 branch is being maintained as a Work-In-Progress implementation of the above.
This branch can be found at https://sourceware.org/git/?p=glibc.git;a=log;h=refs/heads/aaribaud/y2038-2.26
The WIP branch for the upcoming second RFC can be found at https://sourceware.org/git/?p=glibc.git;a=log;h=refs/heads/aaribaud/y2038-2.26-rfc-2
BEWARE -- rebases will happen!
General implementation
Implementation involves:
- Extending the API with Y2038-compatible types;
- Extending the API with Y2038-compatible function prototypes using the new types;
- Adding new Y2038-compatible implementations for the new function prototypes;
- Modifying existing implementations to use Y2038-compatible time by turning them into wrappers around the new implementations;
- Modifying internal functions to use the new types and functions;
- Modifying the API to map API type and function names to either the original 32-bit-time ones or the new 64-bit time ones.
The last action is the one which changes the actual API provided by glibc (but only when __TIME_BITS==64). This is a global choice for a given compilation: either all of the API uses Y2038-proof time, or none of it does.
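The wrapper and mapping steps listed above can be sketched with difftime as the example. All names here are hypothetical: SKETCH_TIME_BITS stands in for the compile-time selection macro, and the demo_ types and functions stand in for the 32-bit-time and 64-bit-time variants; none of them is the actual GLIBC code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the global compile-time choice.  */
#ifndef SKETCH_TIME_BITS
# define SKETCH_TIME_BITS 64
#endif

static_assert (SKETCH_TIME_BITS == 32 || SKETCH_TIME_BITS == 64,
               "SKETCH_TIME_BITS must be 32 or 64");

/* Original 32-bit-time type and new Y2038-compatible type.  */
typedef int32_t demo_time32_t;
typedef int64_t demo_time64_t;

/* New Y2038-compatible implementation (simplified).  */
double
demo_difftime64 (demo_time64_t t1, demo_time64_t t0)
{
  return (double) t1 - (double) t0;
}

/* Existing entry point, turned into a wrapper: the 32-bit arguments
   are widened and the 64-bit implementation does the work.  */
double
demo_difftime32 (demo_time32_t t1, demo_time32_t t0)
{
  return demo_difftime64 (t1, t0);
}

/* API mapping: the public names resolve to one family or the other,
   for the whole compilation.  */
#if SKETCH_TIME_BITS == 64
typedef demo_time64_t demo_time_t;
# define demo_difftime demo_difftime64
#else
typedef demo_time32_t demo_time_t;
# define demo_difftime demo_difftime32
#endif
```

Application code written against demo_time_t and demo_difftime compiles unchanged in either mode; only the selection macro decides which implementation it reaches.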
In order to make the changes manageable for review, they are split as follows:
- one commit for each new API type
- one commit for each new API function, including the prototype and implementation of the new function and the turning of existing implementation into a wrapper
- one commit for each minimal internal function change (a minimal change may modify more than one function and its callers).
- one last commit for the __TIME_BITS==64 support in the whole API
Detailed implementation
The following table lists all types, implementations, and internal functions in the change series, along with their current status.
The GLIBC and Gnulib columns indicate the status of the change relative to each project, using the following convention:
- Submitted: at least one patch has been submitted to the project, but it needs review and/or rework
- Acceptable: the last patch submitted has been reviewed and found acceptable for integration
- Integrated: the type, implementation or internal function has been integrated in the project
- N/A: the type, implementation or internal function does not apply to this project
- An empty cell means the type, implementation or internal function has not been submitted for review yet
Symbol | Nature | GLIBC | Gnulib |
__time64_t | type | Submitted | N/A |
__tz_convert | internal function | Submitted | |
__mktime_internal | internal function | | |
__difftime64 | implementation | Submitted | |
__y2038_get_kernel_support | internal function | Submitted | |
__y2038_set_kernel_support | internal function | Submitted | |
__timespec64 | type | Submitted | |
__clock_gettime64 | implementation | Submitted | |
__clock_settime64 | implementation | Submitted | |
__clock_getres_time64 | implementation | Submitted | |
__clock_nanosleep64 | implementation | | |
__timespec_get64 | implementation | | |
__utimensat_time64 | implementation | | |
__futimens64 | implementation | | |
__sigtimedwait_time64 | implementation | | |
__timeval64 | type | | |
__futimes64 | implementation | | |
__lutimes64 | implementation | | |
__itimerspec64 | type | | |
__timer_gettime64 | implementation | | |
__timer_settime64 | implementation | | |
__timerfd_gettime64 | implementation | | |
__timerfd_settime64 | implementation | | |
__stat64_t64 | type | | |
__fstat64_time64 (and __fxstat64_time64) | implementation | | |
__stat64_time64 (and __xstat64_time64) | implementation | | |
__lstat64_time64 (and __lxstat64_time64) | implementation | | |
__fstatat64_time64 (and __fxstatat_time64) | implementation | | |
__gettimeofday64 | implementation | | |
__settimeofday64 | implementation | | |
__time64 | implementation | | |
__stime64 | implementation | | |
__utimes64 | implementation | | |
__mq_timedreceiv_time64 | implementation | | |
__mq_timedsend_time64 | implementation | | |
__msgctl64 | implementation | | |
__sched_rr_get_interval64 | implementation | | |
__nanosleep64 | implementation | | |
__adjtime64, __adjtimex64 and __ntp_adjtime64 | implementation | | |
__utime64 | implementation | | |
__itimerval64 | type | | |
__getitimer64 | implementation | | |
__setitimer64 | implementation | | |
functions using futexes | implementation | | |
__getrusage64 | implementation | | |
__ntp_timeval64 | type | | |
__ntp_gettime64 | implementation | | |
__ntp_gettimex64 | implementation | | |
__timex64 | type | | |
pselect64 | implementation | | |
select64 | implementation | | |
__clntudp_create64 | implementation | | |
__clntudp_bufcreate64 | implementation | | |
__pmap_rmtcall64 | implementation | | |
_TIME_BITS | macro | | |
RPC implementation
RPC has three API functions which use struct timeval arguments. All these arguments are timeouts, not absolute dates.
Since struct timeval will be affected by the API change, these three functions need to be implemented for 64-bit time.
However, since they receive only timeouts, and since it is reasonable to assume that no RPC timeout will be greater than 2**31 seconds, these implementations will just:
- check that their 64-bit-time timeout argument does fit in a signed 32-bit time representation, or return an error;
- convert the 64-bit timeout into a 32-bit timeout;
- call their corresponding 32-bit implementation with the 32-bit timeout and pass the return value back to their caller.
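The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the struct and function names are hypothetical, the stand-in 32-bit function returns an int rather than the real RPC return types, and the use of EOVERFLOW as the error is an assumption, since the text only says 'return an error'.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical 32-bit-time and 64-bit-time timeout structures.  */
struct sketch_timeval32 { int32_t tv_sec; int32_t tv_usec; };
struct sketch_timeval64 { int64_t tv_sec; int64_t tv_usec; };

/* Stand-in for an existing 32-bit-time RPC implementation; it only
   reports success (0) for a non-negative timeout.  */
static int
sketch_rpc_call32 (struct sketch_timeval32 timeout)
{
  return timeout.tv_sec >= 0 ? 0 : -1;
}

/* 64-bit-time variant: check, convert, then delegate.  */
int
sketch_rpc_call64 (struct sketch_timeval64 timeout)
{
  /* Step 1: the timeout must fit a signed 32-bit representation.  */
  if (timeout.tv_sec < INT32_MIN || timeout.tv_sec > INT32_MAX
      || timeout.tv_usec < INT32_MIN || timeout.tv_usec > INT32_MAX)
    {
      errno = EOVERFLOW;        /* Assumed error code.  */
      return -1;
    }

  /* Step 2: convert the 64-bit timeout into a 32-bit timeout.  */
  struct sketch_timeval32 t32
    = { (int32_t) timeout.tv_sec, (int32_t) timeout.tv_usec };
  assert (t32.tv_sec == timeout.tv_sec);   /* Nothing was lost.  */

  /* Step 3: call the 32-bit implementation and pass its result back.  */
  return sketch_rpc_call32 (t32);
}
```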
Issues
Build-time kernel/glibc incompatibilities
There is an open problem whereby incompatible kernel and glibc headers might be used at build time, e.g. new, 64-bit-time-enabled, kernel headers and old, 32-bit-time-only-capable, glibc headers.
In this case, there will be a mismatch between what the glibc uses and what the kernel expects; for instance, the kernel will expect a 64-bit time_t while glibc will use a 32-bit time_t.
The kernel cannot detect this at all, as kernel headers should not rely on glibc defines.
Older (pre-Y2038-support) glibcs cannot detect this either.
Such a detection will need to rest on an application-level mechanism.
Run-time kernel/glibc incompatibilities
Whatever time_t size the client code asks for, the glibc code will rely on the running kernel which might provide only 32-bit time, or only 64-bit time, or both (providing neither is impossible).
Therefore, glibc Y2038-compatible functions must try and use 64-bit syscalls, which may fail with ENOSYS, in which case they should fall back to 32-bit syscalls, which may also fail with ENOSYS (but then the 64-bit syscall would have succeeded).
In either case, 64-bit time 'rvalues' may not fit in 32-bit time 'lvalues'. For instance, calling the 64-bit-time gettimeofday() on behalf of a 32-bit-time application will work until Y2038, at which point glibc will have to return EOVERFLOW. Conversely, a 64-bit application can call settimeofday() on a 32-bit-time kernel until Y2038, at which point glibc will return EOVERFLOW.
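The fallback and overflow behaviour described above can be sketched with a simulated kernel. Everything here is hypothetical: struct sketch_kernel and the sketch_sys_ functions merely imitate syscalls which may fail with ENOSYS, and the getters are not the GLIBC implementations.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simulated kernel: which time syscalls it provides, and its clock.  */
struct sketch_kernel { int has_time64; int has_time32; int64_t now; };

static int
sketch_sys_gettime64 (const struct sketch_kernel *k, int64_t *out)
{
  if (!k->has_time64) { errno = ENOSYS; return -1; }
  *out = k->now;
  return 0;
}

static int
sketch_sys_gettime32 (const struct sketch_kernel *k, int32_t *out)
{
  if (!k->has_time32) { errno = ENOSYS; return -1; }
  *out = (int32_t) k->now;   /* Simulation only: clock assumed in range.  */
  return 0;
}

/* 64-bit-time getter: try the 64-bit syscall, fall back on ENOSYS.  */
int
sketch_gettime64 (const struct sketch_kernel *k, int64_t *out)
{
  assert (k->has_time64 || k->has_time32);  /* Neither is impossible.  */
  if (sketch_sys_gettime64 (k, out) == 0)
    return 0;
  if (errno != ENOSYS)
    return -1;
  int32_t t32;
  if (sketch_sys_gettime32 (k, &t32) != 0)
    return -1;
  *out = t32;                /* Widening a 32-bit time always fits.  */
  return 0;
}

/* 32-bit-time getter: reuse the 64-bit path, then narrow or fail.  */
int
sketch_gettime32 (const struct sketch_kernel *k, int32_t *out)
{
  int64_t t64;
  if (sketch_gettime64 (k, &t64) != 0)
    return -1;
  if (t64 < INT32_MIN || t64 > INT32_MAX)
    {
      errno = EOVERFLOW;     /* 64-bit rvalue does not fit the lvalue.  */
      return -1;
    }
  *out = (int32_t) t64;
  return 0;
}
```

A setter would follow the mirror-image pattern: narrow the 64-bit value before a 32-bit syscall and fail with EOVERFLOW when it does not fit.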
utmp types and APIs
The issue of utmp is distinct from the issue of time_t size, because utmp is essentially unrelated to the kernel, and essentially concerned with the structure of the utmp, wtmp and btmp files.
In order to support several glibcs running on the same file system, the utmp/wtmp/btmp file format must be defined independently from the architecture word size and endianness.
On the other hand, the utmp API must conform to POSIX, which means the argument sizes and endianness will vary from architecture to architecture.
In addition, there must be a way to transition from a system's current format to the future single format for the utmp, wtmp and btmp files.
The proposed solution would implement the following changes:
- Introduce a Y2038-compatible structure for 'new' utmp file records, with field ut_tv.tv_sec defined as a signed 64-bit int, ideally with an explicit endianness.
- Make the API utmp[x] struct's ut_tv field a struct timeval as expected by POSIX.
- Whenever necessary, translate between struct utmp[x] values and utmp file records; when translating a 64-bit tv_sec into a 32-bit tv_sec, if the value would overflow, then set errno to EOVERFLOW and return a failure value.
- At run time, check for file utmp.trans. If it exists, then use file utmp to read and write the utmp state in 'old' format, and keep a 'new' format copy in utmp.trans. If utmp.trans does not exist, then use utmp to read and write the utmp state in the 'new' format. Also, if a 'new' GLIBC detects that changes were made in the utmp file but were not also made in the new utmp.trans file because the changes were performed by an 'old' GLIBC, then the 'new' GLIBC should copy the changes over from the utmp file to the utmp.trans file on behalf of the 'old' GLIBC.
- System upgrade from 'old' to 'new' format would be as follows: once all glibcs on the system are 'new' ones, all able to handle the 'new' format, and all system applications are ready to handle the new utmp format, then the utmp.trans file should be renamed into the utmp file (thus deleting the old utmp content).
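The record translation in the proposal above can be sketched as follows. This is an illustration only: the function names are hypothetical, the on-disk byte order is assumed to be little-endian (the proposal only asks for 'an explicit endianness'), and only the ut_tv.tv_sec field is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Store signed 64-bit seconds as 8 little-endian bytes, whatever the
   host byte order is (little-endian is an assumption of this sketch).  */
void
sketch_put_sec64le (unsigned char buf[8], int64_t sec)
{
  uint64_t u = (uint64_t) sec;
  for (int i = 0; i < 8; i++)
    buf[i] = (unsigned char) (u >> (8 * i));
}

/* Read the 8 little-endian bytes back into a signed 64-bit value.  */
int64_t
sketch_get_sec64le (const unsigned char buf[8])
{
  uint64_t u = 0;
  for (int i = 0; i < 8; i++)
    u |= (uint64_t) buf[i] << (8 * i);
  /* Portable two's-complement decoding, without relying on
     implementation-defined unsigned-to-signed conversion.  */
  return u <= INT64_MAX ? (int64_t) u : -(int64_t) ~u - 1;
}

/* Translate file-record seconds into a 32-bit API field; on overflow,
   set errno to EOVERFLOW and return a failure value.  */
int
sketch_sec64_to_sec32 (int64_t sec64, int32_t *sec32)
{
  if (sec64 < INT32_MIN || sec64 > INT32_MAX)
    {
      errno = EOVERFLOW;
      return -1;
    }
  *sec32 = (int32_t) sec64;
  assert (*sec32 == sec64);    /* The narrowing lost nothing.  */
  return 0;
}
```

Because the bytes are produced by shifts rather than by copying memory, two glibcs of different word size or endianness sharing one file system would read the same value from the same record.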
References
Y2038 information: http://kernelnewbies.org/y2038
Y2038 Linux patch series (WIP): https://lists.linaro.org/pipermail/y2038/2015-May/000233.html