DRAFT: Y2038 Proofness Design


History

Revision 24 was reviewed on the libc-alpha mailing list as 1st draft: https://www.sourceware.org/ml/libc-alpha/2015-10/msg00893.html

Revision 55 was reviewed on the libc-alpha mailing list as 2nd draft: https://www.sourceware.org/ml/libc-alpha/2016-01/msg00832.html

Revision 63 was reviewed on the libc-alpha mailing list as 3rd draft: https://www.sourceware.org/ml/libc-alpha/2016-06/msg00243.html

Revision 83 was reviewed on the libc-alpha mailing list as 4th draft: https://www.sourceware.org/ml/libc-alpha/2016-06/msg00824.html

Scope

The intent of this page is to serve as a central point for describing the Y2038 'proofness' design.

Y2038 'proofness' means that application calls to glibc-provided functions should never return wrong results when UTC times outside the range -2^31..2^31-1 seconds from the Unix Epoch are involved.

This document is only about Y2038 (and Y1901) and is not about any other time boundaries such as Y2106 (unsigned 32-bit Epoch-based times) or Y9999 (four-digit years).

For example, asctime_r might not be Y9999-proof in practice (for instance, it might overrun a too-small output buffer). However, this document is not about Y9999; it is about Y2038, for which asctime_r is safe since it does not handle Epoch-based 32-bit signed values.

Useful Definitions

Goals

Constraints

There are a number of constraints which dictate the direction of the design. They are either definite or debatable. Debatable design preclusions should be finalized before this design document leaves DRAFT.

Definite

Debatable

Analysis

This section details the problems with Y2038 and glibc.

Y2038-incompatible types and APIs

glibc provides many APIs which involve time. Among these:

This section describes the Y2038-incompatible glibc APIs, which need changes in order to become Y2038-compatible.

Y2038-incompatible types

These are types whose values include a date and which cannot properly denote dates past Y2038.

The essential example is time_t, which is a 32-bit signed integer count of seconds since the Epoch; its maximum value, 2^31-1, corresponds to 03:14:07 UTC on 19 January 2038.

By extension, any struct containing at least one time_t is not Y2038-compatible.

non-Y2038-compatible types

Types containing an Epoch-based value

time_t

Types using time_t

struct lastlog

struct msqid_ds

struct semid_ds

struct timeb

struct timespec

struct timeval

struct utimbuf

Types using struct timespec

struct itimerspec

struct stat

Types using struct timeval

struct clnt_ops

struct elf_prstatus

struct itimerval

struct ntptimeval

struct rusage

struct timex

struct utmp

struct utmpx

Y2038-incompatible APIs

APIs which do not involve any Y2038-incompatible type are Y2038-compatible, whereas APIs which involve at least one Y2038-incompatible type are Y2038-incompatible.

This also applies to APIs which involve a pointer to a Y2038-incompatible type.

Below is the list of Y2038-incompatible APIs, based on the list of Y2038-incompatible types.

Y2038-incompatible APIs

APIs using time_t

char * ctime (const time_t *)

char * ctime_r (const time_t *, char *)

double difftime (time_t, time_t)

struct tm * gmtime (const time_t *)

struct tm * gmtime_r (const time_t *, struct tm *resultp)

struct tm * localtime (const time_t *)

struct tm * localtime_r (const time_t *, struct tm *)

time_t mktime (struct tm *brokentime)

int stime (const time_t *)

time_t timegm (struct tm *brokentime)

time_t timelocal (struct tm *brokentime)

time_t time (time_t *result)

APIs using struct lastlog

(none)

APIs using struct msqid_ds

int msgctl(int, int, struct msqid_ds *)

APIs using struct semid_ds

(none)

APIs using struct timeb

int ftime(struct timeb *)

APIs using struct timespec

int aio_suspend (const struct aiocb *const *, int, const struct timespec *restrict)

int clock_getres (clockid_t, struct timespec *)

int clock_gettime (clockid_t, struct timespec *)

int clock_nanosleep (clockid_t, int, const struct timespec *, struct timespec *)

int clock_settime (clockid_t, const struct timespec *)

int futimens (int, const struct timespec *)

ssize_t mq_timedreceive (mqd_t, char *restrict, int, unsigned int *restrict, const struct timespec *restrict)

int mq_timedsend (mqd_t, const char *, int, unsigned int, const struct timespec *)

int nanosleep (const struct timespec *, struct timespec *)

int pselect (int, fd_set *restrict, fd_set *restrict, fd_set *restrict, const struct timespec *restrict, const __sigset_t *restrict)

int pthread_cond_timedwait (pthread_cond_t *restrict, pthread_mutex_t *restrict, const struct timespec *restrict)

int pthread_mutex_timedlock (pthread_mutex_t *restrict, const struct timespec *restrict)

int pthread_rwlock_timedrdlock (pthread_rwlock_t *restrict, const struct timespec *restrict)

int pthread_rwlock_timedwrlock (pthread_rwlock_t *restrict, const struct timespec *restrict)

int sched_rr_get_interval (__pid_t, struct timespec *)

int sem_timedwait (sem_t *restrict, const struct timespec *restrict)

int sigtimedwait (const sigset_t *restrict, siginfo_t *restrict, const struct timespec *restrict)

int timespec_get (struct timespec *, int)

int utimensat (int, const char *, const struct timespec *, int)

APIs using struct timeval

int adjtime (const struct timeval *, struct timeval *)

enum clnt_stat pmap_rmtcall (struct sockaddr_in *, const u_long, const u_long, const u_long, xdrproc_t, caddr_t, xdrproc_t, caddr_t, struct timeval, u_long *)

CLIENT * clntudp_bufcreate (struct sockaddr_in *, u_long, u_long, struct timeval, int *, u_int, u_int)

CLIENT * clntudp_create (struct sockaddr_in *, u_long, u_long, struct timeval, int *)

int futimes (int, const struct timeval *)

int gettimeofday (struct timeval *restrict, __timezone_ptr_t)

int lutimes (const char *, const struct timeval *)

int select (int, fd_set *restrict, fd_set *restrict, fd_set *restrict, struct timeval *restrict)

int settimeofday (const struct timeval *, const struct timezone *)

int utimes (const char *, const struct timeval *)

APIs using struct utimbuf

int utime(const char *, const struct utimbuf *)

APIs using struct itimerspec

int timerfd_gettime(int, struct itimerspec *)

int timerfd_settime(int, int, const struct itimerspec *, struct itimerspec *)

int timer_gettime(timer_t, struct itimerspec *)

int timer_settime(timer_t, int, const struct itimerspec *restrict, struct itimerspec *restrict)

APIs using struct stat

int fstatat(int, const char *restrict, struct stat *restrict, int)

int fstat(int, struct stat *)

int __fxstatat(int, int, const char *, struct stat *, int)

int __fxstat(int, int, struct stat *)

int lstat(const char *restrict, struct stat *restrict)

int __lxstat(int, const char *, struct stat *)

int stat(const char *restrict, struct stat *restrict)

int __xstat(int, const char *, struct stat *)

APIs using struct clnt_ops

(none)

APIs using struct elf_prstatus

(none)

APIs using struct itimerval

int setitimer (int which, const struct itimerval *new, struct itimerval *old)

int getitimer (int which, struct itimerval *old)

APIs using struct ntptimeval

int ntp_gettime(struct ntptimeval *)

APIs using struct rusage

int getrusage(__rusage_who_t, struct rusage *)

__pid_t wait3(int *, int, struct rusage *)

__pid_t wait4(__pid_t, int *, int, struct rusage *)

APIs using struct timex

int adjtimex (struct timex *timex)

int ntp_adjtime (struct timex *tptr)

APIs using struct utmp

int getutent_r(struct utmp *, struct utmp **)

struct utmp * getutent()

int getutid_r(const struct utmp *, struct utmp *, struct utmp **)

struct utmp * getutid(const struct utmp *)

int getutline_r(const struct utmp *, struct utmp *, struct utmp **)

struct utmp * getutline(const struct utmp *)

void login(const struct utmp *)

struct utmp * pututline(const struct utmp *)

void updwtmp(const char *, const struct utmp *)

APIs using struct utmpx

struct utmpx * getutxent()

struct utmpx * getutxid(const struct utmpx *)

struct utmpx * getutxline(const struct utmpx *)

struct utmpx * pututxline(const struct utmpx *)

Y2038-incompatible socket timestamping

Like ioctl(), setsockopt()/getsockopt() have a few interfaces that pass time data:

SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING enable socket timestamping, which allows socket messages to be augmented with timestamps through cmsgs. More precisely:

Getting Y2038-compatible timestamping affects both the kernel and glibc:

The kernel must provide Y2038-compatible timestamps. This would be done indirectly through 64-bit versions of struct timeval and struct timespec.

However, to prevent incompatibilities with existing 32-bit-time code, the existing socket options would remain in place, and new options SO_TIMESTAMP64, SO_TIMESTAMPNS64 and SO_TIMESTAMPING64 would be added, which would use Y2038-compatible timestamps.

glibc would use either the old Y2038-incompatible set of socket options or the new Y2038-compatible set depending on __USE_TIME_BITS64, and provide the correct 'struct timeval' and 'struct timespec' so that application source code works for both 32 and 64 bits time size.

SO_RCVTIMEO and SO_SNDTIMEO are socket options for controlling socket time-outs. They are passed along with a pointer to a struct timeval and another argument specifying the struct's length.

This length could be used to determine whether a Y2038-incompatible or a Y2038-compatible struct timeval was passed, but there is no way to distinguish a struct timeval by size between a true 64-bit kernel and a 32-bit kernel with 64-bit time support. To avoid risks, these options should be handled the same way as SO_TIMESTAMP*, using new numbers for the flags.

NOTE: at the moment, GLIBC defines no SO_TIMESTAMP* macros at all. To introduce the time-size-insensitive versions discussed above, these would have to be introduced in bits/socket.h.

Y2038-incompatible IOCTLs

Some Linux IOCTLs may be Y2038-incompatible, or use types defined by glibc that do not match the kernel internal types. Known important cases are:

GLIBC itself uses some ioctls (which can be found by running git grep 'ioctl *(' -- "*.c" | sed -e 's/^.*ioctl *([^,]*, *\([^,)]*\)[,)].*$/\1/' on the GLIBC source repository, minus some false positives). None seem to be time-sensitive.

Y2038-compatible types and APIs

This section lists types and APIs which are time-related but are Y2038-compatible.

Time-related but Y2038-compatible types

struct rpc_timeval

struct tm

struct sysinfo

Time-related but Y2038-compatible APIs

APIs using struct rpc_timeval

int rtime(struct sockaddr_in *, struct rpc_timeval *, struct rpc_timeval *)

APIs using struct tm

char * asctime (const struct tm *brokentime)

char * asctime_r (const struct tm *brokentime, char *buffer)

size_t strftime (char *s, size_t size, const char *template, const struct tm *brokentime)

size_t wcsftime (wchar_t *s, size_t size, const wchar_t *template, const struct tm *brokentime)

char * strptime (const char *s, const char *fmt, struct tm *tp)

struct tm * getdate (const char *string)

int getdate_r (const char *string, struct tm *tp)

APIs using struct sysinfo

int sysinfo (struct sysinfo *info)

Others

unsigned int alarm (unsigned int seconds) -- Strict POSIX limits seconds to UINT_MAX

unsigned int sleep (unsigned int seconds) -- Strict POSIX limits seconds to UINT_MAX

clock_t clock(void)

struct tms { clock_t tms_utime; clock_t tms_stime; clock_t tms_cutime; clock_t tms_cstime; } 

clock_t times(struct tms *buffer)

struct timezone { int tz_minuteswest; int tz_dsttime; } 

rlim_t

int sched_rr_get_interval (pid_t pid, struct timespec *interval)

Internal changes

Some functions use time_t internally, and may thus be Y2038-incompatible and require changes in order to become Y2038-compatible. The following table lists these functions:

Y2038-incompatible internal functions

addgetnetgrentX(struct database_dyn *, int, int *, const char *, uid_t, struct hashentry *, struct datahead *, struct dataset **)

addgrbyX(struct database_dyn *, int, request_header *, union keytype, const char *, uid_t, struct hashentry *, struct datahead *)

addhstaiX(struct database_dyn *, int, request_header *, void *, uid_t, struct hashentry *const, struct datahead *)

addhstbyX(struct database_dyn *, int, request_header *, void *, uid_t, struct hashentry *, struct datahead *)

addinitgroupsX(struct database_dyn *, int, request_header *, void *, uid_t, struct hashentry *const, struct datahead *)

addinnetgrX(struct database_dyn *, int, int *, char *, uid_t, struct hashentry *, struct datahead *)

addpwbyX(struct database_dyn *, int, request_header *, union keytype, const char *, uid_t, struct hashentry *, struct datahead *)

addservbyX(struct database_dyn *, int, request_header *, char *, uid_t, struct hashentry *, struct datahead *)

bigtime_test(int)

cache_addgr(struct database_dyn *, int, request_header *, const void *, struct group *, uid_t, struct hashentry *const, struct datahead *, int)

cache_addhst(struct database_dyn *, int, request_header *, const void *, struct hostent *, uid_t, struct hashentry *const, struct datahead *, int, int32_t)

cache_addpw(struct database_dyn *, int, request_header *, const void *, struct passwd *, uid_t, struct hashentry *const, struct datahead *, int)

cache_addserv(struct database_dyn *, int, request_header *, const void *, struct servent *, uid_t, struct hashentry *const, struct datahead *, int)

ctime_r(const time_t *, char *)

dbg_log(const char *, ...)

__difftime(time_t, time_t)

do_notfound(struct database_dyn *, int, int *, const char *, struct dataset **, ssize_t *, time_t *, char **)

do_prepare(int, char **)

do_test()

evConsTime(struct timespec *, time_t, long)

first_shoot(const_nis_name, directory_obj *)

ftime(struct timeb *)

__getdate_r(const char *, struct tm *)

__get_nprocs()

getopt(int, char *const *, const char *)

__gettimeofday(struct timeval *, struct timezone *)

__gmtime_r(const time_t *, struct tm *)

guess_time_tm(long_int, long_int, int, int, int, const time_t *, const struct tm *)

help_filter(int, const char *, void *)

hunt(timezone_t, char *, time_t, time_t)

libc_freeres_ptr(time_t *)

__localtime_r(const time_t *, struct tm *)

main()

main(int, char **)

main_loop_poll()

mktime_internal(struct tm *, struct tm *(*)(const time_t *, struct tm *), time_t *)

mktime_test1(time_t)

mktime_test(time_t)

my_gmtime_r(time_t *, struct tm *)

my_localtime_r(const time_t *, struct tm *)

nscd_run_prune(void *)

__offtime(const time_t *, long, struct tm *)

quantize_timeval(struct timeval *)

ranged_convert(struct tm *(*)(const time_t *, struct tm *), time_t *, struct tm *)

restart()

restart_p(time_t)

show(timezone_t, char *, time_t, int)

__sleep(unsigned int)

spring_forward_gap()

start_threads()

__strptime_internal(const char *, const char *, struct tm *, void *)

subtract(time_t, time_t)

tformat()

thread_burn_any_cpu(void *)

time_t_add_ok(time_t, time_t)

time_t_avg(time_t, time_t)

time_t_int_add_ok(time_t, int)

__tzfile_compute(time_t, int, long *, int *, struct tm *)

__tzfile_read(const char *, int, char **)

utime(const char *, const struct utimbuf *)

verify_persistent_db(void *, struct database_pers_head *, int)

xcalloc(int, int)

ydhms_diff(long_int, long_int, int, int, int, int, int, int, int, int)

yeartot(intmax_t)

zdump_localtime_rz(timezone_t, time_t *, struct tm *)

Implementation

General principles

In order to avoid duplicating APIs for 32-bit and 64-bit time, glibc will provide either one but not both for a given application; the application code will have to choose between 32-bit and 64-bit time support, and the same set of symbols (e.g. time_t or clock_gettime) will be provided in both cases.

The following is proposed:

This allows user code to check, by verifying whether __USE_TIME_BITS64 is defined once glibc headers are #included, that they are using a glibc which actually supports 64-bit time (or claims to).

For instance, providing 64-bit time support for the time() function would result in the following:

Y2038-compatible types

This section lists examples of Y2038-compatible types with their rationale.

Y2038-compatible time_t

time_t is a 32-bit signed Epoch-related count of seconds, and is Y2038-incompatible because Y2038 is the maximum positive value of time_t.

Y2038-compatible APIs will need a type which can hold Epoch-related counts of seconds beyond Y2038; the most sensible choice for this type is 64-bit signed integer (denoted time64_t from now on), because:

Note that POSIX currently does not mandate any specific time representation for handling the Y2038 problem, and while it considers a 64-bit time_t approach, it also considers just mandating that a future date be representable, leaving implementers free to choose the details of said representation.

In order for applications to be written as independently of time size as possible, there need to be three 'time types' in GLIBC:

Y2038-compatible struct timespec

struct timespec contains two fields, a time_t called tv_sec and a long called tv_nsec

The time_t should be replaced by a time64_t in order for post-Y2038 dates to be represented.

The long type of tv_nsec is mandated by POSIX. Also, even for 64-bit time, the number of nanoseconds within a second does not need more than 32 bits of representation, so the field could remain unchanged.

On the other hand, on 64-bit systems, both fields are 64 bits (which makes sense as a long is 64 bits on these systems), and it would make sense to use the same physical layout for 32-bit systems handling 64-bit time.

Therefore a possible implementation would be

struct timespec64 {
  time64_t tv_sec;
  long     tv_nsec;
  long     padding;
};

for little-endian systems, and

struct timespec64 {
  time64_t tv_sec;
  long     padding;
  long     tv_nsec;
};

for big-endian systems.

Similar to time_t, the existing struct timespec could be renamed struct timespec32, and struct timespec would be defined at application compile time as either struct timespec32 or struct timespec64 depending on _TIME_BITS==64.

Y2038-compatible struct timeval

struct timeval contains two fields, a time_t called tv_sec and a suseconds_t called tv_usec

The time_t should be replaced by a time64_t in order for post-Y2038 dates to be represented.

The suseconds_t type of tv_usec is mandated by POSIX to be able to hold numbers in the range [-1, 1000000], but the actual size is left to implementors.

As there is no mandated size for suseconds_t, and as on 64-bit systems it is 64 bits, it would make sense to use 64 bits on 32-bit systems handling 64-bit time.

Therefore a possible implementation would be

struct timeval64 {
  time64_t tv_sec;
  int64_t  tv_usec;
};

for both little- and big-endian systems.

However, for the sake of maximum performance, tv_usec could be defined as a signed 32-bit field with a signed 32-bit padding, similar to struct timespec, in order to allow operations involving tv_usec to use 32-bit arithmetic rather than the slower 64-bit arithmetic; but then, glibc would have to sign-extend tv_usec into its padding in every struct timeval received from application code.

Similar to time_t and struct timespec, the existing struct timeval could be renamed struct timeval32, and struct timeval would be defined at application compile time as either struct timeval32 or struct timeval64 depending on _TIME_BITS==64.

Y2038-compatible APIs

This section lists examples of Y2038-compatible APIs with their rationale.

difftime()

double difftime(time_t time1, time_t time0) takes two date/time values and returns their difference.

From an API standpoint, it is Y2038-incompatible because:

There is therefore a need for a 64-bit counterpart to difftime. Argument types are obvious:

double difftime64(time64_t time1, time64_t time0)

However, the double result type might not hold enough mantissa bits for a 64-bit difference, which could cause imprecision of the result. In order to keep the function as precise as possible, its result would have to be extended:

long double difftime64(time64_t time1, time64_t time0)

But then, application code using this API would have to be written differently for 32-bit and 64-bit time cases.

Here, we will assume that difftime is used to compute time differences which will never be bigger than what a double can hold in its mantissa (an IEEE 754 double has a 53-bit significand, which is about 15 decimal digits; this allows encoding about 999 999 999 999 999 seconds, or roughly 31 million years).

We will thus choose this definition:

double difftime64(time64_t time1, time64_t time0)

As with types, the existing difftime would be renamed difftime32 and a difftime symbol would be declared at application compile time to be either difftime32 or difftime64 depending on the time bit-size default chosen at application compile-time.

Thus, since time_t would also be either time32_t or time64_t depending on _TIME_BITS==64, the whole difftime(time_t,time_t) would actually be difftime32(time32_t,time32_t) when _TIME_BITS is not equal to 64, and difftime64(time64_t,time64_t) when _TIME_BITS==64.

clock_gettime()

As far as the API is concerned, clock_gettime(clockid_t, struct timespec *) would be duplicated the same way as difftime, with a 32-bit-callable clock_gettime32(clockid_t, struct timespec32 *) and a 64-bit-callable clock_gettime64(clockid_t, struct timespec64 *), plus a clock_gettime symbol defined as either clock_gettime32 or clock_gettime64.

Also, as GLIBC should work on 32-bit-time as well as 64-bit-time kernels, both clock_gettime32 and clock_gettime64 would have to determine whether the kernel is Y2038-compatible and thus provides 64-bit clock_gettime/clock_settime syscalls, and try these first; only if they fail with ENOSYS would GLIBC try the 32-bit syscalls.

The implementations would also have to handle the cases where:

Issues

Build-time kernel/glibc incompatibilities

There is an open problem whereby incompatible kernel and glibc headers might be used at build time, e.g. new, 64-bit-time-enabled, kernel headers and old, 32-bit-time-only-capable, glibc headers.

In this case, there will be a mismatch between what the glibc uses and what the kernel expects; for instance, the kernel will expect a 64-bit time_t while glibc will use a 32-bit time_t.

The kernel cannot detect this at all, as kernel headers should not rely on glibc defines.

Older (pre-Y2038-support) glibcs cannot detect this either.

Such a detection will need to rest on an application-level mechanism.

Run-time kernel/glibc incompatibilities

Whatever time_t size the client code asks for, the glibc code will rely on the running kernel, which might provide only 32-bit time, or only 64-bit time, or both (providing neither is impossible).

Therefore, glibc Y2038-compatible functions must try and use 64-bit syscalls, which may fail with ENOSYS, in which case they should fall back to 32-bit syscalls, which may also fail with ENOSYS (but then the 64-bit syscall would have succeeded).

In either case, 64-bit time 'rvalues' may not fit in 32-bit time 'lvalues'. For instance, calling the 64-bit-time gettimeofday() on behalf of a 32-bit-time application will work until Y2038, at which point glibc will have to return EOVERFLOW. Conversely, a 64-bit application can call settimeofday() on a 32-bit-time kernel until Y2038, at which point glibc will return EOVERFLOW.

utmp types and APIs

The issue of utmp is distinct from the issue of time_t size, because utmp is essentially unrelated to the kernel, and essentially concerned with the structure of the utmp, wtmp and btmp files.

In order to support several glibcs running on the same file system, the utmp/wtmp/btmp file format must be defined independently from the architecture word size and endianness.

On the other hand, the utmp API must conform to POSIX, which means the argument sizes and endianness will vary from architecture to architecture.

In addition, there must be a way to transition from a system's current format to the future single format for the utmp, wtmp and btmp files.

The proposed solution would implement the following changes:

References

None: Y2038ProofnessDesign (last edited 2017-02-22 08:00:40 by AlbertAribaud)