Bug 11620 - Bad design of timezone conversions
Status: REOPENED
Alias: None
Product: glibc
Classification: Unclassified
Component: time
Version: unspecified
Importance: P2 enhancement
Target Milestone: ---
Assignee: Ulrich Drepper
Reported: 2010-05-22 07:57 UTC by Hadmut Danisch
Modified: 2018-10-15 23:59 UTC
fweimer: security-


Description Hadmut Danisch 2010-05-22 07:57:45 UTC
Hi, 

glibc has pretty good functions and timezone definitions to convert
unix/universal time to local time zones. 

But unfortunately the functions are based on the assumption that you always need
only one timezone at a time, your local time. glibc supports using only a single
time zone per program run, the one set in the TZ environment variable. 

This is a design from the pre-internet era. Nowadays we have to write programs like
webservers and other communication servers, which must deal with any timezone
requested and with several timezones simultaneously. 

You can see this lack of functionality in the fact that most programs
that let a user configure a time zone do not offer a choice such as
Europe/Berlin, but only fixed offsets like GMT+1 or GMT+2. These need to be
updated at every change between summer and winter time, or else simply give
wrong data, e.g. in a calendar. 

It would be nice and appropriate if there were functions to read any time zone
definition, given by name, into a variable, and conversion functions like
localtime and mktime that work with any time zone definition passed as a variable. 

Since these functions already exist and only have to be modified to use a
given parameter instead of a static variable, this should be easy to implement. 

regards
Comment 1 Ulrich Drepper 2010-05-22 13:49:01 UTC
Nobody is stopping you from designing your own interfaces and then putting them in
their own library.  There is absolutely no reason whatsoever that any such set
of interfaces have to be in the C library.
Comment 2 Maxim Egorushkin 2012-11-08 11:21:32 UTC
I would like to re-open this issue now, much has changed since the last comment.

glibc has a timezone database parser. It makes perfect sense to reuse that functionality and provide extra functions for converting times from any timezone to UTC and back in a useful and thread-safe manner. Many modern applications require this functionality.
Comment 3 Maxim Egorushkin 2012-11-08 23:57:21 UTC
I just wanted to expand on what I said earlier because I really believe this would be a major improvement in timezone handling.

Ulrich mentioned that "There is absolutely no reason whatsoever that any such set
of interfaces have to be in the C library.". It seems to me that for this argument to be valid there should not be examples to the contrary. 

There is a standard C library function gmtime(). There is no reverse function for it in the C standard. People often ask for the reverse of this function, and a quick Google search for "gmtime reverse" retrieves around 1.3e6 hits as of today. Yet glibc does provide a reverse of it, timegm(), which is more than just a useful function judging by the number of web search hits. 
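
For illustration, the gmtime()/timegm() pairing can be sketched as a round trip. This is a minimal sketch, not code from glibc; the helper name utc_round_trip is hypothetical, and it uses gmtime_r(), the reentrant variant of gmtime().

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes timegm(), a nonstandard extension, on glibc */
#endif
#include <time.h>

/* Hypothetical helper: break a time_t down into UTC fields with gmtime_r(),
   then reassemble it with timegm(). Returns (time_t)-1 on failure. */
static time_t utc_round_trip(time_t t)
{
    struct tm utc;

    if (gmtime_r(&t, &utc) == NULL)
        return (time_t)-1;
    return timegm(&utc);   /* reads the fields as UTC, regardless of TZ */
}
```

Calling utc_round_trip(t) should give back t for any representable time, which is the sense in which timegm() is the reverse of gmtime().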

This example appears to contradict Ulrich's statement that was used as a justification for closing this request. It feels like the reasoning for closing this ticket was less than perfect and I felt compelled to re-open it. 

I looked at the sources of glibc today and it looks like __tzfile_read() loads an entire Olson database into memory. This database allows conversion from any timezone to UTC and back. Glibc provides functions for converting back and forth between UTC and the current local timezone only. To complete the picture, there is a "solution" mentioned in the NOTES section of man timegm that involves changing the TZ environment variable to temporarily switch the local timezone to another one, forcing mktime() to convert a struct tm expressed in that other timezone into a time_t. This solution doesn't feel quite satisfactory and it is not thread-safe (unless one explicitly holds a mutex while changing the TZ environment variable everywhere in the code, which may be harder to achieve in the presence of 3rd-party libraries without source code).
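
The TZ-switching workaround looks roughly like the sketch below. It is an illustration only, not the code from the man page; the helper name mktime_in_zone is hypothetical, and the zone string is assumed to be anything TZ accepts (e.g. "Europe/Paris" or a POSIX string such as "EST5").

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* setenv(), unsetenv(), strdup(), tzset() on glibc */
#endif
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: interpret *tm as local time in zone tz and return the
   matching time_t. NOT thread-safe: it rewrites the process-wide TZ variable,
   so any other thread calling localtime() or mktime() in the meantime may
   see the wrong zone. */
static time_t mktime_in_zone(const char *tz, struct tm *tm)
{
    const char *old = getenv("TZ");
    char *saved = old ? strdup(old) : NULL;  /* copy: setenv may invalidate old */
    time_t result;

    if (old != NULL && saved == NULL)
        return (time_t)-1;                   /* out of memory */

    setenv("TZ", tz, 1);
    tzset();                                 /* reload zone data for the new TZ */
    result = mktime(tm);

    if (saved != NULL) {                     /* restore the previous zone */
        setenv("TZ", saved, 1);
        free(saved);
    } else {
        unsetenv("TZ");
    }
    tzset();
    return result;
}
```

Every call pays for two tzset() reloads, and correctness depends on no other thread touching time functions during the call, which is the weakness described above.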

To summarize, the Olson timezone database has always been in glibc's memory, but applications haven't been able to fully utilize it, at least not without that verbose and unreliable code, which also doesn't seem to be common knowledge.

I would propose adding a couple of functions to fill this gap:

    time_t time_tz_to_utc(<timezone> tz, struct tm* from);
    struct tm* time_utc_to_tz(<timezone> tz, time_t);

The first function would convert a broken-down time expressed in timezone tz into time_t (which is the number of seconds since UTC epoch).

The second function would do the opposite.

<timezone> should probably be a pointer to a structure, rather than a timezone string, e.g. "Europe/Paris", to avoid string look-ups on each call. E.g.:

    typedef struct __Timezone* timezone_t;
    timezone_t time_find_timezone(char const* olson_tz_name);
    void time_release_timezone(timezone_t);

I remember reading other proposals to evolve the C standard library time functions, but can't find them right now. They may provide better thought-out interfaces. But I think that for the majority of us this would be a major step forward.

-- Maxim
Comment 4 Maxim Egorushkin 2012-11-09 00:23:00 UTC
...

Because of the lack of this functionality in glibc there is a proliferation of different tzdata copies in Linux. Python's pytz comes with a copy, ICU with another one, to name a few.

Worst case scenario is that an application using one copy of tzdata talks to another application using another copy of tzdata using non-local timezone timestamps. tzdata mismatch can lead to unexpected behaviour, especially for recent dates if one library has upgraded to the latest tzdata when the other hasn't. (This problem, though, is inherent in web applications served by different hosts, which can't be reasonably expected to have tzdata in-sync, hence to mitigate the problem they must only talk UTC).

Different hosts aside, keeping just one copy of tzdata on a system also seems a sensible idea.

Okay, I will try to keep trolling down now...
Comment 5 Andreas Schwab 2012-11-09 00:46:36 UTC
> Yet, glibc does provide a reverse of it, timegm(),

Only because it was prior art as part of the Olson implementation.

> I looked at the sources of glibc today and it looks like __tzfile_read() loads
> an entire Olson database into memory.

No, it doesn't.  It loads a single time zone file.
Comment 6 Maxim Egorushkin 2012-11-09 10:34:24 UTC
(In reply to comment #5)

> > I looked at the sources of glibc today and it looks like __tzfile_read() loads
> > an entire Olson database into memory.
> 
> No, it doesn't.  It loads a single time zone file.

I stand corrected. 

Doesn't change the fact that it is capable of loading any timezone file on demand though.
Comment 7 David Lang 2013-09-20 11:01:05 UTC
I'll start by saying that I have not yet looked at the code

But it seems to me that logically this code is going to have to be something along the lines of 

gmtime(){
  get TZ variable
  load zone
  do conversion
}

Instead of all this being in one function, it should be pretty trivial to split it to be

gmtime(){
  get TZ variable (existing code)
  *zone load_zone(*char tz)
  convert_to_gmt(*zone zone, time)
}
load_zone(){
  (existing code)
}
convert_to_gmt(){
  (existing code)
}

in other words this seems like it should be a code restructuring into helper functions, and then exposing the helper functions to the application rather than being significant new code to be written.

This isn't as nice an interface as convert_tz(char *from_tz, char *to_tz, time), but it should be much easier to implement and require far less maintenance, since this should be existing code.
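
The shape of that split, with the zone passed explicitly instead of read from global state, might look like the toy sketch below. Everything here is hypothetical and is not glibc code: struct zone holds only a fixed UTC offset as a stand-in for the transition data a real loader would parse from a zone file, and load_zone takes that offset directly rather than a zone name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* timegm() and gmtime_r() on glibc */
#endif
#include <stdlib.h>
#include <time.h>

/* Hypothetical zone handle. A real one would carry the rules parsed from a
   zone file; this stand-in carries only a fixed offset. */
struct zone {
    long utc_offset;   /* seconds east of UTC */
};

/* Stand-in for "load zone": the caller owns the result and free()s it. */
static struct zone *load_zone(long utc_offset)
{
    struct zone *z = malloc(sizeof *z);
    if (z != NULL)
        z->utc_offset = utc_offset;
    return z;
}

/* time_t -> broken-down time in zone z. No global state is read or written. */
static struct tm *convert_from_utc(const struct zone *z, time_t t, struct tm *out)
{
    time_t shifted = t + z->utc_offset;
    return gmtime_r(&shifted, out);
}

/* Broken-down time in zone z -> time_t. Like mktime(), -1 signals an error. */
static time_t convert_to_utc(const struct zone *z, struct tm *in)
{
    time_t as_if_utc = timegm(in);
    if (as_if_utc == (time_t)-1)
        return as_if_utc;
    return as_if_utc - z->utc_offset;
}
```

Because each conversion takes its zone as a parameter, two threads can convert for different zones at the same time without touching TZ.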
Comment 8 Maxim Egorushkin 2014-11-25 10:26:49 UTC
The upstream tz project exposes the functions required for easy and thread-safe timezone conversions, see https://github.com/eggert/tz/blob/master/private.h#L409

    /*
    ** Define functions that are ABI compatible with NetBSD but have
    ** better prototypes.  NetBSD 6.1.4 defines a pointer type timezone_t
    ** and labors under the misconception that 'const timezone_t' is a
    ** pointer to a constant.  This use of 'const' is ineffective, so it
    ** is not done here.  What we call 'struct state' NetBSD calls
    ** 'struct __state', but this is a private name so it doesn't matter.
    */
    #if NETBSD_INSPIRED

    typedef struct state *timezone_t;
    struct tm *localtime_rz(timezone_t restrict, time_t const *restrict, struct tm *restrict);
    time_t mktime_z(timezone_t restrict, struct tm *restrict);
    timezone_t tzalloc(char const *);
    void tzfree(timezone_t);

    /* ... */
    #endif

Would it be possible to expose these four NETBSD_INSPIRED functions in glibc?

-- Maxim
Comment 9 David Ward 2016-12-13 22:34:03 UTC
(In reply to Maxim Yegorushkin from comment #8)
> Upstream tz project exposes the functions required for easy and thread-safe
> timezone conversions, see
> https://github.com/eggert/tz/blob/master/private.h#L409
> 
> Would it be possible to expose these NETBSD_INSPIRED 4 functions in glibc?

This now appears on the glibc master to-do list: https://sourceware.org/glibc/wiki/Development_Todo/Master#MT-Safe_and_TZ-aware_functions


I also would like to see these functions added to glibc, especially for thread safety reasons. The tz project implements these functions in a static library (libtz.a), but that is only intended to be used directly with the bundled utilities, as can be seen at the top of the header file referenced above:

    /*
    ** This header is for use ONLY with the time conversion code.
    ** There is no guarantee that it will remain unchanged,
    ** or that it will remain at all.
    ** Do NOT copy it to any system include directory.
    ** Thank you!
    */


The same functions are also exposed by gnulib, but unfortunately its implementation is not thread-safe: the localtime_rz and mktime_z functions are essentially wrappers around localtime_r and mktime, which modify the TZ environment variable before and after the call if needed.
Comment 10 eggert 2018-10-15 23:43:58 UTC
Also see coreutils bugs 9614, 11748, and 14229. Adding a tzdb-style API will improve coreutils by letting it diagnose invalid TZ values. See:

https://bugs.gnu.org/9614
https://bugs.gnu.org/11748
https://bugs.gnu.org/14229