This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Time Fixes Diff


On 09/10/2017 12:27 AM, Brian Inglis wrote:
On 2017-09-08 10:34, Brian Inglis wrote:
On 2017-09-08 09:00, Corinna Vinschen wrote:
On Sep  8 08:46, Brian Inglis wrote:
On 2017-09-08 02:18, Corinna Vinschen wrote:
On Sep  8 07:40, Freddie Chopin wrote:
On Thu, 2017-09-07 at 15:58 -0600, Brian Inglis wrote:
The question is, going forward, should we use long for bigger than
int values as usual, or use time_t instead, to consistently avoid
long, in these time functions.
I think we should go with time_t, which will be 32-bit or 64-bit long,
depending on what was chosen during newlib compilation.
While that's nice and all, two points to keep in mind:

* 64 bit computations are costly on 32 bit (or even 16 bit) platforms and
   should be avoided if overkill isn't your goal.

* days, months, years, etc, are *all* of type int in struct tm, per
   POSIX. Adding 64 bit ops just to fold stuff back into ints might only
   make marginal sense.
Max year == MAX_INT + 1900 == 2^31-1 + 1900 == 2147485547 on 32 bit platforms,
You wrote this before but it sounds like a pretty academic problem.
Check out the strftime.c regression test vectors before you conclude that.

if you want to avoid arbitrary limitations, and be as (Linux) compatible as
possible, which the existing library tries to do, but there are various issues
to be fixed. There are places where int, unsigned, long, and unsigned long are
used which may or may not work depending on time_t, int, and long sizes.
Glibc compat is nice, and as far as Cygwin is concerned I'm not
concerned for size.  However, newlib strives to small footprints
given that many embedded targets are using it.  This should be kept
in mind.

Is long bigger than int on all newlib platforms? What is bigger than int?

Using time_t as required makes temps and variables big enough to avoid overflow
for seconds, days, and years counts, without penalizing 32 bit time_t platforms.
Rather than int, long, long long you can use explicit sized ints
like int16_t, int32_t, uint64_t.
We need space for things that are bigger than ints up to a few to a couple
orders of magnitude smaller than time_t, not even half the length, so picking
types smaller than time_t does not seem to be worth the code?

For I/O conversion of 64 bit time_t max year on 32 bit platforms long long is
required, it may also be used with 32 bit time_t to avoid conditionals, and it
imposes no extra overhead on 64 bit platforms.

Everything depends on how many conditionals we want to use to minimize the data
and code space and time spent at runtime and for development.
I'm inclined to use common code and types that will work correctly on all
platforms *in places where they are needed* to get things working correctly,
then we can worry about space and time reductions if required?

I can post diffs for discussion before making any local commits or patches.
Attached for discussion.

Used local.h definitions, removed some local duplications, changed
tests/compares of > ... - 1) to >= ..., changed declarations to time_t and added
casts for large secs, days, years counts to avoid overflow with extreme values,
cast to long long for I/O mainly snprintf.

Not yet tested due to unrelated build issues on other files in latest git pull.

If you're going to bother to spend the time (I agree with Corinna that it is a purely academic exercise, of no practical value), it probably should be changed to use int_least types to be friendliest to the smallest-int targets, and it must have additional work to not be wrong on small-int targets.  For example, era as it is presently sized strictly does not work in the doomsday view, as it is only an int but can have a maximum theoretical value of about 730.7e6 (needs 30 bits)--it won't work properly for a 16-bit target.

But in addition to the theoretical size problems in the variables, there is another little-known and very real consideration that really renders those problems moot, and that is leap seconds. gmtime() is defined as using UTC, to which leap seconds are applied, so this is a real consideration, not just theoretical.  (If it were GPS time things would be different, but even with UT1 there would be issues.)  Long-time timekeeping is actually very complicated, especially if you're trying to use seconds as the basis for the measurement.  (Using days and the proleptic Gregorian calendar is an improvement of sorts over seconds, but still has problems.  If you've never heard of the proleptic Gregorian calendar, you're in the majority, but even the fact that the concept is defined underscores the problems with time reporting over longer timeframes.)

Have you done a calculation to show where the present code (without this proposed patch applied) first breaks down?

The test case checking for year 2e9 is clearly insane.  With the unpredictable nature of leap seconds, there is no way to go that far ahead and get it right.  At the recent rate--27 leap seconds since 1972--there might be as many as 1.3e9 by the year 0x7FFFFFFF. That's potentially about 41 years worth of leap seconds.  Even if the year were limited to 10000 (7983 years from now), it is potentially 4790 leap seconds.  (That's 1:20:50, which makes seconds, minutes, hours, and possibly even the day of the week, wrong.)  This is even ignoring the rate at which earth's rotation is slowing, which serves to increase the leap-second insertion rate as the years go on.

As I've quantified this somewhat in forming this reply, I'm thinking that perhaps we should just leave the code the way it is and simply add a note to the man page which points out roughly the year at which the code might break but that going even that far is introducing a leap-second-based error which is impossible to account for (either forwards a long time, or backwards past 1972 when leap seconds start for UTC).  Making changes to fix the places where the calculations can overflow is a false help, as while it addresses one issue, it does not address the larger issue that trying to use the function over such a larger time is an invalid use of it.  (A main reason for suggesting that it be left alone is to avoid penalizing small-int targets with performance penalties for a non-real help. If changes can turn out to not penalize it, cleaning up a little then would not hurt.)

Craig


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]