Time Fixes Diff

Craig Howland howland@LGSInnovations.com
Mon Sep 11 18:53:00 GMT 2017

On 09/10/2017 12:27 AM, Brian Inglis wrote:
> On 2017-09-08 10:34, Brian Inglis wrote:
>> On 2017-09-08 09:00, Corinna Vinschen wrote:
>>> On Sep  8 08:46, Brian Inglis wrote:
>>>> On 2017-09-08 02:18, Corinna Vinschen wrote:
>>>>> On Sep  8 07:40, Freddie Chopin wrote:
>>>>>> On Thu, 2017-09-07 at 15:58 -0600, Brian Inglis wrote:
>>>>>>> The question is, going forward, should we use long for bigger than
>>>>>>> int values as usual, or use time_t instead, to consistently avoid
>>>>>>> long, in these time functions.
>>>>>> I think we should go with time_t, which will be 32-bit or 64-bit long,
>>>>>> depending on what was chosen during newlib compilation.
>>>>> While that's nice and all, two points to keep in mind:
>>>>> * 64 bit computations are costly on 32 bit (or even 16 bit) platforms and
>>>>>    should be avoided if overkill isn't your goal.
>>>>> * days, months, years, etc, are *all* of type int in struct tm, per
>>>>>    POSIX. Adding 64 bit ops just to fold stuff back into ints might only
>>>>>    make marginal sense.
>>>> Max year == INT_MAX + 1900 == 2^31-1 + 1900 == 2147485547 on 32 bit platforms,
>>> You wrote this before but it sounds like a pretty academic problem.
>> Check out the strftime.c regression test vectors before you conclude that.
>>>> if you want to avoid arbitrary limitations, and be as (Linux) compatible as
>>>> possible, which the existing library tries to do, but there are various issues
>>>> to be fixed. There are places where int, unsigned, long, and unsigned long are
>>>> used which may or may not work depending on time_t, int, and long sizes.
>>> Glibc compat is nice, and as far as Cygwin is concerned I'm not
>>> concerned for size.  However, newlib strives for small footprints
>>> given that many embedded targets are using it.  This should be kept
>>> in mind.
>>>> Is long bigger than int on all newlib platforms? What is bigger than int?
>>>> Using time_t as required makes temps and variables big enough to avoid overflow
>>>> for seconds, days, and years counts, without penalizing 32 bit time_t platforms.
>>> Rather than int, long, long long you can use explicit sized ints
>>> like int16_t, int32_t, uint64_t.
>> We need space for values bigger than int but only a few to a couple of
>> orders of magnitude smaller than time_t, not even half its width, so picking
>> types smaller than time_t does not seem to be worth the code?
>>>> For I/O conversion of the 64 bit time_t max year on 32 bit platforms, long long
>>>> is required; it may also be used with 32 bit time_t to avoid conditionals, and it
>>>> imposes no extra overhead on 64 bit platforms.
>>>> Everything depends on how many conditionals we want to use to minimize the data
>>>> and code space, and the time spent at runtime and in development.
>> I'm inclined to use common code and types that will work correctly on all
>> platforms *in the places where they are needed*, then worry about space and
>> time reductions if required?
>> I can post diffs for discussion before making any local commits or patches.
> Attached for discussion.
> Used local.h definitions, removed some local duplications, changed
> tests/compares of > ... - 1) to >= ..., changed declarations to time_t, added
> casts for large secs, days, years counts to avoid overflow with extreme
> values, and cast to long long for I/O (mainly snprintf).
> Not yet tested due to unrelated build issues on other files in latest git pull.
If you're going to spend the time (I agree with Corinna that it is a purely 
academic exercise, of no practical value), the code probably should be changed 
to use the int_least types to be friendliest to the smallest-int targets, and 
it needs additional work to avoid being wrong on those targets.  For example, 
era as presently sized strictly does not work in the doomsday view: it is only 
an int, but it can have a theoretical maximum value of about 730.7e6 (needs 30 
bits), so it won't work properly on a 16-bit target.

But beyond the theoretical size problems in the variables, there is another 
little-known and very real consideration that renders them moot: leap 
seconds.  gmtime() is defined as using UTC, to
which leap seconds are applied, so this is a real consideration, not just 
theoretical.  (If it were GPS time things would be different, but even with UT1 
there would be issues.)  Long-time timekeeping is actually very complicated, 
especially if you're trying to use seconds as the basis for the measurement.  
(Using days and the proleptic Gregorian calendar is an improvement of sorts over 
seconds, but still has problems.  If you've never heard of the proleptic 
Gregorian calendar, you're in the majority, but even the fact that the concept 
is defined underscores the problems with time reporting over longer timeframes.)

Have you done a calculation to show where the present code (without this 
proposed patch applied) first breaks down?

The test case checking for year 2e9 is clearly insane.  Given the 
unpredictable nature of leap seconds, there is no way to go that far ahead and 
get it right.  At the recent rate (27 leap seconds since 1972) there might be 
as many as 1.3e9 by the year 0x7FFFFFFF.  That's roughly 41 years' worth of 
leap seconds.  Even if the year were limited to 10000 (7983 years from now), 
that is potentially 4790 leap seconds.  (That's 1:19:50, which makes the 
seconds, minutes, hours, and possibly even the day of the week, wrong.)  And 
this ignores the rate at which Earth's rotation is slowing, which will 
increase the leap-second insertion rate as the years go on.

Having quantified this somewhat while forming this reply, I'm thinking that 
perhaps we should just leave the code the way it is and simply add a note to 
the man page giving roughly the year at which the code might break, and noting 
that going even that far introduces a leap-second-based error which is 
impossible to account for (either forwards a long time, or backwards past 
1972, when leap seconds started for UTC).  Making changes to fix the places 
where the calculations can overflow is a false help: while it addresses one 
issue, it does not address the larger issue that using the function over such 
a large span is an invalid use of it.  (A main reason for suggesting that it 
be left alone is to avoid burdening small-int targets with performance 
penalties for no real benefit.  If the changes turn out not to penalize them, 
a little cleanup then would not hurt.)

