This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v4] [BZ #16141] strptime %z: fix rounding, extend range to +/-9959
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: James Perkins <james at loowit dot net>, GNU libc <libc-alpha at sourceware dot org>, Mike Frysinger <vapier at gentoo dot org>, Paul Eggert <eggert at cs dot ucla dot edu>, Will Newton <will dot newton at linaro dot org>
- Date: Thu, 13 Aug 2015 12:45:22 -0400
- Subject: Re: [PATCH v4] [BZ #16141] strptime %z: fix rounding, extend range to +/-9959
- Authentication-results: sourceware.org; auth=none
- References: <1439339237-22204-1-git-send-email-james at loowit dot net>
On 08/11/2015 08:27 PM, James Perkins wrote:
> Fixes [BZ #16141] strptime %z offset restriction.
>
> Topic: strptime supports a %z input field descriptor, which parses a
> time zone offset from UTC time into the broken-out time field tm_gmtoff.
Overall the patch looks great. The high-level idea is correct.
The architecture of the changes is almost right, I think though that we
need a more thorough test case that covers every value in +HHMM to -HHMM
(skipping the invalid ones) to make sure we never ever break this again
when someone tries to optimize the integer math.
After we fix the testing we can do a thorough polish of any final details.
> Problems:
>
> 1) In the current implementation, the minutes portion calculation is
> correct only for minutes evenly divisible by 3. This is because the
> minutes value is converted to decimal time, but inadequate precision
> leads to rounding which calculates results that are too low for
> some values.
Agreed, the 50/30 value is not sufficiently accurate.
> For example, due to rounding, a +1159 offset string results in an
> incorrect tm_gmtoff of 43128 (== 11 * 3600 + 58.8 * 60) seconds,
> instead of 43140 (== 11 * 3600 + 59 * 60) seconds. In contrast,
> a +1157 offset (minutes divisible by 3) does not cause the bug,
> and results in a correct tm_gmtoff of 43020.
Agreed.
> 2) strptime's %z specifier will not parse time offsets less than
> -1200 or greater than +1200, or if only hour digits are present, less
> than -12 or greater than +12. It will return NULL for offsets outside
> that range. These limits do not meet historical and modern use cases:
>
> * Present day exceeds the +1200 limit:
> - Pacific/Auckland (New Zealand) summer time is +1300.
> - Pacific/Kiritimati (Christmas Island) is +1400.
> - Pacific/Apia (Samoa) summer time is +1400.
Agreed.
> * Historical offsets exceeded +1500/-1500.
Agreed.
> * POSIX supports -2459 to +2559.
Agreed, but note that POSIX has no %z support in strptime, and this
is a glibc extension to support %z in strptime (like it does in strftime).
> * Offsets up to +/-9959 may occasionally be useful.
Agreed.
> * Paul Eggert's notes provide additional detail:
> - https://sourceware.org/ml/libc-alpha/2014-12/msg00068.html
> - https://sourceware.org/ml/libc-alpha/2014-12/msg00072.html
OK.
>
> 3) tst-strptime2, part of the 'make check' test suite, does not test
> for the above problems.
As mentioned above, this needs to be more thorough.
> Corrective actions:
>
> 1) In time/strptime_l.c, calculate the offset from the hour and
> minute portions directly, without the rounding errors introduced by
> decimal time.
OK.
> 2) Remove the +/-1200 range limit, permitting strptime to parse offsets
> from -9959 through +9959.
OK.
> 3) Add zone offset values to time/tst-strptime2.c.
>
> * Test minutes evenly divisible by three (+1157) and not evenly
> divisible by three (+1158 and +1159).
> * Test offsets near the old and new range limits (-1201, -1330, -2459,
> -2500, -99, -9959, +1201, +1330, +1400, +1401, +2559, +2600, +99,
> and +9959)
Not enough IMO. We might break this again in the future. I'd like to see
a thorough iteration of the values.
> The revised strptime passes all old and new tst-strptime2 tests.
Good thanks for clarifying that.
> 2015-08-11 James Perkins <james@loowit.net>
>
> [BZ #16141]
> * time/strptime_l.c (__strptime_internal): Fix %z minutes
> calculation, removing incorrect decimal time rounding, so that
> all minute values result in a valid seconds value.
> * time/strptime_l.c (__strptime_internal): Extend %z time zone
> offset range limits to UTC-99:59 through UTC+99:59 to parse
> current and historical use cases.
> * time/strptime2.c (tests): Modify and add tests for the
> strptime %z input field descriptor, specifically conversion of
> minutes to seconds and validating an offset range of -9959 to
> +9959.
>
> Signed-off-by: James Perkins <james@loowit.net>
> ---
> time/strptime_l.c | 12 +++---------
> time/tst-strptime2.c | 21 +++++++++++++++++++--
> 2 files changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/time/strptime_l.c b/time/strptime_l.c
> index 5640cce..4203ad8 100644
> --- a/time/strptime_l.c
> +++ b/time/strptime_l.c
> @@ -770,16 +770,10 @@ __strptime_internal (rp, fmt, tmp, statep LOCALE_PARAM)
> else if (n != 4)
> /* Only two or four digits recognized. */
> return NULL;
> - else
> - {
> - /* We have to convert the minutes into decimal. */
> - if (val % 100 >= 60)
> - return NULL;
> - val = (val / 100) * 100 + ((val % 100) * 50) / 30;
OK, I agree that 50/30 does result in a rounding error.
> - }
> - if (val > 1200)
> + else if (val % 100 >= 60)
> + /* Minutes valid range is 0 through 59. */
> return NULL;
OK.
> - tm->tm_gmtoff = (val * 3600) / 100;
> + tm->tm_gmtoff = (val / 100) * 3600 + (val % 100) * 60;
OK.
I created a small test case to isolate this code and validated it for
all values that were valid and your change matches the expected values
computed using double precision.
> if (neg)
> tm->tm_gmtoff = -tm->tm_gmtoff;
> }
> diff --git a/time/tst-strptime2.c b/time/tst-strptime2.c
> index a08e6d7..4db8321 100644
> --- a/time/tst-strptime2.c
> +++ b/time/tst-strptime2.c
> @@ -17,8 +17,25 @@ static const struct
> { "1113472456 -1030", -37800 },
> { "1113472456 +0030", 1800 },
> { "1113472456 -0030", -1800 },
> - { "1113472456 -1330", LONG_MAX },
> - { "1113472456 +1330", LONG_MAX },
> + { "1113472456 +1157", 43020 },
> + { "1113472456 +1158", 43080 },
> + { "1113472456 +1159", 43140 },
> + { "1113472456 +1200", 43200 },
> + { "1113472456 -1200", -43200 },
> + { "1113472456 +1201", 43260 },
> + { "1113472456 -1201", -43260 },
> + { "1113472456 +1330", 48600 },
> + { "1113472456 -1330", -48600 },
> + { "1113472456 +1400", 50400 },
> + { "1113472456 +1401", 50460 },
> + { "1113472456 -2459", -89940 },
> + { "1113472456 -2500", -90000 },
> + { "1113472456 +2559", 93540 },
> + { "1113472456 +2600", 93600 },
> + { "1113472456 -99", -356400 },
> + { "1113472456 +99", 356400 },
> + { "1113472456 -9959", -359940 },
> + { "1113472456 +9959", 359940 },
> { "1113472456 -1060", LONG_MAX },
> { "1113472456 +1060", LONG_MAX },
> { "1113472456 1030", LONG_MAX },
This test needs to be extended to cover all somehow. I don't want to
come back to this in 5 years when it breaks again.
Is that asking too much?
Cheers,
Carlos.