[PATCH] update tzset tests

jdoubleu hi@jdoubleu.de
Tue May 17 08:45:11 GMT 2022


Sorry, here's the patch.


Cheers
---
🙎🏻‍♂️ jdoubleu
On 5/14/2022 4:39 PM, jdoubleu wrote:
>> To summarize, the following cases are errors:
>> 1. name is too short (less than 3 chars)
>> 2. name is too long (more than TZNAME_MAX)
>> 3. name includes arbitrary chars (not <>+-ALPHANUM)
>> In all of these error cases, the time should be set back to UTC, right?
> 
> Here's my patch which adds test cases for the above error cases.
> 
> It's based on the latest (pending) changes of tzset[1]. Please apply 
> them first.
> 
> I wasn't able to run the tests yet. With glibc they obviously fail.
> 
> Could you please test them?
> 
> 
> 
> [1]: https://sourceware.org/pipermail/newlib/2022/019581.html
> 
> 
> Cheers
> ---
> 🙎🏻‍♂️ jdoubleu
> On 4/14/2022 10:59 AM, jdoubleu wrote:
>> On 2022-04-13 14:33, Jeff Johnston wrote:
>>> Looking at the glibc tzset code I have locally (not latest/greatest, but
>>> does support angle brackets): 
>>
>> I can confirm the behavior with glibc[1]. As it turns out, glibc does 
>> not directly impose a character limit on the timezone name, but 
>> requires at least 3 characters. From the man page[2]:
>>
>>> The std string specifies an abbreviation for the timezone and must be
>>> three or more alphabetic characters.
>>
>> Contrary to what I had assumed, they don't even ignore the remaining 
>> characters, but keep all of them, as you can see in the output[1] and 
>> as Jeff Johnston explained.
>>
>>> but you imply that glibc in fact uses the equivalent of the scanf 
>>> "%m[...]" (malloc) modifier, and I think using that would be 
>>> against the newlib philosophy to keep things limited and under 
>>> control to support small targets.
>>
>> I agree, newlib SHOULD impose a limit. Especially, since the POSIX 
>> standard[3] already introduces an upper limit, though unspecified.
>>
>> The current limit is 11 characters, if I'm not mistaken. The longest 
>> name from the tzdb[4] is "<+1030>", i.e. 5 chars excluding the angle 
>> brackets (see all extracted names[5]). All others are usually 3 or 4 
>> chars long.
>>
>> That said, I think 11 is reasonably large enough.
>>
>> However, it could be helpful to make the limit available to user code, 
>> because there is no error reporting mechanism. Right now, the limit is 
>> only defined in tzset_r.c[6]. So maybe move it to limits.h? One thing 
>> not to forget here is to keep the limit in sync with the sscanf 
>> format's maximum field width[7].
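
One way to keep the limit and the sscanf field width in sync is to build
the format string from the same macro. This is only a rough sketch, not
what tzset_r.c does: SKETCH_TZNAME_MAX, SKETCH_STR and sketch_scan_name
are made-up names, and the value 10 is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical limit for this sketch; newlib defines its own in tzset_r.c. */
#define SKETCH_TZNAME_MAX 10

/* Two-step stringification so the macro is expanded before it is quoted. */
#define SKETCH_STR_(x) #x
#define SKETCH_STR(x)  SKETCH_STR_(x)

/* Copies at most SKETCH_TZNAME_MAX leading letters of tz into name.
   Returns the number of characters consumed, or -1 if no letter matched.
   The "A-Za-z" range in a scanset is implementation-defined in ISO C,
   but widely supported. */
static int sketch_scan_name(const char *tz, char name[SKETCH_TZNAME_MAX + 1])
{
    int n = 0;

    /* The format expands to "%10[A-Za-z]%n". */
    if (sscanf(tz, "%" SKETCH_STR(SKETCH_TZNAME_MAX) "[A-Za-z]%n", name, &n) != 1)
        return -1;

    return n;
}
```

Since the buffer size and the field width both come from the same macro,
changing the limit only needs one edit. The macro has to be a plain
integer literal for this to work (no parentheses or expressions).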
>>
>>
>> To summarize, the following cases are errors:
>> 1. name is too short (less than 3 chars)
>> 2. name is too long (more than TZNAME_MAX)
>> 3. name includes arbitrary chars (not <>+-ALPHANUM)
>> In all of these error cases, the time should be set back to UTC, right?
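
To make the three error cases above concrete, here is a minimal sketch of
a name check. It is not the newlib parser: sketch_tz_name_len and the
SKETCH_NAME_* limits are hypothetical, and I'm assuming that the limits
apply to the name without its angle brackets.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumed limits for this sketch only; angle brackets are not counted. */
#define SKETCH_NAME_MIN 3
#define SKETCH_NAME_MAX 10

/* Returns the number of characters the name occupies at the start of tz
   (including any angle brackets), or 0 if the name is invalid. */
static size_t sketch_tz_name_len(const char *tz)
{
    size_t len = 0;
    int quoted = (tz[0] == '<');
    const char *p = quoted ? tz + 1 : tz;

    if (quoted) {
        /* Quoted form: alphanumerics, '+' and '-' up to the closing '>'. */
        while (isalnum((unsigned char)p[len]) || p[len] == '+' || p[len] == '-')
            ++len;
        if (p[len] != '>')
            return 0;               /* invalid char or missing '>' */
    } else {
        /* Unquoted form: letters only. */
        while (isalpha((unsigned char)p[len]))
            ++len;
    }

    if (len < SKETCH_NAME_MIN || len > SKETCH_NAME_MAX)
        return 0;                   /* too short or too long */

    return quoted ? len + 2 : len;
}
```

A caller would fall back to UTC whenever this returns 0. Note that an
invalid character after a valid unquoted name (e.g. "NAME?2:10:56") is
not caught here; it would have to fail later when the offset is parsed.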
>>
>>
>> I'm going to prepare some test cases for the test suite to check for 
>> the errors as well.
>>
>>
>> [1]: https://godbolt.org/z/o93zo3qxv
>> [2]: https://www.man7.org/linux/man-pages/man3/tzset.3.html
>> [3]: 
>> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03 
>>
>> [4]: https://github.com/eggert/tz
>> [5]: 
>> https://raw.githubusercontent.com/nayarsystems/posix_tz_db/master/zones.csv 
>>
>> [6]: 
>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l13 
>>
>> [7]: 
>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57 
>>
>>
>>
>>
>> Cheers
>> ---
>> 🙎🏻‍♂️ jdoubleu
>> On 4/14/2022 12:19 AM, Brian Inglis wrote:
>>> On 2022-04-13 14:33, Jeff Johnston wrote:
>>>> Looking at the glibc tzset code I have locally (not latest/greatest, 
>>>> but
>>>> does support angle brackets):
>>>>
>>>> If there are any parse failures, UTC is defaulted.
>>>
>>> We currently leave the time zone info unchanged.
>>>
>>>> Extraneous characters inside brackets or less than 3 characters is a
>>>> parse failure.
>>> ✔ Check    ✔ Check
>>>
>>>> Glibc parses the tz name string char by char and allocates space for
>>>> the name strings so there is no max size.
>>>
>>> The suggestion was that glibc ignores the remaining characters, but 
>>> you imply that glibc in fact uses the equivalent of the scanf 
>>> "%m[...]" (malloc) modifier, and I think using that would be against 
>>> the newlib philosophy to keep things limited and under control to 
>>> support small targets.  Larger targets like Cygwin (do our own thing 
>>> including zoneinfo), and perhaps RTEMS, can supply their own 
>>> enhancements.
>>>
>>>> the name strings so there is no max size.  I am fine if you want to 
>>>> mandate a maximum, but if you do, then too many chars should be 
>>>> treated as a failure.  If you aren't certain of the limit, make the 
>>>> limit higher than you expect.
>>> Current limits are 3-10 allowing for e.g. <MESZ+03:30> which is the 
>>> most ever likely to be used. It might be reasonable to bump it up to 
>>> say 15.
>>>
>>>> If people run into max limit with reasonable timezone format 
>>>> strings, then
>>>> we can up the limit.
>>>
>>> The conditions are more or less what is implemented, but we could do 
>>> with a couple more tweaks to improve things, like check for more or 
>>> extraneous chars within the bracket quotes, and that no characters 
>>> remain unconsumed at the end of the parse.
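
The "no characters remain unconsumed" check mentioned above could be done
with sscanf's %n conversion. The following is only a sketch under my own
assumptions: sketch_parse_offset is a made-up helper that handles just an
unsigned "hh[:mm[:ss]]" offset, not a full TZ string.

```c
#include <stdio.h>

/* Parses "hh[:mm[:ss]]" and succeeds only if the whole string is consumed.
   Returns 0 and stores the offset in *seconds, or -1 on any parse error.
   Note: %u skips leading whitespace and accepts a sign; a stricter parser
   would reject those. */
static int sketch_parse_offset(const char *s, long *seconds)
{
    unsigned hh = 0, mm = 0, ss = 0;
    int n = 0;

    /* Each %n records how far the scan got after the preceding field. */
    int fields = sscanf(s, "%2u%n:%2u%n:%2u%n", &hh, &n, &mm, &n, &ss, &n);

    if (fields < 1)
        return -1;                  /* not even the hours were present */
    if (s[n] != '\0')
        return -1;                  /* unconsumed characters remain */
    if (hh > 24 || mm > 59 || ss > 59)
        return -1;                  /* out of range */

    *seconds = hh * 3600L + mm * 60L + ss;
    return 0;
}
```

With this, "5:30" is accepted, while something like "8:00:00:00" is
rejected because ":00" is left over after the third field.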
-------------- next part --------------
From 4423c43be6a730144b776ba4ec4204cf71b52348 Mon Sep 17 00:00:00 2001
From: jdoubleu <hi@jdoubleu.de>
Date: Sat, 14 May 2022 15:41:22 +0200
Subject: [PATCH] update tzset tests

Add test cases for parser errors after reworked parsing behavior.
---
 newlib/testsuite/newlib.time/tzset.c | 56 ++++++++++++++++++++++------
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/newlib/testsuite/newlib.time/tzset.c b/newlib/testsuite/newlib.time/tzset.c
index 0e5b196c6..8702b58db 100644
--- a/newlib/testsuite/newlib.time/tzset.c
+++ b/newlib/testsuite/newlib.time/tzset.c
@@ -111,13 +111,43 @@ struct tz_test test_timezones[] = {
     { /* Asia/Colombo */            "<+0530>-5:30",                    -IN_SECONDS(5, 30, 0),     NO_TIME},
     { /* Europe/Berlin */           "CET-1CEST,M3.5.0,M10.5.0/3",      -IN_SECONDS(1, 0, 0),     -IN_SECONDS(2, 0, 0)},
 
-    // END of list
-    {NULL, NO_TIME, NO_TIME}
+    /// test parsing errors
+    // 1. names are too long
+    {"JUSTEXCEEDI1:11:11",                                      0,   NO_TIME},
+    {"AVERYLONGNAMEWHICHEXCEEDSTZNAMEMAX2:22:22",               0,   NO_TIME},
+    {"FIRSTVERYLONGNAME3:33:33SECONDVERYLONGNAME4:44:44",       0,   0},
+    {"<JUSTEXCEEDI>5:55:55",                                    0,   NO_TIME},
+    {"<FIRSTVERYLONGNAME>3:33:33<SECONDVERYLONGNAME>4:44:44",   0,   0},
+    {"<+JUSTEXCEED>5:55:55",                                    0,   NO_TIME},
+
+    // 2. names are too short
+    {"JU6:34:47",               0,   NO_TIME},
+    {"HE6:34:47LO3:34:47",      0,   0},
+    {"<AB>2:34:47",             0,   NO_TIME},
+    {"<AB>2:34:47<CD>3:34:47",  0,   0},
+    
+    // 3. names contain invalid chars
+    {"N?ME2:10:56",     0,   NO_TIME},
+    {"N!ME2:10:56",     0,   NO_TIME},
+    {"N/ME2:10:56",     0,   NO_TIME},
+    {"N$ME2:10:56",     0,   NO_TIME},
+    {"NAME?2:10:56",    0,   NO_TIME},
+    {"?NAME2:10:56",    0,   NO_TIME},
+    {"NAME?UNK4:21:15",                 0,   NO_TIME},
+    {"NAME!UNK4:22:15NEXT/NAME4:23:15", 0,   NO_TIME},
+
+    // 4. bogus strings
+    {"NOINFO",          0,  NO_TIME},
+    {"HOUR:16:18",      0,  NO_TIME},
+    {"<BEGIN",          0,  NO_TIME},
+    {"<NEXT:55",        0,  NO_TIME},
+    {">WRONG<2:15:00",  0,  NO_TIME},
+    {"ST<ART4:30:00",   0,  NO_TIME},
+    //{"MANY8:00:00:00",  0,  NO_TIME},
+    {"\0",              0,  NO_TIME},
+    {"M\0STR7:30:36",   0,  NO_TIME}
 };
 
-// helper macros
-#define FOR_TIMEZONES(iter_name) for (struct tz_test* iter_name = test_timezones; iter_name->tzstr != NULL; ++iter_name)
-
 // END test vectors
 
 static int failed = 0;
@@ -136,22 +166,24 @@ void test_TimezoneStrings(void)
 {
     char buffer[128];
 
-    FOR_TIMEZONES(ptr)
+    for (size_t i = 0; i < sizeof(test_timezones) / sizeof(test_timezones[0]); ++i)
     {
-        setenv("TZ", ptr->tzstr, 1);
+        struct tz_test ptr = test_timezones[i];
+
+        setenv("TZ", ptr.tzstr, 1);
         tzset();
 
-        snprintf(buffer, 128, "winter time, timezone = \"%s\"", ptr->tzstr);
+        snprintf(buffer, 128, "winter time, timezone = \"%s\"", ptr.tzstr);
 
         struct tm winter_tm_copy = winter_tm; // copy
-        TEST_ASSERT_EQUAL_INT_MESSAGE(winter_time + ptr->offset_seconds, mktime(&winter_tm_copy), buffer);
+        TEST_ASSERT_EQUAL_INT_MESSAGE(winter_time + ptr.offset_seconds, mktime(&winter_tm_copy), buffer);
 
-        if (ptr->dst_offset_seconds != NO_TIME)
+        if (ptr.dst_offset_seconds != NO_TIME)
         {
-            snprintf(buffer, 128, "summer time, timezone = \"%s\"", ptr->tzstr);
+            snprintf(buffer, 128, "summer time, timezone = \"%s\"", ptr.tzstr);
 
             struct tm summer_tm_copy = summer_tm; // copy
-            TEST_ASSERT_EQUAL_INT_MESSAGE(summer_time + ptr->dst_offset_seconds, mktime(&summer_tm_copy), buffer);
+            TEST_ASSERT_EQUAL_INT_MESSAGE(summer_time + ptr.dst_offset_seconds, mktime(&summer_tm_copy), buffer);
         }
     }
 }
-- 
2.35.1


