[PATCH] add tests for tzset(3)

Brian Inglis Brian.Inglis@SystematicSw.ab.ca
Thu Apr 14 16:31:26 GMT 2022


I have still not heard where the requirement comes from to set 
UTC/GMT/etc., or to do anything other than leave everything as is.
Is this glibc behaviour, and why not use /etc/localtime or /etc/timezone?

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]


On 2022-04-14 02:59, jdoubleu wrote:
> On 2022-04-13 14:33, Jeff Johnston wrote:
>> Looking at the glibc tzset code I have locally (not latest/greatest, but
>> does support angle brackets): 
> 
> I can confirm the behavior with glibc[1]. As it turns out, glibc does 
> not directly impose a character limit on the timezone name, but requires 
> at least 3 characters. From the man page[2]:
> 
>> The std string specifies an abbreviation for the timezone and must be
>> three or more alphabetic characters.
> 
> Contrary to what I had understood, they don't even ignore the 
> remaining characters but keep all of them, as you can see in the 
> output[1] and as Jeff Johnston explained.
> 
>> but you imply that glibc in fact uses the equivalent of the scanf
>> "%m[...]" (malloc) modifier, and I think using that would be against
>> the newlib philosophy to keep things limited and under control to
>> support small targets.
> 
> I agree, newlib SHOULD impose a limit, especially since the POSIX 
> standard[3] already defines an upper limit, though its value is 
> unspecified.
> 
> The current limit is 11 characters, if I'm not mistaken. The longest 
> name from the tzdb[4] is "<+1030>", i.e. 5 chars between the angle 
> brackets (see all extracted names[5]). All others are usually 3 or 4 
> chars long.
> 
> That said, I think 11 is large enough.
> 
> However, it could be helpful to make the limit available to user code, 
> because no error reporting mechanism is used. Right now, the limit is 
> only defined in tzset_r.c[6]. So maybe move it to limits.h? One thing 
> not to forget here is to keep the limit in sync with the maximum field 
> width in the sscanf format[7].
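
One way to keep the two in sync is to build the field width from the macro 
itself. The following is only a sketch under assumptions: the values of 
TZNAME_MIN and TZNAME_MAX, the TZ_STR helper macros, the function 
scan_std_name and its scanset are all hypothetical illustrations and are 
not taken from tzset_r.c.

```c
#include <stdio.h>

/* Assumed values, mirroring the limits discussed in this thread. */
#define TZNAME_MIN 3
#define TZNAME_MAX 10

/* Two-level stringification, so TZNAME_MAX expands to 10 before quoting. */
#define TZ_STR_(x) #x
#define TZ_STR(x)  TZ_STR_(x)

/*
 * Hypothetical helper: copy the leading unquoted std name of tz into buf,
 * which must hold TZNAME_MAX + 1 bytes. Returns the number of characters
 * copied, or -1 if nothing matched or the name is shorter than TZNAME_MIN.
 * The field width in the format is generated from TZNAME_MAX.
 */
static int scan_std_name(const char *tz, char buf[TZNAME_MAX + 1])
{
    int n = 0;

    /* Expands to "%10[^0-9,+-]%n" with the assumed TZNAME_MAX. */
    if (sscanf(tz, "%" TZ_STR(TZNAME_MAX) "[^0-9,+-]%n", buf, &n) != 1)
        return -1;
    return n >= TZNAME_MIN ? n : -1;
}
```

Note that the field width only bounds the copy: a longer name is silently 
truncated to TZNAME_MAX characters here, so treating "too long" as an error 
would need an extra check on the character that follows.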
> 
> 
> To summarize, the following cases are errors:
> 1. name is too short (less than 3 chars)
> 2. name is too long (more than TZNAME_MAX)
> 3. name includes arbitrary chars (not <>+-ALPHANUM)
> In all of these error cases, the time should be set back to UTC, right?
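
The three cases above could be checked along these lines. This is a minimal 
sketch, not newlib code: check_tzname, the status enum and the limit values 
are hypothetical, the name is assumed to be already extracted with any angle 
brackets stripped, and the split between quoted names (alphanumerics plus 
'+' and '-') and unquoted names (alphabetic only) follows the POSIX wording.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TZNAME_MIN 3   /* assumed lower limit */
#define TZNAME_MAX 10  /* assumed upper limit */

enum tzname_status {
    TZNAME_OK,
    TZNAME_TOO_SHORT,
    TZNAME_TOO_LONG,
    TZNAME_BAD_CHAR
};

/*
 * Hypothetical checker for one already-extracted name.
 * quoted != 0 means the name came from inside <...>.
 */
static enum tzname_status check_tzname(const char *name, int quoted)
{
    size_t len = strlen(name);
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];

        if (quoted) {
            if (!isalnum(c) && c != '+' && c != '-')
                return TZNAME_BAD_CHAR;
        } else if (!isalpha(c)) {
            return TZNAME_BAD_CHAR;
        }
    }
    if (len < TZNAME_MIN)
        return TZNAME_TOO_SHORT;
    if (len > TZNAME_MAX)
        return TZNAME_TOO_LONG;
    return TZNAME_OK;
}
```

What the caller does on a non-OK status (reset to UTC or leave the current 
settings untouched) is deliberately left open in this sketch.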
> 
> 
> I'm going to prepare some test cases for the test suite to check for the 
> errors as well.
> 
> 
> [1]: https://godbolt.org/z/o93zo3qxv
> [2]: https://www.man7.org/linux/man-pages/man3/tzset.3.html
> [3]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
> [4]: https://github.com/eggert/tz
> [5]: https://raw.githubusercontent.com/nayarsystems/posix_tz_db/master/zones.csv
> [6]: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l13
> [7]: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57
> 
> 
> 
> Cheers
> ---
> 🙎🏻‍♂️ jdoubleu
> On 4/14/2022 12:19 AM, Brian Inglis wrote:
>> On 2022-04-13 14:33, Jeff Johnston wrote:
>>> Looking at the glibc tzset code I have locally (not latest/greatest, but
>>> does support angle brackets):
>>>
>>> If there are any parse failures, UTC is defaulted.
>>
>> We currently leave the time zone info unchanged.
>>
>>> Extraneous characters inside brackets, or fewer than 3 characters, are
>>> a parse failure.
>> ✔ Check    ✔ Check
>>
>>> Glibc parses the tz name string char by char and allocates space for
>>> the name strings so there is no max size.
>>
>> The suggestion was that glibc ignores the remaining characters, but 
>> you imply that glibc in fact uses the equivalent of the scanf 
>> "%m[...]" (malloc) modifier, and I think using that would be against 
>> the newlib philosophy to keep things limited and under control to 
>> support small targets.  Larger targets like Cygwin (do our own thing 
>> including zoneinfo), and perhaps RTEMS, can supply their own 
>> enhancements.
>>
>>> I am fine if you want to mandate a maximum, but if you do, then too 
>>> many chars should be treated as a failure.  If you aren't certain of 
>>> the limit, make the limit higher than you expect.
>> Current limits are 3-10 allowing for e.g. <MESZ+03:30> which is the 
>> most ever likely to be used. It might be reasonable to bump it up to 
>> say 15.
>>
>>> If people run into max limit with reasonable timezone format strings,
>>> then we can up the limit.
>>
>> The conditions are more or less what is implemented, but we could do 
>> with a couple more tweaks to improve things, like check for more or 
>> extraneous chars within the bracket quotes, and that no characters 
>> remain unconsumed at the end of the parse.
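
Both tweaks can be illustrated with sscanf and "%n". The sketch below is 
an assumption-laden example rather than the newlib parser: parse_simple_tz 
is a hypothetical function, it handles only a quoted name followed by a 
whole-hour offset (not the full POSIX TZ grammar), and the 3..10 limits are 
the ones mentioned in this thread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse "<name>hours", e.g. "<+03>-3".
 * name must hold 11 bytes. Returns 0 on success, -1 on any parse failure.
 */
static int parse_simple_tz(const char *tz, char name[11], int *hours)
{
    int n = 0;

    /*
     * The literal '>' must follow the scanset directly. An extraneous
     * character inside the brackets, or an 11th name character, stops the
     * match there, so fewer than 2 items are converted.
     */
    if (sscanf(tz, "<%10[-+0-9A-Za-z]>%d%n", name, hours, &n) != 2)
        return -1;

    /* Names shorter than 3 characters are rejected. */
    if (strlen(name) < 3)
        return -1;

    /* n is the number of characters consumed; nothing may remain. */
    return tz[n] == '\0' ? 0 : -1;
}
```

Because the closing bracket is part of the format, an over-long name fails 
the match instead of being truncated, and the final check on tz[n] rejects 
trailing junk.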

