[PATCH] add tests for tzset(3)
Brian Inglis
Brian.Inglis@SystematicSw.ab.ca
Thu Apr 14 16:31:26 GMT 2022
I have still not heard where the requirement originates to set
UTC/GMT/etc. rather than simply leaving everything as is.
Is this glibc behaviour, and why not /etc/localtime or /etc/timezone?
--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
On 2022-04-14 02:59, jdoubleu wrote:
> On 2022-04-13 14:33, Jeff Johnston wrote:
>> Looking at the glibc tzset code I have locally (not latest/greatest, but
>> does support angle brackets):
>
> I can confirm the behavior with glibc[1]. As it turns out, glibc does
> not directly impose a character limit on the timezone name, but requires
> at least 3 characters. From the man page[2]:
>
>> The std string specifies an abbreviation for the timezone and must be
>> three or more alphabetic characters.
>
> Contrary to what I had understood, they don't even ignore the remaining
> characters, but keep all of them, as you can see in the output[1] and as
> Jeff Johnston explained.
>
>> but you imply that glibc in fact uses the equivalent of the scanf
>> "%m[...]" (malloc) modifier, and I think using that would be against
>> the newlib philosophy to keep things limited and under control to
>> support small targets.
>
> I agree, newlib SHOULD impose a limit. Especially, since the POSIX
> standard[3] already introduces an upper limit, though unspecified.
>
> The current limit is 11 characters, if I'm not mistaken. The longest
> name from the tzdb[4] is "<+1030>" i.e. 5 chars (see all extracted
> names[5]). All others usually are 3 or 4 chars long.
>
> That said, I think 11 is large enough.
>
> However, it could be helpful to be able to query the limit from user
> code, because there is no error reporting mechanism. Right now, the
> limit is only defined in tzset_r.c[6]. So maybe move it to limits.h?
> One thing not to forget here is to keep the limit in sync with the
> sscanf format's maximum field width[7].
>
>
> To summarize, the following cases are errors:
> 1. name is too short (less than 3 chars)
> 2. name is too long (more than TZNAME_MAX)
> 3. name includes arbitrary chars (not <>+-ALPHANUM)
> In all of these error cases, the time should be set back to UTC, right?
>
>
> I'm going to prepare some test cases for the test suite to check for the
> errors as well.
>
>
> [1]: https://godbolt.org/z/o93zo3qxv
> [2]: https://www.man7.org/linux/man-pages/man3/tzset.3.html
> [3]:
> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
>
> [4]: https://github.com/eggert/tz
> [5]:
> https://raw.githubusercontent.com/nayarsystems/posix_tz_db/master/zones.csv
> [6]:
> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l13
>
> [7]:
> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57
>
>
>
>
> Cheers
> ---
> 🙎🏻♂️ jdoubleu
> On 4/14/2022 12:19 AM, Brian Inglis wrote:
>> On 2022-04-13 14:33, Jeff Johnston wrote:
>>> Looking at the glibc tzset code I have locally (not latest/greatest, but
>>> does support angle brackets):
>>>
>>> If there are any parse failures, UTC is defaulted.
>>
>> We currently leave the time zone info unchanged.
>>
>>> Extraneous characters inside brackets, or fewer than 3 characters,
>>> is a parse failure.
>> ✔ Check ✔ Check
>>
>>> Glibc parses the tz name string char by char and allocates space for
>>> the name strings so there is no max size.
>>
>> The suggestion was that glibc ignores the remaining characters, but
>> you imply that glibc in fact uses the equivalent of the scanf
>> "%m[...]" (malloc) modifier, and I think using that would be against
>> the newlib philosophy to keep things limited and under control to
>> support small targets. Larger targets like Cygwin (where we do our
>> own thing, including zoneinfo), and perhaps RTEMS, can supply their
>> own enhancements.
>>
>>> the name strings so there is no max size. I am fine if you want to
>>> mandate a maximum, but if you do, then too many chars should be
>>> treated as a failure. If you aren't certain of the limit, make the
>>> limit higher than you expect.
>> Current limits are 3-10, allowing for e.g. <MESZ+03:30>, which is
>> about the longest ever likely to be used. It might be reasonable to
>> bump the upper limit to, say, 15.
>>
>>> If people run into max limit with reasonable timezone format strings,
>>> then we can up the limit.
>>
>> The conditions are more or less what is implemented, but we could do
>> with a couple more tweaks to improve things, like check for more or
>> extraneous chars within the bracket quotes, and that no characters
>> remain unconsumed at the end of the parse.
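
The name checks discussed in this thread (minimum length, maximum length,
allowed characters inside and outside the angle brackets) could be sketched
roughly as below. This is only an illustration, not the newlib code: the
function name tz_abbr_len and the macros TZ_ABBR_MIN / TZ_ABBR_MAX are
hypothetical, the limit of 10 is taken from the "3-10" figure above, and the
length is assumed to be counted without the angle brackets. It validates the
abbreviation only, not the offset or rule parts that follow it.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical limits for illustration only; newlib keeps its own
   limit in tzset_r.c. */
#define TZ_ABBR_MIN 3
#define TZ_ABBR_MAX 10

/*
 * Return the number of bytes at the start of `tz` that form a valid
 * std/dst abbreviation (including the angle brackets when quoted),
 * or 0 if the abbreviation is too short, too long, or malformed.
 */
size_t tz_abbr_len(const char *tz)
{
    size_t n = 0;

    if (tz[0] == '<') {
        /* Quoted form: alphanumerics, '+' and '-' between '<' and '>'. */
        const char *p = tz + 1;

        while (isalnum((unsigned char)p[n]) || p[n] == '+' || p[n] == '-')
            n++;
        if (p[n] != '>')
            return 0;           /* extraneous char or missing '>' */
        if (n < TZ_ABBR_MIN || n > TZ_ABBR_MAX)
            return 0;           /* assumption: length excludes brackets */
        return n + 2;
    }

    /* Unquoted form: alphabetic characters only. */
    while (isalpha((unsigned char)tz[n]))
        n++;
    if (n < TZ_ABBR_MIN || n > TZ_ABBR_MAX)
        return 0;
    return n;
}
```

A caller would treat a return of 0 as a parse failure, and would continue
parsing the offset at tz + len otherwise. What to do on failure (fall back
to UTC, or leave the current settings unchanged) is the open question in
this thread and is deliberately left out of the sketch.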