Sources Bugzilla – Bug 4772
strptime() doesn't support strftime()'s flags
Last modified: 2007-07-11 09:02:58 UTC
Tomasz Kępczyński has found a bug in strptime(): its format specification doesn't support strftime()'s flags. Steps to reproduce: $ cat foo.c #define _XOPEN_SOURCE #include <stdio.h> #include <time.h> int main(void) { struct tm tm; printf("%s\n", strptime("1", "%-d", &tm) ? "Success!" : "Woe is me!"); return 0; } $ gcc foo.c && ./a.out Woe is me! $ As a result, gnucash misparses dates in, at least, Polish and Czech locale (see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=243513). The attached patch against cvs head causes strptime() to correctly parse and interpret (i.e. ignore) flags and field width.
Created attachment 1915 [details] flags and field width support for strptime()
There is no requirement at all that strptime understands the same formats as strftime. In fact, this will never be the case. Unless you can show that any extension is absolutely necessary to parse a *reasonable* date string I won't change anything.
As I wrote in https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=243513 I am not convinced that this is a bug in strptime. But please note, that strftime(..., nl_langinfo(D_FMT), ...) will print something, according to format we define in a locale. My opnion is that we should be able to parse this string back to internal representation using strptime (or maybe other function) _AND_ data defined in locale definition file. The first attempt is to use: strptime(..., nl_langinfo(D_FMT),...) and this fails miserably for pl_PL.UTF8 in F7. So we either need: 1. fix in strptime 2. some oher format data in locale definition equivalent to D_FMT which will work for strptime 3. I am open to any suggestion here. Please note, that strptime with %x format specifier also fails.
> There is no requirement at all that strptime understands the same formats as > strftime. In fact, this will never be the case. The strptime(3) manual page says: "For reasons of symmetry, glibc tries to support for strptime() the same format characters as for strftime()." I find it quite reasonable. The patch is trivial and solves a real-world problem.
The patch is not reasonable. Ignoring the modifiers can give the wrong impression and there might be situations where it makes a difference. There never ever has been a promise that the locale formats are usable in strptime. And whatever the man page says couldn't possibly more irrelevant. The man pages are not written by the people who write the code. The info pages clearly state The only difference is that the flags `_', `-', `0', and `^' are not allowed. Using strptime to parse anything by untranslated strings is a gamble anyway. If you want to do this then prepare the strings yourself. Get the locale format, strip out the flags, and then pass it on to strptime. I have not seen any argument why this should be the problem of the implementation which at that point would expose itself to additional problems.
> The patch is not reasonable. Ignoring the modifiers can give the wrong > impression and there might be situations where it makes a difference. Ignoring these modifiers matches existing behaviour. The introduced modifiers control padding, case, and field width, all of which are already ignored by strptime(). > There never ever has been a promise that the locale formats are usable in > strptime. Such a correspondence has, however, clearly been the intent. SUSv3 explicitly says: "Several "equivalent to" formats and the special processing of white-space characters are provided in order to ease the use of identical format strings for strftime() and strptime()." Since the standard format specification of strptime() closely matches the standard format specification of strftime(), it is not, in my opinion, unreasonable to expect the same from glibc's extensions to the format. Introducing a functionality that works _usually_, but not always, is a trap for unwary programmer. It is, of course, easy for gnucash to workaround this particular limitation, but the real problem is that other programs are likely to fall victim of it as well.
I see you are unconvinced, let the code speak: ---- #include <clocale> #include <ctime> #include <iostream> #include <langinfo.h> int main() { time_t t; tm t1, t2; char *p, buf[128]; setlocale(LC_ALL, "pl_PL.UTF8"); t = time(NULL); t1 = *localtime(&t); std::cout << "nl_langinfo(D_FMT):" << nl_langinfo(D_FMT) << std::endl; strftime(buf, sizeof(buf), nl_langinfo(D_FMT), &t1); std::cout << "strftime (1): " << buf << std::endl; p = strptime(buf, nl_langinfo(D_FMT), &t2); strftime(buf, sizeof(buf), nl_langinfo(D_FMT), &t2); std::cout << "strftime (2): " << buf << std::endl; return(0); } ---- Results from FC6: nl_langinfo(D_FMT):%Y-%m-%d strftime (1): 2007-07-11 strftime (2): 2007-07-11 Results from F7: nl_langinfo(D_FMT):%-d %b %Y strftime (1): 11 VII 2007 strftime (2): 0 II 1953 My question is: what have I done wrong? Now, if the answer is nothing, then fix the library. If the answer is: you haven't removed flags characters from format string, then my answer is: fix the library. Reasoning: As in #6. If I have to work around something in a library then what happens if someone decides thar strftime needs to support flag, let's say '+'? I have to change the code and in that case I can as well write my own fuction. This puts us in a position where I can as well write my own library and renders strptime completly useless and in a posistion where I have as application developer to check ALL strptime calls to verify that they will work. THIS IS UNREASONABLE.
Stop reopening the bug. I told you that you cannot pass locale format strings to strptime.
Well, then consider this: in Polish locale strptime fails to parse time formated with stftime with the following flags: "%c" and "%x" (using these flags as format specifier for strptime itself and this is documented as valid in man page). I bet 100$ that this boils down to this problem and another 100$ that this is very close to violation of the rule you specified in comment #8 above. The code to prove my point: #include <clocale> #include <ctime> #include <iostream> #include <langinfo.h> void checkfmt(const char *fmt) { using std::cout; using std::endl; time_t t; tm t1, t2; char *p, buf[128]; t = time(NULL); t1 = *localtime(&t); cout << "format string: " << fmt << endl; strftime(buf, sizeof(buf), fmt, &t1); cout << "strftime (1): " << buf << endl; memset(&t2, 0, sizeof(t2)); p = strptime(buf, fmt, &t2); strftime(buf, sizeof(buf), fmt, &t2); cout << "strftime (2): " << buf << endl; cout << "--------------------------------------------------" << endl; } int main() { setlocale(LC_ALL, "pl_PL.UTF8"); checkfmt(nl_langinfo(D_FMT)); checkfmt("%a"); checkfmt("%A"); checkfmt("%b"); checkfmt("%B"); checkfmt("%c"); checkfmt("%x"); checkfmt("%X"); return(0); } Feel free to close if I am wrong but then we need another bug for strptime to fix %c and %x formats.