Bug 4772 - strptime() doesn't support strftime()'s flags
strptime() doesn't support strftime()'s flags
Status: REOPENED
Product: glibc
Classification: Unclassified
Component: libc
unspecified
: P2 normal
: ---
Assigned To: Ulrich Drepper
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2007-07-10 20:46 UTC by Piotr Engelking
Modified: 2007-07-11 09:02 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
flags and field width support for strptime() (617 bytes, patch)
2007-07-10 20:51 UTC, Piotr Engelking
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Piotr Engelking 2007-07-10 20:46:01 UTC
Tomasz Kępczyński has found a bug in strptime(): its format specification
doesn't support strftime()'s flags.


Steps to reproduce:

$ cat foo.c
#define _XOPEN_SOURCE

#include <stdio.h>
#include <time.h>

int
main(void)
{
    struct tm tm;

    printf("%s\n", strptime("1", "%-d", &tm) ? "Success!" : "Woe is me!");

    return 0;
}
$ gcc foo.c && ./a.out
Woe is me!
$


As a result, gnucash misparses dates in, at least, Polish and Czech locale (see
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=243513).

The attached patch against cvs head causes strptime() to correctly parse and
interpret (i.e. ignore) flags and field width.
Comment 1 Piotr Engelking 2007-07-10 20:51:24 UTC
Created attachment 1915 [details]
flags and field width support for strptime()
Comment 2 Ulrich Drepper 2007-07-10 21:07:55 UTC
There is no requirement at all that strptime understands the same formats as
strftime.  In fact, this will never be the case.  Unless you can show that any
extension is absolutely necessary to parse a *reasonable* date string I won't
change anything.
Comment 3 Tomasz Kepczynski 2007-07-10 21:19:26 UTC
As I wrote in https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=243513
I am not convinced that this is a bug in strptime. But please note, that
strftime(..., nl_langinfo(D_FMT), ...) will print something, according to
format we define in a locale. My opnion is that we should be able to parse
this string back to internal representation using strptime (or maybe
other function) _AND_ data defined in locale definition file. The first
attempt is to use:
strptime(..., nl_langinfo(D_FMT),...)
and this fails miserably for pl_PL.UTF8 in F7.
So we either need:
1. fix in strptime
2. some oher format data in locale definition equivalent to D_FMT
   which will work for strptime
3. I am open to any suggestion here.
Please note, that strptime with %x format specifier also fails.
Comment 4 Piotr Engelking 2007-07-10 21:46:36 UTC
> There is no requirement at all that strptime understands the same formats as
> strftime.  In fact, this will never be the case.

The strptime(3) manual page says:

   "For reasons of symmetry, glibc tries to support for strptime() the same format 
    characters as for strftime()."

I find it quite reasonable. The patch is trivial and solves a real-world problem.
Comment 5 Ulrich Drepper 2007-07-10 22:28:06 UTC
The patch is not reasonable.  Ignoring the modifiers can give the wrong
impression and there might be situations where it makes a difference.

There never ever has been a promise that the locale formats are usable in
strptime.  And whatever the man page says couldn't possibly more irrelevant. 
The man pages are not written by the people who write the code.  The info pages
clearly state

The only difference is that the flags `_', `-', `0', and `^' are not allowed.


Using strptime to parse anything by untranslated strings is a gamble anyway.  If
you want to do this then prepare the strings yourself.  Get the locale format,
strip out the flags, and then pass it on to strptime.

I have not seen any argument why this should be the problem of the
implementation which at that point would expose itself to additional problems.
Comment 6 Piotr Engelking 2007-07-10 23:38:19 UTC
> The patch is not reasonable.  Ignoring the modifiers can give the wrong
> impression and there might be situations where it makes a difference.

Ignoring these modifiers matches existing behaviour. The introduced modifiers
control padding, case, and field width, all of which are already ignored by
strptime().

> There never ever has been a promise that the locale formats are usable in
> strptime.

Such a correspondence has, however, clearly been the intent. SUSv3 explicitly says:

   "Several "equivalent to" formats and the special processing of white-space
    characters are provided in order to ease the use of identical format strings
    for strftime() and strptime()."

Since the standard format specification of strptime() closely matches the
standard format specification of strftime(), it is not, in my opinion,
unreasonable to expect the same from glibc's extensions to the format.

Introducing a functionality that works _usually_, but not always, is a trap for
unwary programmer. It is, of course, easy for gnucash to workaround this
particular limitation, but the real problem is that other programs are likely to
fall victim of it as well.
Comment 7 Tomasz Kepczynski 2007-07-11 04:52:07 UTC
I see you are unconvinced, let the code speak:
----
#include <clocale>
#include <ctime>
#include <iostream>
#include <langinfo.h>

int main()
{
  time_t t;
  tm t1, t2;
  char *p, buf[128];
  setlocale(LC_ALL, "pl_PL.UTF8");
  t = time(NULL);
  t1 = *localtime(&t);
  std::cout << "nl_langinfo(D_FMT):" << nl_langinfo(D_FMT) << std::endl;
  strftime(buf, sizeof(buf), nl_langinfo(D_FMT), &t1);
  std::cout << "strftime (1): " << buf << std::endl;
  p = strptime(buf, nl_langinfo(D_FMT), &t2);
  strftime(buf, sizeof(buf), nl_langinfo(D_FMT), &t2);
  std::cout << "strftime (2): " << buf << std::endl;
  return(0);
}
----

Results from FC6:
nl_langinfo(D_FMT):%Y-%m-%d
strftime (1): 2007-07-11
strftime (2): 2007-07-11

Results from F7:
nl_langinfo(D_FMT):%-d %b %Y
strftime (1): 11 VII 2007
strftime (2): 0 II 1953

My question is: what have I done wrong?
Now, if the answer is nothing, then fix the library.
If the answer is: you haven't removed flags characters from
format string, then my answer is: fix the library.
Reasoning:
As in #6. If I have to work around something in a library
then what happens if someone decides thar strftime needs
to support flag, let's say '+'? I have to change the code
and in that case I can as well write my own fuction. This
puts us in a position where I can as well write my own library
and renders strptime completly useless and in a posistion
where I have as application developer to check ALL strptime
calls to verify that they will work.
THIS IS UNREASONABLE.
Comment 8 Ulrich Drepper 2007-07-11 04:53:41 UTC
Stop reopening the bug.  I told you that you cannot pass locale format strings
to strptime.
Comment 9 Tomasz Kepczynski 2007-07-11 09:02:58 UTC
Well, then consider this: in Polish locale strptime fails to
parse time formated with stftime with the following flags:
"%c" and "%x" (using these flags as format specifier for
strptime itself and this is documented as valid in man page).
I bet 100$ that this boils down to this problem and another
100$ that this is very close to violation of the rule you
specified in comment #8 above.
The code to prove my point:
#include <clocale>
#include <ctime>
#include <iostream>
#include <langinfo.h>

void checkfmt(const char *fmt)
{
  using std::cout;
  using std::endl;
  time_t t;
  tm t1, t2;
  char *p, buf[128];
  t = time(NULL);
  t1 = *localtime(&t);
  cout << "format string: " << fmt << endl;
  strftime(buf, sizeof(buf), fmt, &t1);
  cout << "strftime (1): " << buf << endl;
  memset(&t2, 0, sizeof(t2));
  p = strptime(buf, fmt, &t2);
  strftime(buf, sizeof(buf), fmt, &t2);
  cout << "strftime (2): " << buf << endl;
  cout << "--------------------------------------------------" << endl;
}

int main()
{
  setlocale(LC_ALL, "pl_PL.UTF8");
  checkfmt(nl_langinfo(D_FMT));
  checkfmt("%a");
  checkfmt("%A");
  checkfmt("%b");
  checkfmt("%B");
  checkfmt("%c");
  checkfmt("%x");
  checkfmt("%X");
  return(0);
}
Feel free to close if I am wrong but then we need another
bug for strptime to fix %c and %x formats.