This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v4] Improve the width of alternate representation for year in strftime [BZ #23758]
- From: Carlos O'Donell <carlos at redhat dot com>
- To: TAMUKI Shoichi <tamuki at linet dot gr dot jp>, libc-alpha at sourceware dot org
- Date: Thu, 3 Jan 2019 09:43:21 -0500
- Subject: Re: [PATCH v4] Improve the width of alternate representation for year in strftime [BZ #23758]
- References: <201812070655.AA04129@tamuki.linet.gr.jp>
On 12/7/18 1:55 AM, TAMUKI Shoichi wrote:
> The Japanese era name is scheduled to be changed on May 1, 2019.
> Prior to this, change the alternate representation for year in
> strftime to pad the number with zero to keep it constant width, so
> that prevent the trouble we saw in the past from becoming obvious
> again from the year after the era name changes onward.
>
> Since only one Japanese era name is used by each emperor's reign as
> lately, it is rare that the year ends in one digit or lasts more than
> three digits. In addition, the default width of month, day, hour,
> minute, and second is 2, so adjust the default width of year the same
> as them, and then the whole display balance is improved. Therefore,
> it would be reasonable to change the width padding with zero of %Ey
> default to 2.
>
> Currently in glibc, besides ja_JP locale, the locales using the
> conversion specifier above are lo_LA (Laos) and th_TH (Thailand). In
> these locales, they use the Buddhist era. The Buddhist era is a value
> obtained by adding 543 to the Christian era, so they are not affected
> by the change of the conversion specifier %Ey.
>
> Furthermore, for the output string of the conversion specifier %EY,
> an optional flag is given to the conversion specifier so that it can
> be also used the current non-padding format, and the padding format
> can be controlled. To achieve this, when an optional flag is given to
> the conversion specifier %EY, the %Ey included in the combined
> conversion specifier is interpreted as if decorated with the
> appropriate flag.
Please split this patch into 2 patches as soon as possible.
The fix for "%_EY"/"%-EY" should be reviewed separately as a bug fix,
and can go into master even if we are in a frozen state (we are currently
frozen for 2.29 release).
I want to review these changes in parallel so we can get them into 2.29
and be prepared for the Era name change.
Cheers,
Carlos.
> ChangeLog:
>
> [BZ #23758]
> * NEWS: Mention the change.
> * manual/time.texi (strftime): Document the desctiption for %EC, %Ey,
> and %EY.
> * time/Makefile (tests): Add tst-strftime2.
> * time/strftime_l.c (__strftime_internal): Add argument yr_spec to
> override padding for %Ey.
> Change the width padding with zero of %Ey default to 2. If an
> optional flag ('_' or '-') is specified to %EY, the %Ey in subformat
> is interpreted as if decorated with the appropriate flag.
> * time/tst-strftime2.c: New file.
> ---
> Changes since v3:
> - use support/test-driver.c instead of the old-style test-skeleton.c
> in tst-strftime2.c, also correct the copyright year
> - slightly modify the documentation including commit message
>
> Remarks:
> At last, I have fulfilled all legal requirements, so you can review
> my patch without worrying about the copyright issues.
>
> In this patch, all items pointed out so far have been fixed.
>
> My concern is the documentation. Please review especially wording,
> expression, etc. (I am not an English native speaker.)
>
> NEWS | 7 +++
> manual/time.texi | 15 ++++++
> time/Makefile | 2 +-
> time/strftime_l.c | 28 ++++++-----
> time/tst-strftime2.c | 133 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 173 insertions(+), 12 deletions(-)
> create mode 100644 time/tst-strftime2.c
>
> diff --git a/NEWS b/NEWS
> index 8483dcf4928..40e33e228f8 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -41,6 +41,13 @@ Major new features:
> incosistent mutex state after fork call in multithread environment.
> In both popen and system there is no direct access to user-defined mutexes.
>
> +* Improve the width of alternate representation for year in strftime.
> + For %Ey conversion specifier, the default action is now to pad the
> + number with zero to keep minimum 2 digits as well as %y. Also, the
> + optional flag (either _ or -) can be used for %EY, so that the
> + internal %Ey is interpreted as if decorated with the appropriate
> + flag.
> +
> Deprecated and removed features, and other changes affecting compatibility:
>
> * The glibc.tune tunable namespace has been renamed to glibc.cpu and the
> diff --git a/manual/time.texi b/manual/time.texi
> index 4d154452eba..c1d5770f9e0 100644
> --- a/manual/time.texi
> +++ b/manual/time.texi
> @@ -1393,6 +1393,9 @@ The preferred calendar time representation for the current locale.
> The century of the year. This is equivalent to the greatest integer not
> greater than the year divided by 100.
>
> +If the @code{E} modifier is specified (@code{%EC}), the locale's
> +alternate representation for year (the era name) is used instead.
> +
> This format was first standardized by POSIX.2-1992 and by @w{ISO C99}.
>
> @item %d
> @@ -1568,10 +1571,22 @@ The preferred time of day representation for the current locale.
> The year without a century as a decimal number (range @code{00} through
> @code{99}). This is equivalent to the year modulo 100.
>
> +If the @code{E} modifier is specified (@code{%Ey}), the locale's
> +alternate representation for year (the era year) is used instead. The
> +default action is to pad the number with zero to keep minimum 2 digits
> +as well as @code{%y}.
> +
> @item %Y
> The year as a decimal number, using the Gregorian calendar. Years
> before the year @code{1} are numbered @code{0}, @code{-1}, and so on.
>
> +If the @code{E} modifier is specified (@code{%EY}), the locale's
> +alternate representation for year (generally the combination of
> +@code{%EC} and @code{%Ey}) is used instead. In this case, the
> +optional flag (either @code{_} or @code{-}) can be used, so that the
> +internal @code{%Ey} is interpreted as if decorated with the
> +appropriate flag.
> +
> @item %z
> @w{RFC 822}/@w{ISO 8601:1988} style numeric time zone (e.g.,
> @code{-0600} or @code{+0100}), or nothing if no time zone is
> diff --git a/time/Makefile b/time/Makefile
> index 743bd99f182..9f94e07e3bc 100644
> --- a/time/Makefile
> +++ b/time/Makefile
> @@ -43,7 +43,7 @@ tests := test_time clocktest tst-posixtz tst-strptime tst_wcsftime \
> tst-getdate tst-mktime tst-mktime2 tst-ftime_l tst-strftime \
> tst-mktime3 tst-strptime2 bug-asctime bug-asctime_r bug-mktime1 \
> tst-strptime3 bug-getdate1 tst-strptime-whitespace tst-ftime \
> - tst-tzname tst-y2039 bug-mktime4
> + tst-tzname tst-y2039 bug-mktime4 tst-strftime2
>
> include ../Rules
>
> diff --git a/time/strftime_l.c b/time/strftime_l.c
> index c71f9f47a95..8797341ea53 100644
> --- a/time/strftime_l.c
> +++ b/time/strftime_l.c
> @@ -434,7 +434,7 @@ static CHAR_T const month_name[][10] =
> #endif
>
> static size_t __strftime_internal (CHAR_T *, size_t, const CHAR_T *,
> - const struct tm *, bool *
> + const struct tm *, int *, bool *
> ut_argument_spec
> LOCALE_PARAM) __THROW;
>
> @@ -456,8 +456,9 @@ my_strftime (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> tmcopy = *tp;
> tp = &tmcopy;
> #endif
> + int yr_spec = 0; /* Override padding for %Ey. */
> bool tzset_called = false;
> - return __strftime_internal (s, maxsize, format, tp, &tzset_called
> + return __strftime_internal (s, maxsize, format, tp, &yr_spec, &tzset_called
> ut_argument LOCALE_ARG);
> }
> #ifdef _LIBC
> @@ -466,7 +467,7 @@ libc_hidden_def (my_strftime)
>
> static size_t
> __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> - const struct tm *tp, bool *tzset_called
> + const struct tm *tp, int *yr_spec, bool *tzset_called
> ut_argument_spec LOCALE_PARAM)
> {
> #if defined _LIBC && defined USE_IN_EXTENDED_LOCALE_MODEL
> @@ -820,7 +821,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> if (modifier == L_('O'))
> goto bad_format;
> #ifdef _NL_CURRENT
> - if (! (modifier == 'E'
> + if (! (modifier == L_('E')
> && (*(subfmt =
> (const CHAR_T *) _NL_CURRENT (LC_TIME,
> NLW(ERA_D_T_FMT)))
> @@ -838,11 +839,12 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> {
> CHAR_T *old_start = p;
> size_t len = __strftime_internal (NULL, (size_t) -1, subfmt,
> - tp, tzset_called ut_argument
> - LOCALE_ARG);
> + tp, yr_spec, tzset_called
> + ut_argument LOCALE_ARG);
> add (len, __strftime_internal (p, maxsize - i, subfmt,
> - tp, tzset_called ut_argument
> - LOCALE_ARG));
> + tp, yr_spec, tzset_called
> + ut_argument LOCALE_ARG));
> + *yr_spec = 0;
>
> if (to_uppcase)
> while (old_start < p)
> @@ -917,7 +919,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> #ifdef _NL_CURRENT
> if (! (modifier == L_('E')
> && (*(subfmt =
> - (const CHAR_T *)_NL_CURRENT (LC_TIME, NLW(ERA_D_FMT)))
> + (const CHAR_T *) _NL_CURRENT (LC_TIME, NLW(ERA_D_FMT)))
> != L_('\0'))))
> subfmt = (const CHAR_T *) _NL_CURRENT (LC_TIME, NLW(D_FMT));
> goto subformat;
> @@ -1262,7 +1264,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> DO_NUMBER (1, tp->tm_wday);
>
> case L_('Y'):
> - if (modifier == 'E')
> + if (modifier == L_('E'))
> {
> #if HAVE_STRUCT_ERA_ENTRY
> struct era_entry *era = _nl_get_era_entry (tp HELPER_LOCALE_ARG);
> @@ -1273,6 +1275,8 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> # else
> subfmt = era->era_format;
> # endif
> + if (pad != 0)
> + *yr_spec = pad;
> goto subformat;
> }
> #else
> @@ -1294,7 +1298,9 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> if (era)
> {
> int delta = tp->tm_year - era->start_date[0];
> - DO_NUMBER (1, (era->offset
> + if (*yr_spec != 0)
> + pad = *yr_spec;
> + DO_NUMBER (2, (era->offset
> + delta * era->absolute_direction));
> }
> #else
> diff --git a/time/tst-strftime2.c b/time/tst-strftime2.c
> new file mode 100644
> index 00000000000..6f2cf4ddd81
> --- /dev/null
> +++ b/time/tst-strftime2.c
> @@ -0,0 +1,133 @@
> +/* Verify the behavior of strftime on alternate representation for year.
> +
> + Copyright (C) 2018 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include <locale.h>
> +#include <time.h>
> +#include <stdio.h>
> +#include <string.h>
> +
> +static const char *locales[] = { "ja_JP.UTF-8", "lo_LA.UTF-8", "th_TH.UTF-8" };
> +#define nlocales (sizeof (locales) / sizeof (locales[0]))
> +
> +static const char *formats[] = { "%EY", "%_EY", "%-EY" };
> +#define nformats (sizeof (formats) / sizeof (formats[0]))
> +
> +static const struct
> +{
> + const int d, m, y;
> +} dates[] =
> + {
> + { 1, 3, 88 },
> + { 7, 0, 89 },
> + { 8, 0, 89 },
> + { 1, 3, 90 },
> + { 1, 3, 97 },
> + { 1, 3, 98 }
> + };
> +#define ndates (sizeof (dates) / sizeof (dates[0]))
> +
> +static char ref[nlocales][nformats][ndates][100];
> +
> +static void
> +mkreftable (void)
> +{
> + int i, j, k;
> + char era[10];
> + static const int yrj[] = { 63, 64, 1, 2, 9, 10 };
> + static const int yrb[] = { 2531, 2532, 2532, 2533, 2540, 2541 };
> +
> + for (i = 0; i < nlocales; i++)
> + for (j = 0; j < nformats; j++)
> + for (k = 0; k < ndates; k++)
> + {
> + if (i == 0)
> + {
> + sprintf (era, "%s", (k < 2) ? "\xe6\x98\xad\xe5\x92\x8c"
> + : "\xe5\xb9\xb3\xe6\x88\x90");
> + if (yrj[k] == 1)
> + sprintf (ref[i][j][k], "%s\xe5\x85\x83\xe5\xb9\xb4", era);
> + else
> + {
> + if (j == 0)
> + sprintf (ref[i][j][k], "%s%02d\xe5\xb9\xb4", era, yrj[k]);
> + else if (j == 1)
> + sprintf (ref[i][j][k], "%s%2d\xe5\xb9\xb4", era, yrj[k]);
> + else
> + sprintf (ref[i][j][k], "%s%d\xe5\xb9\xb4", era, yrj[k]);
> + }
> + }
> + else if (i == 1)
> + {
> + sprintf (era, "\xe0\xba\x9e\x2e\xe0\xba\xaa\x2e ");
> + sprintf (ref[i][j][k], "%s%d", era, yrb[k]);
> + }
> + else
> + {
> + sprintf (era, "\xe0\xb8\x9e\x2e\xe0\xb8\xa8\x2e ");
> + sprintf (ref[i][j][k], "%s%d", era, yrb[k]);
> + }
> + }
> +}
> +
> +static int
> +do_test (void)
> +{
> + int i, j, k, result = 0;
> + struct tm ttm;
> + char date[11], buf[100];
> + size_t r, e;
> +
> + mkreftable ();
> + for (i = 0; i < nlocales; i++)
> + {
> + if (setlocale (LC_ALL, locales[i]) == NULL)
> + {
> + printf ("locale %s does not exist, skipping...\n", locales[i]);
> + continue;
> + }
> + printf ("[%s]\n", locales[i]);
> + for (j = 0; j < nformats; j++)
> + {
> + for (k = 0; k < ndates; k++)
> + {
> + ttm.tm_mday = dates[k].d;
> + ttm.tm_mon = dates[k].m;
> + ttm.tm_year = dates[k].y;
> + strftime (date, sizeof (date), "%F", &ttm);
> + r = strftime (buf, sizeof (buf), formats[j], &ttm);
> + e = strlen (ref[i][j][k]);
> + printf ("%s\t\"%s\"\t\"%s\"", date, formats[j], buf);
> + if (strcmp (buf, ref[i][j][k]) != 0)
> + {
> + printf ("\tshould be \"%s\"", ref[i][j][k]);
> + if (r != e)
> + printf ("\tgot: %zu, expected: %zu", r, e);
> + result = 1;
> + }
> + else
> + printf ("\tOK");
> + putchar ('\n');
> + }
> + putchar ('\n');
> + }
> + }
> + return result;
> +}
> +
> +#include <support/test-driver.c>
>
--
Cheers,
Carlos.