This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][BZ 19084] Make sure _nl_value_type_LC_<category> arrays have correct size.


Thank you for your reply, Mike. Please sese my answers below:

> On 21 Mar 2016 23:59 Mike Frysinger <vapier@gentoo.org> wrote:
>
> [...]
> sorry, but i'm not seeing the problem here. the type field is merely
> an enum. whether the last element is a stringarray (value of 2) or a
> byte (value of 3), the array will still build up the same # of elements.
> [...]
>
> the code today will do for LC_IDENTIFICATION:
> static const enum value_type _nl_value_type_LC_IDENTIFICATION[] = {
> [((int) (_NL_IDENTIFICATION_TITLE) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_SOURCE) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_ADDRESS) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_CONTACT) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_EMAIL) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_TEL) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_FAX) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_LANGUAGE) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_TERRITORY) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_AUDIENCE) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_APPLICATION) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_ABBREVIATION) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_REVISION) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_DATE) & 0xffff)] = string,
> [((int) (_NL_IDENTIFICATION_CATEGORY) & 0xffff)] = stringarray,
> [((int) (_NL_IDENTIFICATION_CODESET) & 0xffff)] = string,
> };
>
> these NL values are defined in langinfo.h and expand as:
> _NL_IDENTIFICATION_TITLE = (((12) << 16) | (0)),
> _NL_IDENTIFICATION_SOURCE,
> _NL_IDENTIFICATION_ADDRESS,
> _NL_IDENTIFICATION_CONTACT,
> _NL_IDENTIFICATION_EMAIL,
> _NL_IDENTIFICATION_TEL,
> _NL_IDENTIFICATION_FAX,
> _NL_IDENTIFICATION_LANGUAGE,
> _NL_IDENTIFICATION_TERRITORY,
> _NL_IDENTIFICATION_AUDIENCE,
> _NL_IDENTIFICATION_APPLICATION,
> _NL_IDENTIFICATION_ABBREVIATION,
> _NL_IDENTIFICATION_REVISION,
> _NL_IDENTIFICATION_DATE,
> _NL_IDENTIFICATION_CATEGORY,
> _NL_IDENTIFICATION_CODESET,
> _NL_NUM_LC_IDENTIFICATION,
>
> _NL_IDENTIFICATION_CODESET is merely one larger than the
> _NL_IDENTIFICATION_CATEGORY value. if CODESET didn't exist,
> then CATEGORY (a stringarray) would be at the end, but the
> array would still be defined/calculated correctly.
> -mike

Indeed, there is no bug for LC_IDENTIFICATION array because
_NL_IDENTIFICATION_CODESET is one larger than _NL_IDENTIFICATION_CATEGORY
although I wonder how to retrieve a specified element of
_NL_IDENTIFICATION_CATEGORY if there are no other indexes between
_NL_IDENTIFICATION_CATEGORY and _NL_IDENTIFICATION_CODESET.

I spotted this problem when working on bug 10871 [1]. I tried to add
ALTMON (and _NL_WALTMON) at the end of LC_TIME. The code expands to:

static const enum value_type _nl_value_type_LC_TIME [] = {
 [((int) (ABDAY_1) & 0xffff)] = stringarray,
 [((int) (DAY_1) & 0xffff)] = stringarray,
 [((int) (ABMON_1) & 0xffff)] = stringarray,
 [((int) (MON_1) & 0xffff)] = stringarray,
 [((int) (AM_STR) & 0xffff)] = stringarray,
 [((int) (D_T_FMT) & 0xffff)] = string,
 [((int) (D_FMT) & 0xffff)] = string,
 [((int) (T_FMT) & 0xffff)] = string,
 [((int) (T_FMT_AMPM) & 0xffff)] = string,
 [((int) (ERA) & 0xffff)] = stringlist,
 [((int) (ERA_YEAR) & 0xffff)] = string,
 [((int) (ERA_D_FMT) & 0xffff)] = string,
 [((int) (ALT_DIGITS) & 0xffff)] = stringlist,
 [((int) (ERA_D_T_FMT) & 0xffff)] = string,
 [((int) (ERA_T_FMT) & 0xffff)] = string,
 [((int) (_NL_TIME_ERA_NUM_ENTRIES) & 0xffff)] = word,
 [((int) (_NL_TIME_ERA_ENTRIES) & 0xffff)] = string,
 [((int) (_NL_WABDAY_1) & 0xffff)] = wstringarray,
 [((int) (_NL_WDAY_1) & 0xffff)] = wstringarray,
 [((int) (_NL_WABMON_1) & 0xffff)] = wstringarray,
 [((int) (_NL_WMON_1) & 0xffff)] = wstringarray,
 [((int) (_NL_WAM_STR) & 0xffff)] = wstringarray,
 [((int) (_NL_WD_T_FMT) & 0xffff)] = wstring,
 [((int) (_NL_WD_FMT) & 0xffff)] = wstring,
 [((int) (_NL_WT_FMT) & 0xffff)] = wstring,
 [((int) (_NL_WT_FMT_AMPM) & 0xffff)] = wstring,
 [((int) (_NL_WERA_YEAR) & 0xffff)] = wstring,
 [((int) (_NL_WERA_D_FMT) & 0xffff)] = wstring,
 [((int) (_NL_WALT_DIGITS) & 0xffff)] = wstringlist,
 [((int) (_NL_WERA_D_T_FMT) & 0xffff)] = wstring,
 [((int) (_NL_WERA_T_FMT) & 0xffff)] = wstring,
 [((int) (_NL_TIME_WEEK_NDAYS) & 0xffff)] = byte,
 [((int) (_NL_TIME_WEEK_1STDAY) & 0xffff)] = word,
 [((int) (_NL_TIME_WEEK_1STWEEK) & 0xffff)] = byte,
 [((int) (_NL_TIME_FIRST_WEEKDAY) & 0xffff)] = byte,
 [((int) (_NL_TIME_FIRST_WORKDAY) & 0xffff)] = byte,
 [((int) (_NL_TIME_CAL_DIRECTION) & 0xffff)] = byte,
 [((int) (_NL_TIME_TIMEZONE) & 0xffff)] = string,
 [((int) (_DATE_FMT) & 0xffff)] = string,
 [((int) (_NL_W_DATE_FMT) & 0xffff)] = wstring,
 [((int) (_NL_TIME_CODESET) & 0xffff)] = string,
 [((int) (ALTMON_1) & 0xffff)] = stringarray,       /* I'm going to add this */
 [((int) (_NL_WALTMON_1) & 0xffff)] = wstringarray, /* I'm going to add this */
};

We have ABDAY_1 = (((2) << 16) | (0)) so ((int) (ABDAY_1) & 0xffff) == 0;
let's expand it more:

static const enum value_type _nl_value_type_LC_TIME [] = {
 [0] = stringarray,
 [7] = stringarray,
 [14] = stringarray,
 [26] = stringarray,
 [38] = stringarray,
 [40] = string,
 [41] = string,
 [42] = string,
 [43] = string,
 [44] = stringlist,
 [45] = string,
 [46] = string,
 [47] = stringlist,
 [48] = string,
 [49] = string,
 [50] = word,
 [51] = string,
 [52] = wstringarray,
 [59] = wstringarray,
 [66] = wstringarray,
 [78] = wstringarray,
 [90] = wstringarray,
 [92] = wstring,
 [93] = wstring,
 [94] = wstring,
 [95] = wstring,
 [96] = wstring,
 [97] = wstring,
 [98] = wstringlist,
 [99] = wstring,
 [100] = wstring,
 [101] = byte,
 [102] = word,
 [103] = byte,
 [104] = byte,
 [105] = byte,
 [106] = byte,
 [107] = string,
 [108] = string,
 [109] = wstring,
 [110] = string,       /* _NL_TIME_CODESET */
 [111] = stringarray,  /* ALTMON_1 - I'm going to add this */
 [123] = wstringarray, /* _NL_WALTMON_1 - I'm going to add this */
};

The problem arises in CATTEST macro which ensures that:

	  if (cnt >= (sizeof (_nl_value_type_LC_##cat)		\
		      / sizeof (_nl_value_type_LC_##cat[0])))	\
	    goto puntdata;					\

so if cnt is larger than or equal to the number of elements in
_nl_value_type_LC_TIME array then whole locale category is rejected.
Old version [2] raised an assertion here. The array above has 124
elements while we need 135 (_NL_WALTMON_12 will expand to 134). This
is fixed if we declare a fixed size of the array which equals
_NL_NUM_LC_TIME.

A similar problem is in LC_CTYPE category, it ends with a subarray of
_NL_CTYPE_EXTRA_MAP_1,..,_NL_CTYPE_EXTRA_MAP_14. Probably this is
the reason why the size of this category is not checked in loadlocale.c.

I'm not telling if this is the solution to bug 19084 [3] but I will
need this fix soon. I thought it might be worth to post it already in
case someone else had the same problem in another context and might
need this patch before 10871 is fixed. No problem if you don't want to
treat this one-liner as a separate patch and merge it (silently?) into
the final solution of 10871. Also I will appreciate if you suggest me
a better solution of this problem.

Best regards,

Rafal Luzynski

PS. Some months ago I had another solution for 10871 which had one more
field after _NL_WALTMON_1..12 therefore the problem did not occur.


---
Links:

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=10871
[2]
https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0062ace2292effc4135c15ea99b1931fea5e0203
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=19084


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]