Hello, I am calling
    nl_langinfo (_NL_TIME_WEEK_1STDAY);
which is supposed to return an integer like 19971130, but I am getting 0x888888880130bc3a, i.e. (0x8888888800000000 | 19971130). Looking at the __nl_langinfo_l code, I can see:
    return (char *) data->values[index].string;
where string is a member of
    union locale_data_value
    {
      const uint32_t *wstr;
      const char *string;
      unsigned int word; /* Note endian issues vs 64-bit pointers. */
    }
and indeed I can read in locale/categories.def:
    DEFINE_ELEMENT (_NL_TIME_WEEK_1STDAY, "week-1stday", std, word)
I guess the union gets loaded through the word member only, thus leaving the higher part of the string member uninitialized? Note that I am using MALLOC_PERTURB_=$RANDOM; without it the problem disappears.
Indeed, the other word of the union is uninitialized. It's easy to make sure it's zero. But I wonder about that usage mode. Does it work correctly (without provoking this bug) on big-endian machines?
It completely fails on sparc64 indeed: it returns 0x130bc3a00000000 (and 0x130bc3a2d2d2d2d with MALLOC_PERTURB_=1234)
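For reference, the effect described so far can be reproduced outside glibc. The following is only a minimal sketch, not glibc code: the helper name pointer_bits_after_word_store is hypothetical, the union is copied from the report above, and the fill byte stands in for whatever MALLOC_PERTURB_ left in the allocation. It assumes that uintptr_t is as wide as a data pointer.

```c
#include <stdint.h>
#include <string.h>

/* Same members as the union quoted from glibc in the report.  */
union locale_data_value
{
  const uint32_t *wstr;
  const char *string;
  unsigned int word;
};

/* Hypothetical helper, not part of glibc.  Fill the union with FILL
   bytes (standing in for stale heap contents), write only the bytes
   of the word member, then return the raw bits seen through the
   pointer-sized storage.  */
uintptr_t
pointer_bits_after_word_store (unsigned int word, unsigned char fill)
{
  union locale_data_value v;
  uintptr_t bits = 0;

  memset (&v, fill, sizeof v);
  /* All union members start at offset 0, so this touches exactly the
     bytes that a store to v.word would touch.  */
  memcpy (&v, &word, sizeof word);
  memcpy (&bits, &v, sizeof bits < sizeof v ? sizeof bits : sizeof v);
  return bits;
}
```

With a 32-bit unsigned int on a 64-bit little-endian machine, pointer_bits_after_word_store (19971130, 0x88) reproduces the 0x888888880130bc3a from the report. On a 64-bit big-endian machine the word ends up in the upper half instead, which matches the sparc64 result.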
You haven't provided the testcase, but from what you say I'd say it is a user error. nl_langinfo returns the pointer from the union, so if you need the word instead, you need to:
    union { char *str; unsigned int word; } u;
    u.str = nl_langinfo (...);
    xxx = u.word;
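The union approach suggested above can be wrapped in a small helper. This is a sketch only: the name langinfo_word is hypothetical, and it relies on glibc behavior that this thread describes as undocumented, so it may not hold on other C libraries or future glibc versions.

```c
/* Request GNU extensions; the _NL_* items are glibc-specific.  */
#define _GNU_SOURCE
#include <langinfo.h>

/* Hypothetical helper, not part of glibc.  Fetch a langinfo item that
   is stored as a word and read it back through a union with the same
   layout as the pointer returned by nl_langinfo.  Reading through the
   union picks up the bytes of the word member on both little-endian
   and big-endian machines, which a plain cast of the pointer does
   not.  */
unsigned int
langinfo_word (nl_item item)
{
  union { char *str; unsigned int word; } u;

  u.str = nl_langinfo (item);
  return u.word;
}
```

A caller would then write langinfo_word (_NL_TIME_WEEK_1STDAY) instead of casting the returned pointer to an integer type.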
Where is that documented?
nl_langinfo is only documented to be able to return string properties of the locale.
I know that; that's why I'm asking for documentation. Since it's currently documented that way, I had assumed that the char * had to be cast to a scalar. It's only now that I have actually read the source code that I know it's not done that way. It really needs documentation and/or a fix.
Why? nl_langinfo is only documented for a couple of values (see http://www.opengroup.org/onlinepubs/9699919799/basedefs/langinfo.h.html). The rest is undocumented, so if you call nl_langinfo with such arguments, it is implementation-defined behavior. A quick Google query would tell you what you need to do...
"implementation-defined" doesn't mean that the implementation doesn't have do document it. On the contrary, I'd say. That's why I'm again asking for documentation in e.g. the glibc info. Now, as you say, quick google query. That gives me mail.gnome.org/archives/hildon-list/2008-August/msg00000.html langinfo = nl_langinfo(_NL_TIME_WEEK_1STDAY); week_origin = GPOINTER_TO_INT(langinfo); as well as http://www.mail-archive.com/rrd-developers@lists.oetiker.ch/msg03613.html long week_1stday_l = (long) nl_langinfo (_NL_TIME_WEEK_1STDAY); Eventually I got to http://sourceware.org/bugzilla/show_bug.cgi?id=5486 which boils down to exactly the same thing I'm asking now: please document this glibc-only behavior.
Also, it looks very odd that the user has to define the union himself. Shouldn't that be defined in langinfo.h? (I'd consider that that alone would be enough for some minimal documentation).
The names starting with _NL are all internal to glibc. All supported properties are strings.
That should be documented then. And I'd thus turn this bug into "please provide supported *WEEK* langinfo items", as all calendar applications need these.
ISO/IEC 9899:1999 7.1.3 Reserved identifiers - All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
Mmm, but there are at least _GNU_SOURCE, _IONBF & such which fall in that area, as well as the stdio_ext.h functions... Well, anyway, let's turn the bug into "please support these!": There is a lot of very useful locale information that langinfo could provide (paper dimensions, calendar layout), but the discussion within this bug says that these items are not supported. Actually, a lot of them are not even used inside glibc (e.g. _NL_PAPER_HEIGHT), so I'm wondering why they are here at all if one cannot assume that they are supported. Please provide supported equivalents to these _NL_* langinfo items so applications can use them.