codeset problems in wprintf and wcsftime

Corinna Vinschen vinschen@redhat.com
Sat Apr 3 15:32:00 GMT 2010


On Apr  3 11:10, Andy Koppe wrote:
> Corinna Vinschen:
> >> But that's got the M_ASSIGN_CHAR macro and cnv function for converting
> >> ASCII numbers in the locale file to binary. I can't see anything
> >> similar for mb_cur_max.
> >
> > I'm talking about the mechanism to overwrite a string with the numerical
> > value which can be used here as well.  This isn't in the file yet since
> > no target uses this part of the code.
> 
> Fair enough.
> 
> Another one to keep in mind should the locale-loading code ever be
> used: with __HAVE_LOCALE_INFO_EXTENDED__, the locale files would
> currently have to be a mix of byte-based strings and wide strings, so
> some sort of conversion scheme might be needed.

Why not store the wide char strings in the file?  As you realized, the
structure of the files is not yet laid out.  If you like you can even
store mb_cur_max in the file as a binary value.  Or not at all, since we
do this entirely in loadlocale, just like today.
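
To illustrate the overwrite-in-place mechanism mentioned above, here is a
minimal sketch.  The names sketch_cnv and sketch_assign_char are
hypothetical stand-ins, not the actual M_ASSIGN_CHAR macro or cnv function,
and the field layout is an assumption.  The idea is that a field read from
the locale file as ASCII digits has its first byte overwritten with the
binary value, which is why a caller would read it as mb_cur_max[0]:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: convert decimal ASCII text to a single binary byte.
   Assumes the value fits into a char. */
char
sketch_cnv (const char *str)
{
  return (char) strtol (str, NULL, 10);
}

/* Hypothetical helper: overwrite the first byte of a writable string field
   with the number the string spells out.  "6" becomes the byte value 6. */
void
sketch_assign_char (char *field)
{
  assert (field != NULL);
  field[0] = sketch_cnv (field);
}
```

After sketch_assign_char() has run on a buffer holding "6", the caller reads
field[0] as the number 6.  Storing the value in binary in the file in the
first place would make this step unnecessary.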

> > I don't get that.
> 
> What I mean is, leave out the __get_current_ctype_locale() calls added
> to __locale_charset() and __locale_mb_cur_max():
> 
>   char *
>   _DEFUN_VOID(__locale_charset)
>   {
> - #ifdef __HAVE_LOCALE_INFO__
> -   return __get_current_ctype_locale ()->codeset;
> - #else
>     return lc_ctype_charset;
> - #endif
>   }
> 
>   int
>   _DEFUN_VOID(__locale_mb_cur_max)
>   {
> - #ifdef __HAVE_LOCALE_INFO__
> -   return __get_current_ctype_locale ()->mb_cur_max[0];
> - #else
>     return __mb_cur_max;
> - #endif
>   }

Why?  For a *temporary* gain?

> > __lc_ctype_charset will go away for targets supporting locales.
> > Consequently we have to do the same for __mb_cur_max.
> 
> And I don't get that. In loadlocale() we already determine what the
> charset and mb_cur_max are, so there's no need to take the detour via
> the lc_ctype_T. Why not continue to stick them into the global
> variables at the moment, and have them as fields directly in the
> locale_t structure later?

Again, what's the gain in the long run?  Temporarily you have a slight
speedup, but this will have to change anyway.  The locale is no longer
a process-wide property.

> Or is the plan that loadlocale() will no longer parse the locale
> string and determine the charset, and that the charset is determined
> solely by the locale file in the non-Cygwin case?

This is not yet planned territory.  Right now I don't care.

> > Each of these functions has access
> > to the reent context, which in turn will have a pointer to the locale_t,
> > which in turn stores the required information.
> 
> Actually I don't think the locale_t context should be stored in the
> reent context. Currently setlocale() affects all threads; if we store
> the locale in the reent struct instead, it would be thread-specific.
> The POSIX description of setlocale() explicitly says: "The locale
> state is common to all threads within a process." The point of the
> locale_t stuff is that it adds the _l variants of all the
> locale-dependent functions, which allow programs to create their own
> locale contexts as necessary.
> 
> Therefore I think there'd need to be one global locale_t instance
> that's used by the locale-dependent functions without _l.

Of course there must be a global locale_t.  But the reent structure
must have a pointer to it because the locale is no longer a global
property of the process.  The reent structure does not *contain* a
locale_t.  It *points* to the current locale_t of the thread, and that
locale_t *can* be the global locale_t, *or* a thread-specific locale.
See SUSv4 newlocale(3), uselocale(3), or, for an extra kick, stuff like
strcasecmp_l(3).
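
A rough sketch of that arrangement, under stated assumptions: the struct
and function names below (sketch_locale, sketch_reent, sketch_uselocale and
so on) are hypothetical and do not reflect newlib's actual layout.  The
sketch only models the pointer relationship: every thread context points to
a locale, which is the global one by default and may be switched to a
thread-specific one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical locale object, holding only the two values discussed here. */
struct sketch_locale
{
  const char *codeset;
  int mb_cur_max;
};

/* Hypothetical per-thread context.  It points to a locale; it does not
   contain one. */
struct sketch_reent
{
  struct sketch_locale *locale;
};

/* The single global locale, as changed by a setlocale()-like call. */
struct sketch_locale sketch_global_locale = { "ASCII", 1 };

/* Model of setlocale(): changes the global locale, so every thread still
   pointing to it sees the new values. */
void
sketch_setlocale (const char *codeset, int mb_cur_max)
{
  sketch_global_locale.codeset = codeset;
  sketch_global_locale.mb_cur_max = mb_cur_max;
}

/* Model of uselocale(): switch this thread's pointer and return the old
   one.  Passing NULL only queries the current locale. */
struct sketch_locale *
sketch_uselocale (struct sketch_reent *r, struct sketch_locale *loc)
{
  struct sketch_locale *old;

  assert (r != NULL && r->locale != NULL);
  old = r->locale;
  if (loc != NULL)
    r->locale = loc;
  return old;
}

/* The accessors go through the thread's pointer, not a global variable. */
const char *
sketch_locale_charset (const struct sketch_reent *r)
{
  assert (r != NULL && r->locale != NULL);
  return r->locale->codeset;
}

int
sketch_locale_mb_cur_max (const struct sketch_reent *r)
{
  assert (r != NULL && r->locale != NULL);
  return r->locale->mb_cur_max;
}
```

A thread that never calls sketch_uselocale() keeps following the global
locale, so the setlocale() behaviour POSIX describes is preserved for it.
A thread that switched to its own locale is unaffected by later
sketch_setlocale() calls.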


Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat