Your INTERMEDIATE_ENCODING patch for Solaris

Tom Tromey tromey@redhat.com
Thu Aug 19 16:09:00 GMT 2010


>>>>> "Pierre" == Pierre Muller <pierre.muller@ics-cnrs.unistra.fr> writes:

Pierre>    Hi Tom,

Pierre>   I think your patch still has a problem:
Pierre> if I have a system (like my i386 open solaris)
Pierre> that only has the libc iconv functions, but no libiconv)
Pierre> The libc headers do not set __STDC_ISO_10646__ 
Pierre> nor defined _LIBICONV_VERSION macro.

Pierre>   Thus we do not know how to handle those iconv functions.

I don't really understand.  Does this show up as a compile-time error?
Or a runtime problem?  Could you give more details?

Pierre> I think that we should simply ignore them by defining
Pierre> PHONY_ICONV in such cases.

Yeah, we can do this.

This reduces functionality a lot, and since Solaris does have a mostly
working iconv it seems like we should try to use as much as we can.
But, Solaris seems to make this as difficult as possible.

Pierre> PS: libc iconv on Solaris seems to work more or less,
Pierre> but it doesn't like 'ASCII' charset (don't know why..)
Pierre> but code in _initialize_charset does transform '646' into 'ASCII'
Pierre> which is bad.

Oh, interesting.  The reports I have heard are that, on Solaris,
nl_langinfo(CODESET) returns "646", which is not a valid argument to
iconv_open.  I thought we knew that "ASCII" did work, but maybe it is
version dependent or something.

I think this is moot if we make the PHONY_ICONV change.

Pierre> PS2: what about being able to disable iconv at configure level with
Pierre> --disable-iconv
Pierre> that would skip iconv checks and thus lead to
Pierre> PHONY_ICONV?

I'd prefer to avoid more configure options unless absolutely necessary.
This area already has a lot of modes.

Here's a new version that should disable iconv on this platform, among
others.

Tom

b/gdb/ChangeLog:
2010-08-19  Tom Tromey  <tromey@redhat.com>

	* charset.c (iconv_open): New define.
	(iconv): Likewise.
	(iconv_close): Likewise.
	(phony_iconv_open): Add "phony_" prefix.
	(phony_iconv_close): Likewise.
	(phony_iconv): Likewise.
	* gdb_wchar.h: Check _LIBICONV_VERSION, __STDC_ISO_10646__.
	Change how INTERMEDIATE_ENCODING is defined.

diff --git a/gdb/charset.c b/gdb/charset.c
index 43b7fa8..3274219 100644
--- a/gdb/charset.c
+++ b/gdb/charset.c
@@ -90,8 +90,11 @@
 #undef iconv_t
 #define iconv_t int
 #undef iconv_open
+#define iconv_open phony_iconv_open
 #undef iconv
+#define iconv phony_iconv
 #undef iconv_close
+#define iconv_close phony_iconv_close
 
 #undef ICONV_CONST
 #define ICONV_CONST const
@@ -106,7 +109,7 @@
 #endif
 
 iconv_t
-iconv_open (const char *to, const char *from)
+phony_iconv_open (const char *to, const char *from)
 {
   /* We allow conversions from UTF-32BE, wchar_t, and the host charset.
      We allow conversions to wchar_t and the host charset.  */
@@ -122,14 +125,14 @@ iconv_open (const char *to, const char *from)
 }
 
 int
-iconv_close (iconv_t arg)
+phony_iconv_close (iconv_t arg)
 {
   return 0;
 }
 
 size_t
-iconv (iconv_t utf_flag, const char **inbuf, size_t *inbytesleft,
-       char **outbuf, size_t *outbytesleft)
+phony_iconv (iconv_t utf_flag, const char **inbuf, size_t *inbytesleft,
+	     char **outbuf, size_t *outbytesleft)
 {
   if (utf_flag)
     {
diff --git a/gdb/gdb_wchar.h b/gdb/gdb_wchar.h
index fca3fe4..7b0f958 100644
--- a/gdb/gdb_wchar.h
+++ b/gdb/gdb_wchar.h
@@ -23,20 +23,24 @@
    
    Capable systems have the full suite: wchar_t support and iconv
    (perhaps via GNU libiconv).  On these machines, full functionality
-   is available.
+   is available.  Note that full functionality is dependent on us
+   being able to convert from an arbitrary encoding to wchar_t.  In
+   practice this means we look for __STDC_ISO_10646__ (where we know
+   the name of the wchar_t encoding) or GNU libiconv, where we can use
+   "wchar_t".
    
    DJGPP is known to have libiconv but not wchar_t support.  On
    systems like this, we use the narrow character functions.  The full
    functionality is available to the user, but many characters (those
    outside the narrow range) will be displayed as escapes.
    
-   Finally, some systems do not have iconv.  Here we provide a phony
-   iconv which only handles a single character set, and we provide
-   wrappers for the wchar_t functionality we use.  */
+   Finally, some systems do not have iconv, or are really broken
+   (e.g., Solaris, which almost has all of this working, but where
+   just enough is broken to make it too hard to use).  Here we provide
+   a phony iconv which only handles a single character set, and we
+   provide wrappers for the wchar_t functionality we use.  */
 
 
-#define INTERMEDIATE_ENCODING "wchar_t"
-
 #if defined (HAVE_ICONV)
 #include <iconv.h>
 #else
@@ -45,9 +49,15 @@
 #define PHONY_ICONV
 #endif
 
-/* We use "btowc" as a sentinel to detect functioning wchar_t
-   support.  */
-#if defined (HAVE_ICONV) && defined (HAVE_WCHAR_H) && defined (HAVE_BTOWC)
+/* We use "btowc" as a sentinel to detect functioning wchar_t support.
+   We check for either __STDC_ISO_10646__ or a new-enough libiconv in
+   order to ensure we can convert to and from wchar_t.  We choose
+   libiconv version 0x10D because it was reported that earlier
+   versions do not always accept "wchar_t" as an encoding
+   argument.  */
+#if defined (HAVE_ICONV) && defined (HAVE_WCHAR_H) && defined (HAVE_BTOWC) \
+  && (defined (__STDC_ISO_10646__) \
+      || (defined (_LIBICONV_VERSION) && _LIBICONV_VERSION >= 0x10D))
 
 #include <wchar.h>
 #include <wctype.h>
@@ -63,7 +73,32 @@ typedef wint_t gdb_wint_t;
 
 #define LCST(X) L ## X
 
+/* If __STDC_ISO_10646__ is defined, then the host wchar_t is UCS-4.
+   We exploit this fact in the hope that there are hosts that define
+   this but which do not support "wchar_t" as an encoding argument to
+   iconv_open.  We put the endianness into the encoding name to avoid
+   hosts that emit a BOM when the unadorned name is used.  */
+#if defined (__STDC_ISO_10646__)
+#if WORDS_BIGENDIAN
+#define INTERMEDIATE_ENCODING "UCS-4BE"
 #else
+#define INTERMEDIATE_ENCODING "UCS-4LE"
+#endif
+#elif defined (_LIBICONV_VERSION) && _LIBICONV_VERSION >= 0x10D
+#define INTERMEDIATE_ENCODING "wchar_t"
+#else
+/* This shouldn't happen, because the earlier #if should have filtered
+   out this case.  */
+#error "Neither __STDC_ISO_10646__ nor _LIBICONV_VERSION defined"
+#endif
+
+#else
+
+/* If we got here and have wchar_t support, we might be on a system
+   with some problem.  So, we just disable everything.  */
+#if !(defined (HAVE_WCHAR_H) && defined (HAVE_BTOWC))
+#define PHONY_ICONV
+#endif
 
 typedef char gdb_wchar_t;
 typedef int gdb_wint_t;
@@ -80,8 +115,9 @@ typedef int gdb_wint_t;
    narrow encoding as our intermediate encoding.  However, if we are
    also providing a phony iconv, we might as well just stick with
    "wchar_t".  */
-#ifndef PHONY_ICONV
-#undef INTERMEDIATE_ENCODING
+#ifdef PHONY_ICONV
+#define INTERMEDIATE_ENCODING "wchar_t"
+#else
 #define INTERMEDIATE_ENCODING host_charset ()
 #endif
 



More information about the Gdb-patches mailing list