Bug 29315 - Completion warning: could not convert ... from the host encoding (ANSI_X3.4-1968) to UTF-32
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: cli
Version: HEAD
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2022-07-01 15:09 UTC by Maciej W. Rozycki
Modified: 2023-10-05 08:55 UTC

Description Maciej W. Rozycki 2022-07-01 15:09:28 UTC
Following the breakage from PR cli/29314, when debugging `cc1' from GCC
12, which has a huge number of template class functions, e.g.:

hash_table<action_record_hasher, false, xcallocator>::expand()
hash_table<action_record_hasher, false, xcallocator>::find_slot_with_hash(action_record* const&, unsigned int, insert_option)
hash_table<action_record_hasher, false, xcallocator>::~hash_table()
hash_table<addr_hasher, false, xcallocator>::expand()
[1100+ entries follow]

in some cases, if you try to continue completion after GDB has produced
an incorrect completion, a warning is issued as follows:

(gdb) break hash_tab le<<Tab>
(gdb) break hash_tab le<warning: could not convert 'hash_tab le<' from the host encoding (ANSI_X3.4-1968) to UTF-32.
This normally should not happen, please file a bug report.

and GDB hangs for a couple of minutes before silently returning to the
prompt (i.e. the prompt is not redisplayed unless e.g. ^U is entered at
this point).

This actually asks for a bug report, so here you go.

I can reproduce it with:

$ gdb /path/to/cc1

(having compiled `cc1' with debug info of course) on a native
RISC-V/Linux system, specifically the SiFive Freedom U SDK Linux
distribution provided with the HiFive Unmatched development system:

$ cat /etc/os-release
ID=nodistro
NAME="FreedomUSDK"
VERSION="2021.03.01 (2021March)"
VERSION_ID=2021.03.01
PRETTY_NAME="FreedomUSDK 2021.03.01 (2021March)"
$ 

I have no idea why a warning pops up from gdb/ada-lang.c for an
executable that contains no Ada code:

(gdb) show language
The current source language is "auto; currently c++".
(gdb) 

I may be able to provide further details later.
Comment 1 Patrick Monnerat 2022-09-03 18:56:05 UTC
I have a similar problem, though not exactly the same:
it occurs in a mingw gdb compiled under Cygwin.

I've investigated a bit and found it appeared in commit https://sourceware.org/git/?p=binutils-gdb.git;a=commit;f=gdb/ada-lang.c;h=315e4ebb4b7ef01da2f5c419edc74f39a0122d20 that tries to convert to a 32-bit character set.

In the mingw context, the current preprocessor conditionals cause libiconv to be replaced by a minimal phony_iconv that does not support such conversions. These conditionals were implemented to make sure libiconv implements UTF32; they fail on mingw because wchar_t is 16 bits wide and the installed libiconv is not Bruno Haible's. See PHONY_ICONV in gdb/gdb_wchar.h.

As there is no safe way to determine whether UTF32 is supported in cross-compilations, and since all libiconv implementations less than ~10 years old have it, I'd be tempted to assume it's always present and drop some conditionals.

@maciej: is your context similar (no Haible's libiconv or too old and no ISO-10646 support) ?
@tom: as you're the commit's author, what is your opinion ?
Comment 2 Tom Tromey 2022-09-03 21:15:56 UTC
(In reply to Patrick Monnerat from comment #1)

> In the mingw context, current code conditionals cause libiconv to be
> replaced by a minimal phony_iconv that does not support such conversions.
> These conditionals were implemented to make sure libiconv implements UTF32
> and fail on mingw because the wchar_t is 16-bit wide and the installed
> libiconv is not the Bruno Haible's one. See PHONY_ICONV in gdb/gdb_wchar.h.

Are there any defines in the mingw libiconv headers that could be used
to detect this situation?  E.g., for Haible's we use _LIBICONV_VERSION.
Anyway I'd be curious to know exactly which situation makes gdb decide
not to use iconv in your configuration.

> @maciej: is your context similar (no Haible's libiconv or too old and no
> ISO-10646 support) ?

Ideally this path shouldn't even be taken.  However it may be hard
to avoid due to a combination of factors including multi-language searches.

One question I had is whether the required iconv converters are installed.
It sounds like he's using glibc...

> @tom: as you're the commit's author, what is your opinion ?

In the past we couldn't assume host iconv was any good.  Solaris in
particular was broken (I forget how).  I suppose I'm ok with changing
this decision though it might be good to find out how Solaris is doing.
Comment 3 Patrick Monnerat 2022-09-03 23:29:02 UTC
(In reply to Tom Tromey from comment #2)

Thanks for your reply.

> Are there any defines in the mingw libiconv headers that could be used
> to detect this situation?
Not at all :-( I already checked it.
It only declares iconv_t, iconv_open(), iconv_close() and iconv().
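
Because the header offers nothing beyond these three functions, the only way left to detect UTF-32 support on such a host is to call them. A minimal sketch of such a run-time probe follows. `host_iconv_supports` is a hypothetical helper name, not something that exists in GDB, and the charset names passed to it are assumptions: spellings accepted by iconv_open() vary between implementations.

```c
#include <iconv.h>

/* Hypothetical helper, not part of GDB: return 1 if the host iconv
   can open a converter from charset FROM to charset TO, 0 otherwise.
   Only iconv_open() and iconv_close() are used, so no version macro
   is needed.  */
int
host_iconv_supports (const char *to, const char *from)
{
  iconv_t cd = iconv_open (to, from);

  /* iconv_open() reports an unsupported conversion with (iconv_t) -1.  */
  if (cd == (iconv_t) -1)
    return 0;

  iconv_close (cd);
  return 1;
}
```

A caller could probe, say, `host_iconv_supports ("UTF-32LE", "ASCII")` once at startup and fall back to the phony implementation when it returns 0. On hosts where iconv lives in a separate library, the program would also need to link against it.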

> Anyway I'd be curious to know exactly which situation makes gdb decide
> not to use iconv in your configuration.
HAVE_ICONV defined
HAVE_BTOWC defined
__STDC_ISO_10646__ undefined   (sizeof(wchar_t) = 2)
_LIBICONV_VERSION undefined

Annotated lines from gdb/gdb_wchar.h

#if defined (HAVE_ICONV)             ; defined, so iconv.h is included.

#if defined (HAVE_ICONV) && defined (HAVE_BTOWC) \
  && (defined (__STDC_ISO_10646__) \
      || (defined (_LIBICONV_VERSION) && _LIBICONV_VERSION >= 0x108))
                                     ; evaluates to false


#else

/* If we got here and have wchar_t support, we might be on a system
   with some problem.  So, we just disable everything.  */
#if defined (HAVE_BTOWC)
#define PHONY_ICONV                    ; <---- decision made here.
#endif

> One question I had is whether the required iconv converters are installed.
> It sounds like he's using glibc...
Of course, if Maciej has no libiconv at all, it'll always use phony iconv !
Glibc iconv.h does not define _LIBICONV_VERSION: if he uses it and sizeof(wchar_t) == 2, then phony is also enabled.
The problem does not exist on Linux as sizeof(wchar_t) == 4.
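
To see which branch a given host would take, the condition annotated above can be mirrored in a small function. This is only a sketch: `would_use_real_iconv` is a hypothetical name, and HAVE_ICONV and HAVE_BTOWC, which come from GDB's configure step, are passed in as plain arguments here because a standalone file has no config.h.

```c
#include <iconv.h>   /* GNU libiconv defines _LIBICONV_VERSION here */
#include <wchar.h>

/* Hypothetical helper mirroring the test in gdb/gdb_wchar.h.
   HAVE_ICONV and HAVE_BTOWC stand in for the configure results.
   Returns 1 if the real iconv would be selected, 0 otherwise.  */
int
would_use_real_iconv (int have_iconv, int have_btowc)
{
#if defined (__STDC_ISO_10646__) \
    || (defined (_LIBICONV_VERSION) && _LIBICONV_VERSION >= 0x108)
  int wide_ok = 1;   /* wchar_t is ISO 10646, or libiconv is recent */
#else
  int wide_ok = 0;   /* neither macro is available on this host */
#endif

  return have_iconv && have_btowc && wide_ok;
}
```

With the values reported above for mingw (both configure macros defined, neither __STDC_ISO_10646__ nor _LIBICONV_VERSION defined), the function returns 0, which corresponds to the PHONY_ICONV case.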

> In the past we couldn't assume host iconv was any good.  Solaris in
> particular was broken (I forget how).  I suppose I'm ok with changing
> this decision though it might be good to find out how Solaris is doing.
I don't have a Solaris platform here for tests.
Alternative solutions are:
- A config parameter for iconv UTF32 support. If not provided, try to determine it via AC_TRY_RUN and assume not supported in cross-builds.
- Determine libiconv usability at run-time and switch to phony if needed.
- Extend phony iconv to support conversions to UTF32.
- Rework conversions in gdb/ada-lang.c, if feasible.
- In ada_fold_name(), pre-check for non-ascii codes like it is done in ada_encode_1() and do not convert if none.
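
The last alternative can be sketched as follows. This is an illustration of the idea only, not the actual code of ada_fold_name() or ada_encode_1(); `is_pure_ascii` is a hypothetical helper name.

```c
/* Hypothetical helper, not taken from gdb/ada-lang.c: return 1 if the
   NUL-terminated string S contains only 7-bit ASCII bytes, 0 if any
   byte has its high bit set.  */
int
is_pure_ascii (const char *s)
{
  const unsigned char *p;

  for (p = (const unsigned char *) s; *p != '\0'; ++p)
    if (*p >= 0x80)
      return 0;

  return 1;
}
```

A caller would skip the conversion to UTF-32 whenever this returns 1, so an input such as 'hash_tab le<' from the original report would never reach iconv.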
Comment 4 Patrick Monnerat 2022-09-06 13:03:54 UTC
@Maciej: Here is a dirty workaround you may try:

1) Before configuring gdb, make sure a usable libiconv supporting UTF32-LE is installed for development.
2) If it is not Haible's implementation of libiconv, add
       -D_LIBICONV_VERSION=0x108
   to CFLAGS and CXXFLAGS.

This works here for mingw.
Comment 5 Robert French 2023-10-01 17:11:58 UTC
> I don't have a Solaris platform here for tests.

@Patrick I can make one available to you if that would help! I am still seeing this issue in gdb 13.2 on OmniOS (illumos) and have heard about it on OpenIndiana and Oracle Solaris 11.4 as well.
Comment 6 Patrick Monnerat 2023-10-01 17:59:27 UTC
> I can make one available to you if that would help!

Thank you very much for the offer, but I don't think I will investigate this further: it takes me very far from the Insight GUI, and the initial simple patch I submitted was not welcomed.
Since then, I have been successfully using the mingw workaround I suggested in comment 4.

> I am still seeing this issue in gdb 13.2 on OmniOS (illumos) and have heard about it on OpenIndiana and Oracle Solaris 11.4 as well.

This reinforces my view that it should be resolved by someone who knows these OSes well.

Thanks a lot anyway!
Comment 7 Paul Floyd 2023-10-05 08:55:57 UTC
Valgrind maintainer hat on.

This issue causes several failures in the valgrind regression tests on illumos (I have an OpenIndiana Hipster VM that I use for testing). From the sound of it, OmniOS and Solaris 11.4 are affected as well.

I'll try to see what the status is with iconv.