## Reproduce Procedures

```
gdb -tui main
```

Then run `set extended-prompt \w \f:\t\n❯ `.

After each character is typed, the prompt string becomes garbled. It looks like non-ASCII characters cannot be displayed correctly in TUI mode.

OS: 5.15.11
(gdb) set extended-prompt \w \f:\t\n❯
Python Exception <type 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character u'\u276f' in position 10: ordinal not in range(128)
Andreas,

The issue you are seeing is slightly different. I'm guessing you have gdb.prompt_hook set. This ends up calling gdbpy_before_prompt_hook in python.c.

If we assume Python 3 for a moment, then in this function we convert the prompt to a Unicode object, assuming UTF-8 encoding. This Unicode object is then passed to the user's Python code. If the user returns the same prompt unchanged, or even some other UTF-8 encoded prompt string, we then convert that string back to bytes using the host_charset.

From the error message you see, it would appear your host charset is maybe 'ascii'? I'm guessing it's certainly not UTF-8. You could try: 'set host-charset UTF8' and see if the problem is resolved.

The asymmetry in our use of different Unicode encodings seems like a bad thing to me. I wonder if we should just settle on one particular scheme, maybe UTF-8, for cases like this?

However, we should probably spin this conversation off into a separate bug, as this is different from the original bug about Unicode within the TUI.
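To make the asymmetry concrete, here is a minimal Python 3 sketch of the round trip described above. It is not GDB's code: `round_trip` and `try_round_trip` are hypothetical names, and the real conversion happens in C++ inside gdbpy_before_prompt_hook. It only shows that decoding as UTF-8 and then encoding with an 'ascii' host charset fails for a prompt containing U+276F.

```python
# Illustration only: mimics the decode/encode steps, not GDB internals.
PROMPT_BYTES = "\u276f ".encode("utf-8")  # the prompt as raw bytes


def round_trip(prompt_bytes, host_charset):
    """Decode the prompt as UTF-8, then encode it with the host charset."""
    text = prompt_bytes.decode("utf-8")  # step 1: always assumes UTF-8
    # A prompt hook would run here and might return the text unchanged.
    return text.encode(host_charset)     # step 2: uses the host charset


def try_round_trip(prompt_bytes, host_charset):
    """Like round_trip, but report an encoding failure as a string."""
    try:
        return round_trip(prompt_bytes, host_charset)
    except UnicodeEncodeError as exc:
        return f"failed: {exc.reason}"


if __name__ == "__main__":
    print(try_round_trip(PROMPT_BYTES, "utf-8"))  # round trip succeeds
    print(try_round_trip(PROMPT_BYTES, "ascii"))  # encoding step fails
```

With a UTF-8 host charset the bytes come back unchanged; with 'ascii' the second step raises the same kind of UnicodeEncodeError as in the report.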
The prompt is printed by tui_puts_internal, which outputs every byte in the string individually.

As a demonstrator patch, this makes tui_puts_internal behave more like tui_puts, that is, output entire strings:
...
diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index a1eadcd937d..5cc26f02174 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -521,8 +521,23 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
   int prev_col = 0;
   bool saw_nl = false;
 
-  while ((c = *string++) != 0)
+  while (true)
     {
+      const char *next = strpbrk (string, "\n\1\2\033\t");
+
+      /* Print the plain text prefix. */
+      size_t n_chars = next == nullptr ? strlen (string) : next - string;
+      if (n_chars > 0)
+        waddnstr (w, string, n_chars);
+
+      /* We finished. */
+      if (next == nullptr)
+        break;
+
+      c = *next;
+      if (c == 0)
+        break;
+
       if (c == '\n')
         saw_nl = true;
 
@@ -530,6 +545,7 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
         {
           /* Ignore these, they are readline escape-marking
              sequences. */
+          ++next;
         }
       else
         {
@@ -538,10 +554,12 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
               size_t bytes_read = apply_ansi_escape (w, string - 1);
               if (bytes_read > 0)
                 {
-                  string = string + bytes_read - 1;
+                  next = next + bytes_read - 1;
                   continue;
                 }
             }
+          else
+            next++;
           do_tui_putc (w, c);
 
           if (height != nullptr)
@@ -552,6 +570,8 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
               prev_col = col;
             }
         }
+
+      string = next;
     }
   if (TUI_CMD_WIN != nullptr && w == TUI_CMD_WIN->handle.get ())
     update_cmdwin_start_line ();
...
With this, I can see that the behaviour is now correct:
...
└────────────────────────────────────────────────────────────────┘
None No process In: ?? PC: ??
/data/vries/gdb <no frame>:<no attribute num on current thread>
❯
...

I'm not sure yet whether this is a proper fix. I suspect a proper fix will involve accumulating bytes using mbrtowc or some such.
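As a rough illustration of why per-byte output breaks a multi-byte prompt, and of the two remedies mentioned above, here is a small Python 3 sketch. It is not GDB or curses code, and all function names are made up for the example. `plain_runs` mirrors the strpbrk-style split into plain-text runs and special bytes, and `decode_accumulating` uses Python's incremental UTF-8 decoder as a stand-in for mbrtowc-style accumulation.

```python
import codecs

# Bytes the patch treats specially: newline, readline markers, ESC, tab.
SPECIAL = b"\n\x01\x02\x1b\t"


def decode_bytewise(data):
    """Treat every byte as a character of its own, like a per-byte putc loop."""
    return "".join(
        data[i:i + 1].decode("utf-8", errors="replace")
        for i in range(len(data))
    )


def decode_accumulating(data):
    """Feed one byte at a time, but emit only complete characters."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    out = [decoder.decode(data[i:i + 1]) for i in range(len(data))]
    out.append(decoder.decode(b"", final=True))
    return "".join(out)


def plain_runs(data):
    """Split data into plain-text runs and single special bytes."""
    runs, start = [], 0
    for i, byte in enumerate(data):
        if byte in SPECIAL:
            if i > start:
                runs.append(data[start:i])  # plain text before the special byte
            runs.append(data[i:i + 1])      # the special byte itself
            start = i + 1
    if start < len(data):
        runs.append(data[start:])           # trailing plain text
    return runs


if __name__ == "__main__":
    prompt = "<no frame>\n\u276f ".encode("utf-8")
    print(repr(decode_bytewise(prompt)))      # U+276F becomes replacement chars
    print(repr(decode_accumulating(prompt)))  # U+276F survives intact
    print(plain_runs(prompt))                 # multi-byte char stays in one run
```

Splitting only at the special bytes keeps each multi-byte character inside a single run, which is why writing whole runs avoids the corruption. Accumulation gives the same result while still processing input one byte at a time.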
Created attachment 14907
Tentative patch
Submitted patch: https://sourceware.org/pipermail/gdb-patches/2023-May/199880.html