The rust compiler is going to start emitting the char type using DW_ATE_UTF. See https://github.com/rust-lang/rust/pull/89887 This PR tracks this for the 11.x branch so that we can backport the patch.
The master branch has been updated by Tom Tromey <tromey@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=1c0e43634cfdd0ad7ef9ac3dd7d208dddeb80f5e commit 1c0e43634cfdd0ad7ef9ac3dd7d208dddeb80f5e Author: Tom Tromey <tom@tromey.com> Date: Sun Oct 31 10:34:50 2021 -0600 Allow DW_ATE_UTF for Rust characters The Rust compiler plans to change the encoding of a Rust 'char' type to use DW_ATE_UTF. You can see the discussion here: https://github.com/rust-lang/rust/pull/89887 However, this fails in gdb. I looked into this, and it turns out that the handling of DW_ATE_UTF is currently fairly specific to C++. In particular, the code here assumes the C++ type names, and it creates an integer type. This comes from commit 53e710acd ("GDB thinks char16_t and char32_t are signed in C++"). The message says: Both places need fixing. But since I couldn't tell why dwarf2read.c needs to create a new type, I've made it use the per-arch built-in types instead, so that the types are only created once per arch instead of once per objfile. That seems to work fine. ... which is fine, but it seems to me that it's also correct to make a new character type; and this approach is better because it preserves the type name as well. This does use more memory, but first we shouldn't be too concerned about the memory use of types coming from debuginfo; and second, if we are, we should implement type interning anyway. Changing this code to use a character type revealed a couple of oddities in the C/C++ handling of TYPE_CODE_CHAR. This patch fixes these as well. I filed PR rust/28637 for this issue, so that this patch can be backported to the gdb 11 branch.
The gdb-11-branch branch has been updated by Tom Tromey <tromey@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=29b161c9be240da341910f0206ffdd881daacd96 commit 29b161c9be240da341910f0206ffdd881daacd96 Author: Tom Tromey <tom@tromey.com> Date: Sun Oct 31 10:34:50 2021 -0600 Allow DW_ATE_UTF for Rust characters The Rust compiler plans to change the encoding of a Rust 'char' type to use DW_ATE_UTF. You can see the discussion here: https://github.com/rust-lang/rust/pull/89887 However, this fails in gdb. I looked into this, and it turns out that the handling of DW_ATE_UTF is currently fairly specific to C++. In particular, the code here assumes the C++ type names, and it creates an integer type. This comes from commit 53e710acd ("GDB thinks char16_t and char32_t are signed in C++"). The message says: Both places need fixing. But since I couldn't tell why dwarf2read.c needs to create a new type, I've made it use the per-arch built-in types instead, so that the types are only created once per arch instead of once per objfile. That seems to work fine. ... which is fine, but it seems to me that it's also correct to make a new character type; and this approach is better because it preserves the type name as well. This does use more memory, but first we shouldn't be too concerned about the memory use of types coming from debuginfo; and second, if we are, we should implement type interning anyway. Changing this code to use a character type revealed a couple of oddities in the C/C++ handling of TYPE_CODE_CHAR. This patch fixes these as well. I filed PR rust/28637 for this issue, so that this patch can be backported to the gdb 11 branch. (cherry picked from commit 1c0e43634cfdd0ad7ef9ac3dd7d208dddeb80f5e)
Fixed now.