This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: support C/C++ identifiers named with non-ASCII characters
> From: <Paul.Koning@dell.com>
> CC: <simark@simark.ca>, <zjz@zjz.name>, <gdb-patches@sourceware.org>
> Date: Mon, 21 May 2018 18:03:17 +0000
>
> > Is it a fact that non-ASCII identifiers must be encoded in UTF-8, and
> > can not include invalid UTF-8 sequences?
>
> Encoding is an I/O question.
Not necessarily.
I asked that question because scanning a string for certain ASCII
characters using a 'char *' pointer will only work reliably if the
string is in UTF-8 or in some single-byte encoding.  In other
multibyte encodings, a byte inside a multibyte sequence can have the
same value as an ASCII delimiter, so we might find false hits for the
delimiters on bytes that are actually parts of multibyte sequences.
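
To illustrate the kind of false hit I mean, here is a minimal sketch.
It is not GDB code: find_delim and check_scans are hypothetical names,
and Shift_JIS is only my choice of an example multibyte encoding in
which the second byte of a character can fall in the ASCII range
(KATAKANA LETTER SO is encoded there as 0x83 0x5C, and 0x5C is the
backslash).  In UTF-8 every byte of a multibyte sequence is 0x80 or
above, so the same byte-wise scan cannot hit inside a character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GDB: return the offset of the first
   byte in S equal to DELIM, or -1 if there is none.  This is a plain
   byte-wise scan through a 'char *', with no knowledge of the
   encoding.  */
static long
find_delim (const char *s, char delim)
{
  const char *p = strchr (s, delim);
  return p == NULL ? -1 : (long) (p - s);
}

/* The identifier 'a', KATAKANA LETTER SO (U+30BD), 'b' in UTF-8.
   The literal is split so that 'b' is not parsed as part of the hex
   escape.  */
static const char utf8_ident[] = "a\xe3\x82\xbd" "b";

/* The same identifier in Shift_JIS (assumed here only as an example
   encoding): the second byte of the character is 0x5C.  */
static const char sjis_ident[] = "a\x83\x5c" "b";

/* Check both scans for a backslash delimiter.  */
void
check_scans (void)
{
  /* UTF-8: no byte of the multibyte sequence is below 0x80, so the
     scan finds nothing.  */
  assert (find_delim (utf8_ident, '\\') == -1);

  /* Shift_JIS: the scan reports a backslash at offset 2, which is
     really the trail byte of the katakana character.  */
  assert (find_delim (sjis_ident, '\\') == 2);
}
```

Neither string contains a real backslash, yet the scan of the
Shift_JIS string reports one; that is the false hit.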