This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] wcsmbs: Avoid escaped character literals in <wchar.h>
- From: Florian Weimer <fw at deneb dot enyo dot de>
- To: Andreas Schwab <schwab at suse dot de>
- Cc: libc-alpha at sourceware dot org
- Date: Mon, 17 Feb 2020 13:47:22 +0100
- Subject: Re: [PATCH] wcsmbs: Avoid escaped character literals in <wchar.h>
- References: <8736b9scqm.fsf@oldenburg2.str.redhat.com> <mvm5zg577f2.fsf@suse.de>
* Andreas Schwab:
> On Feb 17 2020, Florian Weimer wrote:
>
>> They confuse scripts/conformtest.py because it treats the L and the
>
> s/scripts/conform/
>
>> x7f as namespace-violating identifiers.
>
> Can the script be fixed not to do that?
Like this?
A more elaborate alternative would be to use Zack's C tokenizer in the
conform tests, but I don't know if its feature set is aligned with
what we need in the conform tests.
Subject: Add wide and character literal support to conform/conformtest.py
Without this change, tokens such as L'\x7f' are recognized as the
identifiers L and x7f, which are not in the implementation namespace and
therefore trigger failures.
diff --git a/conform/conformtest.py b/conform/conformtest.py
index 951e3b2420..3bdc2a8e57 100644
--- a/conform/conformtest.py
+++ b/conform/conformtest.py
@@ -638,7 +638,7 @@ class HeaderTests(object):
# constants, and hex floats may be wrongly split into
# tokens including identifiers, but this is sufficient
# in practice and matches the old perl script.
- line = re.sub(r'"[^"]*"', '', line)
+ line = re.sub(r'(?:\bL)?(?:"[^"]*"|\'[^\']*\')', '', line)
line = line.strip()
for token in re.split(r'[^A-Za-z0-9_]+', line):
if re.match(r'[A-Za-z_]', token):
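
For reference, the effect of the changed substitution can be checked
outside the test suite. The snippet below is only a sketch: the helper
name identifiers() and the sample line are made up for illustration and
are not part of conformtest.py; only the two patterns and the
split/match steps are taken from the hunk above.

```python
import re

# Patterns copied from the '-' and '+' lines of the hunk.
OLD_STRIP = r'"[^"]*"'
NEW_STRIP = r'(?:\bL)?(?:"[^"]*"|\'[^\']*\')'


def identifiers(line, strip_pattern):
    """Return the identifier-like tokens left after stripping literals.

    Hypothetical helper: it mirrors the strip/split/match steps shown
    in the hunk, but is not the actual conformtest.py code.
    """
    line = re.sub(strip_pattern, '', line).strip()
    return [token for token in re.split(r'[^A-Za-z0-9_]+', line)
            if re.match(r'[A-Za-z_]', token)]


# Made-up input line, not copied from <wchar.h>.
sample = r"return __c >= L'\0' && __c <= L'\x7f';"

old_tokens = identifiers(sample, OLD_STRIP)
new_tokens = identifiers(sample, NEW_STRIP)
print(old_tokens)  # L and x7f show up as stray identifiers
print(new_tokens)  # only the real identifiers remain
```

Note that the character class [^'] stops at the first single quote, so a
literal containing an escaped quote, such as '\'', is not covered by this
pattern.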