This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
[Bug runtime/15065] regular expressions: subexpression capture support
- From: "serhei.public at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: systemtap at sourceware dot org
- Date: Thu, 03 Aug 2017 16:31:04 +0000
- Subject: [Bug runtime/15065] regular expressions: subexpression capture support
- Auto-submitted: auto-generated
- References: <bug-15065-6586@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=15065
Serhei Makarov <serhei.public at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |serhei.public at gmail dot com
--- Comment #1 from Serhei Makarov <serhei.public at gmail dot com> ---
Just a heads up that I'm working on this feature; my current code is at
https://github.com/serhei/stap-experiments/commits/serhei/tnfa. The
implementation is based on Laurikari's TNFA (tagged NFA) algorithm.
The testsuite
(https://github.com/serhei/stap-experiments/blob/serhei/tnfa/testsuite/runok/regex_grouping.stp)
needs to be a lot more complete before I'm willing to trust this feature.
Current results look like this:
regex PASS: #1: aaa =~ a* with 1 groups 'aaa'
regex FAIL (grouping): #2: abab =~ (ab)* with 2 groups '' ''
regex PASS: #3: cabab =~ c(ab)* with 2 groups 'cabab' 'ab'
regex PASS: #4: aaa =~ (a*)a*a with 2 groups 'aaa' 'aa'
regex PASS: #5: regex =~ re(gex) with 2 groups 'regex' 'gex'
regex PASS: #6: longer =~ (long|longer) with 2 groups 'longer' 'longer'
regex PASS: #7: unrelated !~ regex
regex PASS: #8: \ =~ \\ with 1 groups '\'
regex PASS: #9: xabcy =~ abc with 1 groups 'abc'
regex PASS: #10: abbbbc =~ ab*bc with 1 groups 'abbbbc'
regex PASS: #11: abbc =~ ab?bc with 1 groups 'abbc'
regex PASS: #12: abcc !~ ^abc$
regex PASS: #13: abd !~ a[b-d]e
regex PASS: #14: ace =~ a[b-d]e with 1 groups 'ace'
regex PASS: #15: ab =~ a\(*b with 1 groups 'ab'
regex PASS: #16: a((b =~ a\(*b with 1 groups 'a((b'
regex PASS: #17: ab =~ (a+|b)* with 2 groups 'ab' 'b'
regex PASS: #18: ab =~ (a+|b)+ with 2 groups 'ab' 'b'
regex PASS: #19: abbbcd =~ ([abc])*d with 2 groups 'abbbcd' 'c'
regex PASS: #20: abcde !~ ^(ab|cd)e
regex PASS: #21: abcde =~ (ab|cd)e with 2 groups 'cde' 'cd'
regex PASS: #22: abcde =~ (ab|cd)e$ with 2 groups 'cde' 'cd'
regex PASS: #23: alpha =~ [A-Za-z_][A-Za-z0-9_]* with 1 groups 'alpha'
regex PASS: #24: ij =~ (bc+d$|ef*g.|h?i(j|k)) with 3 groups 'ij' 'ij' 'j'
regex PASS: #25: effg !~ (bc+d$|ef*g.|h?i(j|k))
regex PASS: #26: 00effg12 =~ (bc+d$|ef*g.|h?i(j|k)) with 3 groups 'effg1' 'effg1' ''
regex PASS: #27: bcccd =~ (bc+d$|ef*g.|h?i(j|k)) with 3 groups 'bcccd' 'bcccd' ''
regex PASS: #28: a =~ (((((((((a))))))))) with 10 groups 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a'
regex PASS: #29: (.*)\) !~ \((.*),
regex PASS: #30: ab !~ [k]
regex PASS: #31: abcd =~ abcd with 1 groups 'abcd'
regex PASS: #32: abcd =~ a(bc)d with 2 groups 'abcd' 'bc'
regex total PASS: 31, FAIL: 1
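
For readers who want to cross-check a few of these group results outside of
SystemTap, here is a minimal sketch using Python's standard `re` module. This
is not the TNFA code from the branch above: `re` is a backtracking engine with
leftmost-first alternation, so it is only a rough reference point. The helper
name `capture_groups` and the `CASES` table are made up for illustration, and
reporting unset groups as '' is an assumption chosen to mirror the listing.

```python
import re

def capture_groups(pattern, subject):
    """Return [whole match, group 1, ...] for the first match, or None.

    Groups that did not participate in the match are reported as ''
    (assumption: this mirrors how the listing prints them).
    """
    m = re.search(pattern, subject)
    if m is None:
        return None
    return [m.group(i) or "" for i in range(m.re.groups + 1)]

# (subject, pattern, groups as shown in the listing; None stands for "!~")
CASES = [
    ("abab", r"(ab)*", ["", ""]),                    # test #2, FAIL (grouping)
    ("cabab", r"c(ab)*", ["cabab", "ab"]),           # test #3
    ("aaa", r"(a*)a*a", ["aaa", "aa"]),              # test #4
    ("longer", r"(long|longer)", ["longer", "longer"]),  # test #6
    ("abbbcd", r"([abc])*d", ["abbbcd", "c"]),       # test #19
    ("00effg12", r"(bc+d$|ef*g.|h?i(j|k))", ["effg1", "effg1", ""]),  # test #26
    ("unrelated", r"regex", None),                   # test #7
]

for subject, pattern, listed in CASES:
    got = capture_groups(pattern, subject)
    note = "same" if got == listed else "differs"
    print(f"{subject!r} =~ {pattern!r}: re gives {got}, listing shows {listed} ({note})")
```

Two of these cases differ. For #2, `re` reports 'abab' and 'ab', which is
presumably what the failing test should produce, though that is an assumption
here. For #6, `re` reports 'long' because it takes the first alternative that
matches, whereas the listing shows 'longer', which is consistent with
POSIX-style leftmost-longest matching.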