Bug 29642 - `regcomp` with multiple adjacent plus signs exhausts memory quickly
Status: UNCONFIRMED
Alias: None
Product: glibc
Classification: Unclassified
Component: regex
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2022-10-02 07:10 UTC by jy l
Modified: 2023-08-24 15:05 UTC

Attachments
regex DOS poc (216 bytes, text/plain)
2022-10-02 07:12 UTC, jy l

Description jy l 2022-10-02 07:10:41 UTC
Hi! We found that, with the latest pull, memory is exhausted very quickly when `regcomp` with `REG_EXTENDED` compiles a pattern containing multiple adjacent '+' characters, such as "1*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++".
The cause appears to be that `duplicate_tree` calls `create_token_tree` an exponentially growing number of times, and each call allocates memory with malloc, so this looks like an easy way to cause a serious DoS.
The regex specification says that "multiple adjacent duplication symbols ( '+', '*', '?', and intervals) produces undefined results.", and other regex implementations seem to handle this case, so maybe glibc needs to handle it too?
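Until this is handled in the library, a caller can reject such patterns before they reach `regcomp`. The following is only a minimal sketch of that idea, not glibc code and not a proposed patch: `interval_len`, `bracket_len`, `has_adjacent_duplication` and `checked_regcomp` are hypothetical names, and the scan is deliberately conservative (it flags any two consecutive duplication symbols outside bracket expressions, and only recognizes well-formed `{m}`, `{m,}`, `{m,n}` intervals).

```c
#include <ctype.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Length of a well-formed interval "{m}", "{m,}" or "{m,n}" at s, else 0. */
static size_t interval_len(const char *s)
{
    size_t i = 0;
    if (s[i] != '{')
        return 0;
    i++;
    if (!isdigit((unsigned char)s[i]))
        return 0;
    while (isdigit((unsigned char)s[i]))
        i++;
    if (s[i] == ',') {
        i++;
        while (isdigit((unsigned char)s[i]))
            i++;
    }
    return s[i] == '}' ? i + 1 : 0;
}

/* Length of the bracket expression starting at s[0] == '[', including the
   closing ']'. If it is unterminated, the rest of the string is consumed. */
static size_t bracket_len(const char *s)
{
    size_t i = 1;
    if (s[i] == '^')
        i++;
    if (s[i] == ']')                 /* a leading ']' is a literal member */
        i++;
    while (s[i] != '\0' && s[i] != ']') {
        if (s[i] == '[' && (s[i + 1] == ':' || s[i + 1] == '.' || s[i + 1] == '=')) {
            char kind = s[i + 1];    /* skip "[:class:]", "[.x.]", "[=x=]" */
            i += 2;
            while (s[i] != '\0' && !(s[i] == kind && s[i + 1] == ']'))
                i++;
            if (s[i] != '\0')
                i += 2;
        } else {
            i++;
        }
    }
    return s[i] == ']' ? i + 1 : i;
}

/* Hypothetical helper: true if two duplication symbols ('+', '*', '?' or an
   interval) appear back to back outside a bracket expression. */
static bool has_adjacent_duplication(const char *pattern)
{
    bool prev_dup = false;
    size_t i = 0;

    while (pattern[i] != '\0') {
        char c = pattern[i];
        bool is_dup = false;
        size_t n;

        if (c == '\\' && pattern[i + 1] != '\0') {
            n = 2;                   /* escaped character is an ordinary atom */
        } else if (c == '[') {
            n = bracket_len(pattern + i);
        } else if (c == '*' || c == '+' || c == '?') {
            n = 1;
            is_dup = true;
        } else if ((n = interval_len(pattern + i)) > 0) {
            is_dup = true;
        } else {
            n = 1;
        }

        if (is_dup && prev_dup)
            return true;
        prev_dup = is_dup;
        i += n;
    }
    return false;
}

/* Hypothetical wrapper: refuse the pattern with REG_BADRPT instead of
   handing it to regcomp. Only meaningful for REG_EXTENDED patterns. */
static int checked_regcomp(regex_t *re, const char *pattern, int cflags)
{
    if ((cflags & REG_EXTENDED) && has_adjacent_duplication(pattern))
        return REG_BADRPT;
    return regcomp(re, pattern, cflags);
}
```

With this check in place, a pattern like "1*++" is rejected up front, while "(a+)+" or "[+*]+" is still passed through to `regcomp`. Nested repetition through groups is not covered by this sketch.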
Comment 1 jy l 2022-10-02 07:12:37 UTC
Created attachment 14373
regex DOS poc

Please be cautious when running it, since it may exhaust all memory within a few seconds.
Comment 2 Jonathan Wakely 2023-08-24 15:01:49 UTC
Looks like a dup of PR 28864
Comment 3 Jonathan Wakely 2023-08-24 15:05:45 UTC
Which seems to be a dup of PR 20095