- Type: MMF
- Status: Resolved
- Priority: Major
- Resolution: Fixed
WHY
As a developer, I know that regular expressions are powerful and can help solve tricky problems elegantly. I also know that writing regular expressions is an art in itself and that it's easy to fall into traps. It takes years to become a regular expression master.
As a developer, I need rules to help me write regular expressions that really do what I expect them to do. I also want to avoid having my software loop forever or take ages to run because I made a mistake in a regular expression.
WHAT
SonarQube 8.5 introduced a set of rules to help developers write efficient, error-free and safe regular expressions (MMF-2059).
The goal of this MMF is to go a step further and provide rules dedicated to:
- regexps that perform slowly
- regexps that bring the machine to its knees
- regexps that are simply buggy because, by definition, they can't match anything
We expect to deliver these new rules:
- SONARJAVA-3554 Rule S5998: Regular expressions should not overflow the stack
- SONARJAVA-3552 Rule S5996: Regex boundaries should not be used in a way that can never match
- SONARJAVA-3550 Rule S5994: Regex patterns following a possessive quantifier should not always fail
- SONARJAVA-3557 Rule S6001: Back references in regular expressions should only refer to capturing groups that are matched before the reference
- SONARJAVA-3560 Rule S6002: Regex lookahead assertions should not be contradictory
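To make these rule titles concrete, here is a minimal sketch of the kind of pattern each rule is meant to catch, using `java.util.regex`. The class name, helper methods and sample patterns are illustrative assumptions, not taken from the rule specifications. Whether the recursion-prone pattern actually overflows depends on the JVM stack size and the input length.

```java
import java.util.regex.Pattern;

public class RegexPitfalls {
    // Illustrative patterns only; they are assumptions, not the rules' official examples.
    static final String BOUNDARY_NEVER_MATCHES = "$abc";     // S5996: text required after end of input
    static final String POSSESSIVE_ALWAYS_FAILS = "a*+a";    // S5994: a*+ keeps every 'a' and never gives one back
    static final String BACK_REF_TOO_EARLY = "\\1(a)";       // S6001: group 1 is referenced before it is matched
    static final String CONTRADICTORY_LOOKAHEAD = "(?=a)b";  // S6002: next char must be both 'a' and 'b'
    static final String RECURSION_PRONE = "(a|b)*";          // S5998: may recurse once per repetition
    static final String RECURSION_FREE = "[ab]*";            // same language, written as a character class

    // Returns true if the regex matches anywhere in the input.
    static boolean find(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Returns true if matching the whole input threw a StackOverflowError.
    static boolean overflowsStack(String regex, String input) {
        try {
            Pattern.compile(regex).matcher(input).matches();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    // Builds a string made of `count` copies of `unit`.
    static String repeat(String unit, int count) {
        StringBuilder sb = new StringBuilder(unit.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(unit);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(find(BOUNDARY_NEVER_MATCHES, "abc"));
        System.out.println(find(POSSESSIVE_ALWAYS_FAILS, "aaaa"));
        System.out.println(find(BACK_REF_TOO_EARLY, "aa"));
        System.out.println(find(CONTRADICTORY_LOOKAHEAD, "ab"));
        String longInput = repeat("ab", 50_000);
        // Result for the first call depends on the JVM stack size.
        System.out.println(overflowsStack(RECURSION_PRONE, longInput));
        System.out.println(overflowsStack(RECURSION_FREE, longInput));
    }
}
```

The first four patterns compile without error, which is why static analysis is useful here: the mistake only shows up as a match that silently never succeeds.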
HOW
We will develop:
- a new model that adds automaton states to our existing regular expression AST
- common search methods on the state graph, such as finding the shortest path or the intersection
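As a rough illustration of what a search over a state graph could look like, the sketch below models states with outgoing transitions and finds a shortest path with breadth-first search. It is a simplified assumption for illustration only: the `State` class, the `shortestPath` method and the hand-built automaton for `ab*c` are hypothetical and do not reflect the actual model described in the SONARJAVA tickets.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class AutomatonSketch {
    // Hypothetical automaton state: a name plus its successor states.
    static final class State {
        final String name;
        final List<State> successors = new ArrayList<>();

        State(String name) {
            this.name = name;
        }

        State to(State next) {
            successors.add(next);
            return this;
        }
    }

    // Breadth-first search: returns the state names on a shortest path
    // from start to goal, or an empty list if goal is unreachable.
    static List<String> shortestPath(State start, State goal) {
        Map<State, State> parent = new HashMap<>();
        Deque<State> queue = new ArrayDeque<>();
        parent.put(start, start);
        queue.add(start);
        while (!queue.isEmpty()) {
            State current = queue.remove();
            if (current == goal) {
                LinkedList<String> path = new LinkedList<>();
                State s = goal;
                while (true) {
                    path.addFirst(s.name);
                    if (s == start) {
                        break;
                    }
                    s = parent.get(s);
                }
                return path;
            }
            for (State next : current.successors) {
                // putIfAbsent returns null only the first time a state is seen.
                if (parent.putIfAbsent(next, current) == null) {
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    // Hand-built graph for the regex ab*c: s0 -a-> s1, s1 -b-> s1, s1 -c-> s2.
    static List<String> demo() {
        State s0 = new State("s0");
        State s1 = new State("s1");
        State s2 = new State("s2");
        s0.to(s1);
        s1.to(s1).to(s2);
        return shortestPath(s0, s2);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Tracking visited states through the `parent` map keeps the search finite even when the graph contains loops, which any automaton built from `*` or `+` will have.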
All the research about how to implement the automaton has already been done and exhaustively described in:
SONARJAVA-3549Add support for automata-based analyses for regular expressions
Common helper methods have also been described in:
SONARJAVA-3551 Implement helper to find shortest path in regex automata
SONARJAVA-3564 Implement intersects and supersetOf helper for regex automata
- breaks down into
  - SONARJAVA-3550 Rule S5994: Regex patterns following a possessive quantifier should not always fail (Closed)
  - SONARJAVA-3552 Rule S5996: Regex boundaries should not be used in a way that can never match (Closed)
  - SONARJAVA-3554 Rule S5998: Regular expressions should not overflow the stack (Closed)
  - SONARJAVA-3557 Rule S6001: Back references in regular expressions should only refer to capturing groups that are matched before the reference (Closed)
  - SONARJAVA-3560 Rule S6002: Regex lookahead assertions should not be contradictory (Closed)
  - SONARJAVA-3566 Rule S5855: Regex alternatives should not be redundant (Closed)
  - SONARJAVA-3567 Rule S6019: Reluctant quantifiers in regular expressions should be followed by an expression that can't match the empty string (Closed)
  - SONARJAVA-3572 Rule S6035: Single-character alternations in regular expressions should be replaced with character classes (Closed)
  - SONARJAVA-3610 Rule S6070: The regex escape sequence \cX should only be used with characters in the @-_ range (Closed)
  - SONARJAVA-3549 Add support for automata-based analyses for regular expressions (Closed)
  - SONARJAVA-3551 Implement helper to find whether state in regex automaton is reachable without consuming input (Closed)
  - SONARJAVA-3564 Implement intersects and supersetOf helper for regex automata (Closed)
  - SONARJAVA-3561 AbstractRegexCheck should target more regex providers (Closed)
  - SONARJAVA-3569 Improve issue locations of S5869 (Closed)
  - SONARJAVA-3483 FN in S5869 with escaped character classes (Closed)
- is related to
  - SONARJAVA-3568 S5852 should use automata to increase its accuracy (Closed)
  - SONARJAVA-3624 Regex FP/FN with Supplementary Multilingual Plane (Closed)
- relates to
  - MMF-2059 SQ/SC help Java developers writing efficient, error-free and safe regular expressions (Resolved)
  - MMF-2271 [PHP] Help developers writing better regular expressions (To Do)