As a developer, I know that regular expressions are powerful and can help to solve tricky problems elegantly. I also know that writing regular expressions is an art in itself and it's easy to fall into traps. It takes years to become a regular expressions master.
As a developer, I need rules to help me write regular expressions that really do what I expect them to do. I also want to avoid having my software loop forever or take ages to run because I made a mistake in a regular expression.
SonarQube 8.5 introduced a set of rules to help developers write efficient, error-free and safe regular expressions.
The goal of this MMF is to go a step further and provide rules dedicated to:
- regexps that perform slowly
- regexps that bring the machine to its knees
- regexps that are simply buggy because, by definition, they can't match anything
We expect to deliver these new rules:
- SONARJAVA-3554 Rule S5998: Regular expressions should not overflow the stack
- SONARJAVA-3552 Rule S5996: Regex boundaries should not be used in a way that can never match
- SONARJAVA-3550 Rule S5994: Regex patterns following a possessive quantifier should not always fail
- SONARJAVA-3557 Rule S6001: Back references in regular expressions should only refer to capturing groups that are matched before the reference
- SONARJAVA-3560 Rule S6002: Regex lookahead assertions should not be contradictory
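To make the "can never match" rules more concrete, here is a minimal Java sketch of the kind of pattern each one is aimed at. The class name `NeverMatchingRegexes`, the helper `canMatch`, and the sample patterns are illustrative assumptions, not taken from the rule specifications. S5998 (stack overflow) is deliberately left out because its behaviour depends on input length and JVM stack size.

```java
import java.util.regex.Pattern;

// Illustrative only: class name, method name and sample patterns are assumptions.
final class NeverMatchingRegexes {
  private NeverMatchingRegexes() {}

  /** Returns true if the regex matches anywhere in the input. */
  static boolean canMatch(String regex, String input) {
    return Pattern.compile(regex).matcher(input).find();
  }

  public static void main(String[] args) {
    // S5994 theme: the possessive "a*+" consumes every 'a' and never gives one back
    // for the trailing "a".
    System.out.println(canMatch("a*+a", "aaaa"));
    // S5996 theme: in the default mode "$" only matches at the end of the input
    // (or just before a final line terminator), so no 'a' can follow it.
    System.out.println(canMatch("$a", "a\na"));
    // S6001 theme: the back reference is evaluated before group 1 has captured anything.
    System.out.println(canMatch("\\1(a)", "aa"));
    // S6002 theme: the lookahead requires 'a' at the position where the pattern
    // then requires 'b'.
    System.out.println(canMatch("(?=a)b", "ab"));
  }
}
```

With `java.util.regex`, each of these calls prints `false` whatever the input, which is exactly the kind of dead pattern these rules are meant to report.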
We will develop:
- a new model that adds automaton states to our existing regular expression AST
- common search methods on the state graph, such as finding the shortest path or the intersection
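As a rough illustration of what a search method over a state graph could look like, the sketch below runs a breadth-first search for the shortest path between two states. `StateGraphSearch`, `State`, `to` and `shortestPath` are hypothetical names chosen for this example. They are not the classes described in the tickets, and transitions are simplified to unlabeled edges.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: these names are illustrative, not the actual SonarJava model.
final class StateGraphSearch {
  private StateGraphSearch() {}

  /** A state of the automaton with unlabeled outgoing transitions. */
  static final class State {
    final String label;
    final List<State> successors = new ArrayList<>();

    State(String label) {
      this.label = label;
    }

    /** Adds a transition to {@code next} and returns this state for chaining. */
    State to(State next) {
      successors.add(next);
      return this;
    }
  }

  /**
   * Breadth-first search. Returns the shortest path from start to goal
   * (both included), or an empty list if goal is unreachable.
   */
  static List<State> shortestPath(State start, State goal) {
    Map<State, State> predecessor = new HashMap<>();
    Queue<State> queue = new ArrayDeque<>();
    predecessor.put(start, start);
    queue.add(start);
    while (!queue.isEmpty()) {
      State current = queue.remove();
      if (current == goal) {
        // Walk the predecessor links back to the start, then reverse.
        List<State> path = new ArrayList<>();
        for (State s = goal; s != start; s = predecessor.get(s)) {
          path.add(s);
        }
        path.add(start);
        Collections.reverse(path);
        return path;
      }
      for (State next : current.successors) {
        // putIfAbsent returns null only the first time a state is discovered.
        if (predecessor.putIfAbsent(next, current) == null) {
          queue.add(next);
        }
      }
    }
    return Collections.emptyList();
  }
}
```

For a graph shaped like the automaton of `ab*c` (s0 to s1, s1 looping on itself, s1 to s2), `shortestPath(s0, s2)` yields s0, s1, s2 and ignores the loop. A search in the opposite direction returns an empty list, which is how an analysis could detect that a state can never be reached.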
All the research about how to implement the automaton has already been done and exhaustively described in:
- SONARJAVA-3549 Add support for automata-based analyses for regular expressions
Common helper methods have also been described in: