  1. SonarJava
  2. SONARJAVA-3496

Detect performance problems of regular expressions with unions inside repetitions

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Rules
    • Labels:

      Description

      The regex (x|\n)* fails with a StackOverflowError when matched against a string containing a sufficiently large number of 'x' characters (1206 on my system with the default stack size). A similar regex caused crashes in sonarphp (see SONARPHP-1022).
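
      A minimal sketch of how the overflow could be reproduced. The class name, the helper method, and the input length of 100,000 are assumptions made for illustration; the length is simply chosen to be far above the threshold reported above, and the exact threshold depends on the JVM and its stack size.

```java
import java.util.regex.Pattern;

final class UnionRepetitionOverflow {

    /**
     * Returns true if matching {@code regex} against {@code length}
     * copies of 'x' ends in a StackOverflowError.
     */
    static boolean overflows(String regex, int length) {
        String input = "x".repeat(length); // String.repeat needs Java 11+
        try {
            Pattern.compile(regex).matcher(input).matches();
            return false;
        } catch (StackOverflowError e) {
            // The matcher recursed once per repetition and ran out of stack.
            return true;
        }
    }

    public static void main(String[] args) {
        // "(x|\\n)*" is the Java string literal for the regex (x|\n)*
        System.out.println("short input overflowed: " + overflows("(x|\\n)*", 10));
        System.out.println("long input overflowed:  " + overflows("(x|\\n)*", 100_000));
    }
}
```

      Running the JVM with a larger -Xss value raises the threshold but does not remove the problem.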

      My current understanding is that any use of the | operator inside a non-possessive repetition can cause such a stack overflow in Java's regex engine. The shorter the strings matched by the union, the fewer characters of input are required to trigger it. Currently none of our rules warn about this, which is unfortunate.

      It is unclear to me whether regexes like this could be used for a denial-of-service attack or whether they can only lead to stack overflows (I suspect the latter), so it is also unclear whether this should be covered by RSPEC-5852 or by a new rule. This requires some further investigation.

      Note that while in this example the issue disappears if the union is replaced with a character class (i.e. [x\n] instead of x|\n), this is not possible in the general case.
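
      The character-class rewrite for this particular example can be sketched as follows. The class and method names are hypothetical, and the input size is an arbitrary large value chosen for illustration.

```java
import java.util.regex.Pattern;

final class CharacterClassRewrite {

    /** Returns true if the whole input matches the given regex. */
    static boolean matchesAll(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // 100,000 characters alternating between 'x' and a newline.
        String input = "x\n".repeat(50_000); // String.repeat needs Java 11+

        // [x\n]* accepts the same strings as (x|\n)*, but the repetition
        // is over a single character class instead of a union.
        System.out.println("character class matched: " + matchesAll("[x\\n]*", input));
    }
}
```

      This only works here because both alternatives are single characters. A union such as (ab|c)* has no character-class equivalent.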

              People

              Assignee:
              sebastian.hungerecker Sebastian Hungerecker
              Reporter:
              sebastian.hungerecker Sebastian Hungerecker
