  1. Rules Repository
  2. RSPEC-6070

The regex escape sequence \cX should only be used with characters in the @-_ range

    Details

    • Type: Bug Detection
    • Status: Active
    • Resolution: Unresolved
    • Labels:
    • Message:
      Remove or replace this problematic use of \c
    • Highlighting:
      The \cX sequence.
    • Default Severity:
      Major
    • Impact:
      Low
    • Likelihood:
      High
    • Default Quality Profiles:
      Sonar way
    • Covered Languages:
      Java
    • Remediation Function:
      Constant/Issue
    • Constant Cost:
      5min
    • Analysis Scope:
      Main Sources, Test Sources

      Description

      In regular expressions, the escape sequence \cX, where X is one of @, an uppercase ASCII letter, [, \, ], ^ or _, represents the control character that corresponds to X: the control character whose ASCII code is 64 less than that of X. For example, \cA represents the character with code 1, because A has code 65.
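The mapping for the valid range can be sketched as follows. This is a minimal illustration, not part of the rule itself; the class name ControlEscapeMapping and the helper controlCharFor are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class ControlEscapeMapping {
    // Hypothetical helper: returns the control character that \cX denotes
    // when X lies in the '@'..'_' range (ASCII codes 64..95).
    static char controlCharFor(char x) {
        if (x < '@' || x > '_') {
            throw new IllegalArgumentException("Outside the @-_ range: " + x);
        }
        return (char) (x - 64);
    }

    public static void main(String[] args) {
        // 'A' has code 65, so \cA denotes the character with code 1.
        System.out.println((int) controlCharFor('A')); // prints 1
        // '^' has code 94, so \c^ denotes the character with code 30.
        System.out.println((int) controlCharFor('^')); // prints 30
        // The regex engine agrees with the computed character.
        String soh = String.valueOf(controlCharFor('A'));
        System.out.println(Pattern.matches("\\cA", soh)); // prints true
    }
}
```

For characters in this range, subtracting 64 and XORing with 64 give the same result, because every code from 64 to 95 has bit 6 set and bit 5 clear.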

      In some other regex engines (for example Perl's), this escape sequence is case insensitive, so \cd produces the same control character as \cD. Furthermore, in those engines, using \c with a character that is none of @, an ASCII letter, [, \, ], ^ or _ produces a warning or an error. Neither is true in Java, where the code of the character is always XORed with 64 without checking that the operation makes sense. For characters outside the @ to _ range the result is not a control character (for example, \ca matches !), so using \c with such characters is almost certainly a mistake.
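The effect of the unchecked XOR can be observed directly. The sketch below assumes the behavior described above (the character code XORed with 64); the class name ControlEscapeXorDemo and the helper xor64 are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

class ControlEscapeXorDemo {
    // Hypothetical helper mirroring the described behavior:
    // the character following \c is XORed with 64, with no range check.
    static char xor64(char x) {
        return (char) (x ^ 64);
    }

    public static void main(String[] args) {
        // 'a' is 97; 97 XOR 64 is 33, which is '!', not a control character.
        System.out.println(xor64('a')); // prints !
        // So \ca does not match the "start of heading" character ...
        System.out.println(Pattern.matches("\\ca", String.valueOf((char) 1))); // prints false
        // ... but it does match a literal exclamation mark.
        System.out.println(Pattern.matches("\\ca", "!")); // prints true
        // Conversely, \c! matches a lowercase 'a'.
        System.out.println(Pattern.matches("\\c!", "a")); // prints true
    }
}
```

A developer coming from Perl would expect \ca to behave like \cA, which is why the rule flags it.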

      Noncompliant Code Example

      Pattern.compile("\\ca"); // Noncompliant, 'a' is not an uppercase letter; this matches '!'
      Pattern.compile("\\c!"); // Noncompliant, '!' is outside of the '@'-'_' range; this matches 'a'
      

      Compliant Solution

      Pattern.compile("\\cA"); // Compliant, this will match the "start of heading" control character
      Pattern.compile("\\c^"); // Compliant, this will match the "record separator" control character
      

              People

              Assignee:
              Unassigned
              Reporter:
              Sebastian Hungerecker