  1. Product Roadmaps
  2. MMF-1674

[INNOVATION] Support Semantic in SonarApex

    Details

    • Type: MMF
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Do
    • Labels:

      Description

      WHY

      SonarApex is one of our commercial plugins, and we will soon have the opportunity to define new rules for this language with Apex experts. We need to know which limitations our SonarApex frontend will impose. We will therefore use it as a starting point for our semantic analysis in SLANG.

      Some of those rules require semantic information.

      WHAT

      The goal of this MMF is to find out whether there are any limitations in the semantic information we can collect from the SonarApex frontend.

      We will try to implement the following rules for SonarApex:

      • SONARSLANG-371 [Apex] Rule S3329: Cypher Block Chaining IV's should be random and unique
      • SONARSLANG-372 [Apex] Rule S5332: HTTPS protocol should be used to send HTTP requests

      This is an innovation sprint and the result will be a prototype.

      HOW

      Previous Apex front-end analysis

      A previous attempt was made during a POC to gather as much information as possible from the native Apex front-end. See sonar-apex-parser-poc/README.md.
      The conclusion was:

      • Using the native Apex front-end, it is not straightforward to generate an AST suitable for Slang that contains both token locations and semantic information.

      The native Apex front-end does the following conversions:

      1. apex.jorje.parser.impl.ApexLexer converts a character string of source code into a list of tokens.
      2. apex.jorje.parser.impl.ApexParser converts the list of tokens into a first AST using ANTLR (the AST root is apex.jorje.data.ast.CompilationUnit; an AST visitor is not provided).
      3. apex.jorje.semantic.compiler.ApexCompiler converts the first AST into a final AST containing semantic information (the AST root is apex.jorje.semantic.compiler.CodeUnit; an AST visitor is provided). We can run the compiler up to one of the following stages: PARSE, SYMBOLS, PARENT, MEMBER_RESOLVE, POST_TYPE_RESOLVE, VALIDATE, ADDITIONAL_VALIDATE, REFERENCES, EMIT.
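
      To make "running the compiler up to a stage" concrete, here is a minimal sketch. It does not use the jorje API: the Stage enum and the stagesUpTo helper are hypothetical, and the sketch assumes the stages execute in the order they are listed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the compiler stages named above; this is NOT the jorje API.
final class StagePipelineSketch {

    // Stage names copied from the list above.
    // Assumption: they run in exactly this order.
    enum Stage {
        PARSE, SYMBOLS, PARENT, MEMBER_RESOLVE, POST_TYPE_RESOLVE,
        VALIDATE, ADDITIONAL_VALIDATE, REFERENCES, EMIT
    }

    // Returns every stage executed when compiling "up to" the given stage, inclusive.
    static List<Stage> stagesUpTo(Stage last) {
        List<Stage> executed = new ArrayList<>();
        for (Stage stage : Stage.values()) {
            executed.add(stage);
            if (stage == last) {
                break;
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        // Lists the stages that a run stopping at ADDITIONAL_VALIDATE would go through.
        System.out.println(stagesUpTo(Stage.ADDITIONAL_VALIDATE));
    }
}
```

      Under that ordering assumption, stopping at ADDITIONAL_VALIDATE skips REFERENCES and EMIT.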

      The first attempt was to parse Apex code like pmd-apex does, running the ApexCompiler up to the ADDITIONAL_VALIDATE stage with some workarounds to avoid validation failures (CompilerService.java#L105).
      This attempt failed: we learned that it is not possible to retrieve token locations from the final AST produced by ApexCompiler. The information is lost during the conversion from the first AST into the final AST. Furthermore, the final AST reflects the code that will be executed more than the source code: it contains additional generated methods such as <clinit>, <init> and clone.
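
      One consequence of the generated methods is that any mapping from the final AST back to the source has to skip them. The sketch below shows the idea on plain method names. The class and method are hypothetical and do not use the jorje API, and the name-based filter is an assumption: it would also drop a clone method written by hand in the source.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of jorje or SonarApex.
final class GeneratedMethodFilterSketch {

    // Method names reported above as generated in the final AST.
    private static final Set<String> GENERATED_NAMES = Set.of("<clinit>", "<init>", "clone");

    // Keeps only the names that are not in the generated set.
    // Assumption: matching by name is enough for a prototype.
    static List<String> sourceMethodNames(List<String> finalAstMethodNames) {
        return finalAstMethodNames.stream()
            .filter(name -> !GENERATED_NAMES.contains(name))
            .collect(Collectors.toList());
    }
}
```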

      Current usage of Apex front-end

      Currently, SonarApex uses only ApexParser, which produces the first AST without semantic information.

      Investigations

      During this innovation sprint, we will continue to investigate "how to capitalize on semantic information produced by ApexCompiler", and answer the following questions:

      • Can we execute ApexCompiler and capitalize on the semantic information it provides?
      • Does the new native Apex front-end, including the "apex.jorje.semantic" package, have an acceptable size to be embedded into SonarApex?
      • Is the semantic information suitable to implement the rules listed in the WHAT section above?
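
      For the size question, a rough first measurement could sum the compressed size of the jar entries under the package path. The sketch below uses only java.util.jar from the JDK. The class name, the method names and the jar path passed by the caller are hypothetical, and compressed entry size is only an approximation of the impact on the plugin size.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical measurement helper, not part of SonarApex.
final class JarFootprintSketch {

    // Reads the compressed size in bytes of every non-directory entry in the jar.
    static Map<String, Long> compressedEntrySizes(String jarPath) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    // getCompressedSize() returns -1 when unknown; count that as 0.
                    sizes.put(entry.getName(), Math.max(entry.getCompressedSize(), 0L));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    // Sums the sizes of the entries whose path starts with the given prefix,
    // for example "apex/jorje/semantic/" for the apex.jorje.semantic package.
    static long totalUnderPrefix(Map<String, Long> sizes, String prefix) {
        long total = 0;
        for (Map.Entry<String, Long> entry : sizes.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                total += entry.getValue();
            }
        }
        return total;
    }
}
```

      Calling totalUnderPrefix(compressedEntrySizes(path), "apex/jorje/semantic/") on the front-end jar would give a first order of magnitude to compare against the current plugin size.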

              People

              • Assignee: Nicolas Harraudeau (nicolas.harraudeau)
              • Reporter: Nicolas Harraudeau (nicolas.harraudeau)