When new rules are activated, SonarQube creates new issues during the next analysis that are not really part of the leak, because they are detected on code that has not changed during the leak period. In other words, these are not newly introduced issues per se, but existing issues that have been unveiled by the activation of new rules.
This behaviour creates noise and does not fully match what we push with the Leak metaphor:
- You should make sure that recently added or modified code does not add issues
- Unless new critical issues were found on untouched code, there's no reason to update this code just to fix those newly unveiled issues, at the risk of introducing regressions in code that might not be properly unit tested
A way to remove this noise is to use the SCM date as a creation date to move those "noisy" issues out of the leak period.
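To illustrate why backdating helps, here is a minimal sketch. It assumes an issue belongs to the leak when its creation date is on or after the start of the leak period; the function name `in_leak` and the exact boundary comparison are illustrative assumptions, not SonarQube's actual implementation.

```python
from datetime import datetime


def in_leak(creation_date: datetime, leak_period_start: datetime) -> bool:
    """Hypothetical check: an issue counts as part of the leak when it was
    created on or after the start of the leak period (assumed boundary)."""
    return creation_date >= leak_period_start


leak_period_start = datetime(2016, 5, 1)
analysis_date = datetime(2016, 6, 1)   # when the new rule first ran
scm_date = datetime(2015, 1, 1)        # when the offending line was last changed

# Dated with the analysis date, the unveiled issue pollutes the leak.
dated_by_analysis = in_leak(analysis_date, leak_period_start)
# Dated with the SCM date, the same issue falls outside the leak.
dated_by_scm = in_leak(scm_date, leak_period_start)
```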
Suggested algorithm: when CE has to create a new issue:
- If no SCM information is available, use the analysis date, as is currently the case
- If "sonar.projectDate" was specified for the analysis, use this date, as is currently the case
  - This applies even if SCM information is available
- Else, if the rule for that issue was not activated in the previous analysis (i.e. this is a newly activated rule), use the SCM information (on the related line or at file level) for the creation date
  - Note: a variant for this case would be to set the creation date of those issues to "leak period begin date minus a few seconds/minutes" instead of the SCM date, which would work even when no SCM information is available
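The decision steps above could be sketched as follows. This is only an illustration of the proposal, not CE code: `IssueContext`, its fields, and `issue_creation_date` are hypothetical names, and the final fallback to the analysis date for rules that were already active is an assumption based on "as is currently the case".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class IssueContext:
    """Hypothetical inputs available when a new issue is created."""
    analysis_date: datetime
    project_date: Optional[datetime]  # value of sonar.projectDate, if specified
    scm_date: Optional[datetime]      # SCM date of the line, or of the file
    rule_is_newly_activated: bool     # rule was not active in the previous analysis


def issue_creation_date(ctx: IssueContext) -> datetime:
    # No SCM information: keep the current behaviour.
    if ctx.scm_date is None:
        return ctx.analysis_date
    # sonar.projectDate wins, even though SCM information is available.
    if ctx.project_date is not None:
        return ctx.project_date
    # Newly activated rule: backdate the issue to the SCM date.
    if ctx.rule_is_newly_activated:
        return ctx.scm_date
    # Assumed fallback: rule was already active, so the issue is genuinely new.
    return ctx.analysis_date
```

The variant mentioned in the note would replace `ctx.scm_date` in the third branch with a date slightly before the start of the leak period, which removes the dependency on SCM information.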
We should check that this algorithm works correctly for all our use cases:
- First analysis
  - In this case, the benefit should be huge because almost 100% of the issues will have a date older than the first analysis date, and therefore won't be part of the leak from the very beginning
- Replay the past
- Regular analysis
To be discussed: this algorithm will not correctly handle a change of quality profile. We have to extend it if we want to handle this use case correctly. For example:
- If we detect that the quality profile has changed since the previous analysis (this is very easy, since an event is generated on QP change), then a first naive approach would be to use the SCM date for all the new issues, regardless of the rule.
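A rough sketch of this naive extension, under the assumption that the QP-change event has already been reduced to a boolean flag. The function name and parameters are hypothetical, and the "sonar.projectDate" case is left out for brevity.

```python
from datetime import datetime
from typing import Optional


def creation_date_with_qp_check(
    analysis_date: datetime,
    scm_date: Optional[datetime],
    rule_is_newly_activated: bool,
    quality_profile_changed: bool,  # hypothetical flag derived from the QP change event
) -> datetime:
    # Without SCM information there is nothing to backdate to.
    if scm_date is None:
        return analysis_date
    # Naive approach: after a quality profile change, every new issue
    # gets the SCM date, whatever the rule that raised it.
    if quality_profile_changed or rule_is_newly_activated:
        return scm_date
    return analysis_date
```

The trade-off of this naive approach is that a genuinely new issue raised in the same analysis as a QP change would only appear in the leak if its SCM date itself falls within the leak period.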