Affects Version/s: None
Fix Version/s: 8.1
We allow for a complex hierarchy of branches and pull requests. For example, if we have:
- Pull request 'pr1'
- Short lived branch 'slb1'
- Short lived branch 'slb2'
- Long lived branch 'llb1'
And the following chain of targets: 'pr1' -> 'slb1' -> 'slb2' -> 'llb1'
When analyzing 'pr1', we would consider 'slb1' to be the target branch and 'llb1' to be the merge branch. The idea is that since 'slb1' is not a long lived branch, it only contains some of the files (the changed ones) and therefore we need to have a long lived branch as a fallback for all operations involving the target.
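The legacy resolution described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `targets` mapping and the `long_lived` set are hypothetical and do not come from SonarQube's code base.

```python
from typing import Dict, Set, Tuple


def target_and_merge_branch(
    name: str, targets: Dict[str, str], long_lived: Set[str]
) -> Tuple[str, str]:
    """Return (target branch, merge branch) for a pull request or branch.

    The target is the direct target of `name`. The merge branch is the first
    long lived branch found by following the chain of targets.
    """
    target = targets[name]
    merge = target
    seen = {name}
    while merge not in long_lived:
        # Guard against cycles and chains that never reach a long lived branch.
        if merge in seen or merge not in targets:
            raise ValueError(f"no long lived branch reachable from {name!r}")
        seen.add(merge)
        merge = targets[merge]
    return target, merge


chain = {"pr1": "slb1", "slb1": "slb2", "slb2": "llb1"}
result = target_and_merge_branch("pr1", chain, {"llb1"})
```

With the chain from the example, `result` holds 'slb1' as the target and 'llb1' as the merge branch.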
With MMF-1786 we simplify this. There are no longer any short lived branches, and pull requests can only target branches. Branches always contain all data, so we no longer need a fallback branch. We only have a single reference branch.
The scanner sends two fields in the report:
- Target: only used for P/Rs. Represents the 'raw' value of 'sonar.pullrequest.base' as given by the user. It should only be used in the UI, to show the target of the pull request.
- Reference (or merge): Used for both P/Rs and branches. SonarQube will use the reference as a comparison baseline for various operations (see below).
What is the reference
For a pull request, if the branch specified with 'sonar.pullrequest.base' exists and is known to SonarQube, that branch is the reference. Otherwise, we use SonarQube's default branch as the reference.
Non main Branches
If it's the first analysis of the branch, we use SonarQube's default branch as the reference. Otherwise, the previous analysis of the same branch will be the reference.
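The rules for choosing the reference can be summarised in a short sketch. The function and parameter names are hypothetical, and `known_branches` is assumed to hold the branches SonarQube has already analysed.

```python
from typing import Optional, Set


def resolve_reference(
    kind: str,  # "pull_request" or "branch"
    name: str,
    known_branches: Set[str],
    default_branch: str,
    pull_request_base: Optional[str] = None,
) -> str:
    """Pick the reference branch for an analysis."""
    if kind == "pull_request":
        # Use 'sonar.pullrequest.base' only if it names a known branch.
        if pull_request_base is not None and pull_request_base in known_branches:
            return pull_request_base
        return default_branch
    # Branch: the first analysis falls back to the default branch,
    # later analyses compare against the branch itself.
    if name in known_branches:
        return name
    return default_branch
```

For example, `resolve_reference("branch", "feature", {"master"}, "master")` yields the default branch because 'feature' has not been analysed yet.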
Operations involving the reference branch
Here is a summary of the operations that involve the reference:
Source Lines Diff
We need to compare the code in the current analysis with the code in the reference to see what changed. We use an algorithm similar to the one used by git (Myers algorithm). Lines found to be new or changed get the analysis date as their modification date. For lines found to be the same, we reuse the known modification date from the reference.
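A minimal sketch of the date assignment follows. It uses Python's `difflib.SequenceMatcher` as a stand-in for the diff step. That is not the Myers algorithm and not SonarQube's implementation. The function name and the string dates are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import List


def line_dates(
    ref_lines: List[str],
    ref_dates: List[str],
    new_lines: List[str],
    analysis_date: str,
) -> List[str]:
    """Return one modification date per line of the current analysis."""
    # Start by treating every line as new or changed.
    dates = [analysis_date] * len(new_lines)
    matcher = SequenceMatcher(a=ref_lines, b=new_lines, autojunk=False)
    # Lines inside a matching block are unchanged: reuse the reference date.
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            dates[block.b + offset] = ref_dates[block.a + offset]
    return dates


dates = line_dates(
    ["a", "b", "c"],
    ["2019-01-01", "2019-02-01", "2019-03-01"],
    ["a", "x", "c", "d"],
    "2019-10-01",
)
```

Here the unchanged lines 'a' and 'c' keep their reference dates, while 'x' and 'd' receive the analysis date.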
SCM blame information
Collecting SCM blame information in the scanner is very slow. For that reason, we compare the hash of each file being analysed with the hash of the same file in the reference. If they are the same, we reuse the SCM blame data that is stored for the file in the reference.
To know what files changed while running the scanner, we may need to download the "project repository" of the reference branch to be able to compare hashes in the scanner side.
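The hash comparison can be sketched as below. The hash function (SHA-256 here), the shape of the downloaded "project repository" (a path-to-hash mapping) and the function names are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List, Tuple


def file_hash(content: str) -> str:
    """Hash a file's content. SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def split_by_blame_reuse(
    current_files: Dict[str, str], reference_hashes: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (paths whose blame can be reused, paths that need a fresh blame)."""
    reusable: List[str] = []
    needs_blame: List[str] = []
    for path, content in sorted(current_files.items()):
        if reference_hashes.get(path) == file_hash(content):
            reusable.append(path)
        else:
            # Changed file, or a file that is unknown in the reference.
            needs_blame.append(path)
    return reusable, needs_blame
```

Only the paths in the second list would require the slow SCM blame call in the scanner.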
See details below.
Issue tracking is done differently depending on the situation.
Issue tracking is done in two steps:
- Only keep issues that involve a changed line
- Track those issues against the issues from the previous analysis of the branch. For matching issues, keep the existing issues of the previous analysis, and close the issues of the previous analysis that weren't matched.
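The two steps above can be sketched as follows. The `Issue` type and the matching rule (same rule, file and line hash) are hypothetical simplifications. The real tracking algorithm is more elaborate.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Issue:
    rule: str
    file: str
    line: int
    line_hash: str  # hash of the line content, used for matching (assumption)


def on_changed_lines(issues: List[Issue], changed: Dict[str, Set[int]]) -> List[Issue]:
    """Step 1: keep only the issues located on a changed line."""
    return [i for i in issues if i.line in changed.get(i.file, set())]


def track(
    raw: List[Issue], previous: Dict[str, Issue]
) -> Tuple[Dict[str, Issue], List[Issue], List[str]]:
    """Step 2: match raw issues against the previous analysis.

    Returns (kept issues by existing key, new issues, keys of issues to close).
    """
    unmatched = dict(previous)
    kept: Dict[str, Issue] = {}
    new: List[Issue] = []
    for issue in raw:
        signature = (issue.rule, issue.file, issue.line_hash)
        key = next(
            (k for k, p in unmatched.items()
             if (p.rule, p.file, p.line_hash) == signature),
            None,
        )
        if key is None:
            new.append(issue)
        else:
            kept[key] = issue  # the existing issue keeps its key
            del unmatched[key]
    # Whatever was not matched in the previous analysis gets closed.
    return kept, new, sorted(unmatched)
```

An issue that moved from one line to another but kept the same line hash is treated as the same issue and keeps its existing key.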
First analysis of a branch
Do issue tracking against the reference branch (the default branch in this case). For matching issues, we copy the information of the issue in the reference branch into a new issue, so that the two issues have separate lifecycles from now on.
Subsequent analyses of a branch
Do issue tracking against the reference branch (the branch itself in this case). For matching issues, keep the existing issues of the previous analysis, and close the issues of the previous analysis that weren't matched.
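The difference between the first analysis and later ones can be sketched as follows. Issues are plain dictionaries here, and the signature tuple, the status values and the `new_key` callback are hypothetical. Leaving unmatched reference issues untouched on the first analysis is also an assumption, since they belong to the default branch.

```python
import copy
from typing import Callable, Dict, List, Tuple

Signature = Tuple[str, str, str]  # (rule, file, line hash), hypothetical matching key


def apply_tracking(
    raw: List[Signature],
    reference: Dict[str, dict],
    first_analysis: bool,
    new_key: Callable[[], str],
) -> Dict[str, dict]:
    """Return the issues of the analysed branch, indexed by issue key."""
    by_signature = {data["signature"]: key for key, data in reference.items()}
    result: Dict[str, dict] = {}
    for signature in raw:
        ref_key = by_signature.pop(signature, None)
        if ref_key is None:
            # No match in the reference: a brand new issue.
            result[new_key()] = {"signature": signature, "status": "OPEN"}
        elif first_analysis:
            # Copy into a new issue so that both have separate lifecycles.
            result[new_key()] = copy.deepcopy(reference[ref_key])
        else:
            # Same branch: keep the existing issue under its existing key.
            result[ref_key] = reference[ref_key]
    if not first_analysis:
        # Close the issues of the previous analysis that were not matched.
        for ref_key in by_signature.values():
            result[ref_key] = {**reference[ref_key], "status": "CLOSED"}
    return result
```

On a first analysis a matched issue therefore appears under a fresh key with the copied status, whereas on later analyses it keeps its original key.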