  1. SonarQube
  2. SONAR-11814

Speed-up Compute Engine when persisting measures on PostgreSQL

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.7
    • Component/s: Compute Engine
    • Labels: None
    • Edition: Community
    • Production Notes: None

      Description

      What

      The Compute Engine step that "upserts" the DB table live_measures is currently the biggest hotspot, as shown by statistics on 1,816 recent analyses on SonarCloud, whatever the branch type (long, short or PR):

      • by total duration (for all analyses)
      Persist live measures                             : 164 minutes
      Execute component visitors                        : 39 minutes
      Index analysis                                    : 15 minutes
      Send issue notifications                          : 13 minutes
      Load quality profiles                             : 11 minutes
      Compute new coverage                              : 9 minutes
      Persist sources                                   : 6 minutes
      Persist issues                                    : 5 minutes
      Extract report                                    : 3 minutes
      Purge db                                          : 2 minutes
      Checks executed after computation of measures     : 2 minutes
      Compute Quality Profile status                    : 1 minute
      Verify billing                                    : 1 minute
      Build tree of components                          : 0 minutes
      Persist new ad hoc Rules                          : 0 minutes
      • by 98th-percentile duration
      Persist live measures                             : 60 seconds
      Execute component visitors                        : 13 seconds
      Index analysis                                    : 5 seconds
      Send issue notifications                          : 4 seconds
      Compute new coverage                              : 3 seconds
      Persist sources                                   : 2 seconds
      Persist issues                                    : 2 seconds
      Extract report                                    : 0 seconds
      Purge db                                          : 0 seconds
      Checks executed after computation of measures     : 0 seconds
      Compute Quality Profile status                    : 0 seconds
      Load quality profiles                             : 0 seconds
      Persist new ad hoc Rules                          : 0 seconds
      Persist components                                : 0 seconds
      Compute coverage measures                         : 0 seconds
      • by max duration
      Persist live measures                             : 380 seconds
      Execute component visitors                        : 96 seconds
      Send issue notifications                          : 82 seconds
      Index analysis                                    : 36 seconds
      Compute new coverage                              : 28 seconds
      Persist issues                                    : 17 seconds
      Persist new ad hoc Rules                          : 11 seconds
      Persist sources                                   : 10 seconds
      Compute Quality Profile status                    : 9 seconds
      Purge db                                          : 7 seconds
      Extract report                                    : 5 seconds
      Checks executed after computation of measures     : 3 seconds
      Persist components                                : 3 seconds
      Build tree of components                          : 3 seconds
      Execute DB migrations for current project         : 2 seconds
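
      To put the "biggest hotspot" claim in numbers, the total-duration figures above can be summed to estimate the share taken by this step. This is only a rough sketch: it assumes the 15 listed steps account for essentially all of the time, and it works from the rounded minute values shown in the list.

```python
# Total duration per step, in minutes, copied from the list above.
total_minutes = {
    "Persist live measures": 164,
    "Execute component visitors": 39,
    "Index analysis": 15,
    "Send issue notifications": 13,
    "Load quality profiles": 11,
    "Compute new coverage": 9,
    "Persist sources": 6,
    "Persist issues": 5,
    "Extract report": 3,
    "Purge db": 2,
    "Checks executed after computation of measures": 2,
    "Compute Quality Profile status": 1,
    "Verify billing": 1,
    "Build tree of components": 0,
    "Persist new ad hoc Rules": 0,
}

# Sum over the listed steps only (steps not shown in the list are ignored).
total = sum(total_minutes.values())
share = total_minutes["Persist live measures"] / total

print(f"listed steps: {total} minutes")       # listed steps: 271 minutes
print(f"persist live measures: {share:.1%}")  # persist live measures: 60.5%
```

      Under these assumptions, persisting live measures takes roughly 60% of the listed Compute Engine time.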

      How

      This article is quite old, but it showed that multi-row inserts are a good way to speed up massive insertions. Based on this idea, I ran some local benchmarks on a significant Java project (15.5k measures) with PostgreSQL 10:

      • batch upserts (as it's currently done): 6.7 seconds
      • ~50-row upserts, with batch: 2.3 seconds. Note that the number of rows per upsert is the number of measures on the component (between 20 and 75).
      • ~50-row upserts, without batch: 2.3 seconds

      The speed-up is confirmed: about -65% (6.7 seconds down to 2.3 seconds).
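
      For illustration, a multi-row upsert of the kind benchmarked above could be built as in the sketch below. It is not the actual SonarQube implementation: the helper names (build_multi_row_upsert, flatten_rows), the reduced column set (component_uuid, metric_id, value) and the choice of (component_uuid, metric_id) as conflict key are assumptions made for the example, and the `%s` placeholders follow the style used by psycopg. Only the `INSERT ... ON CONFLICT ... DO UPDATE` syntax itself is standard PostgreSQL (9.5+).

```python
from typing import Any, List, Sequence


def build_multi_row_upsert(
    table: str,
    columns: Sequence[str],
    key_columns: Sequence[str],
    row_count: int,
) -> str:
    """Build one INSERT ... ON CONFLICT statement that upserts row_count rows."""
    if row_count < 1:
        raise ValueError("row_count must be at least 1")
    # Columns that are not part of the conflict key get overwritten on conflict.
    updated = [c for c in columns if c not in key_columns]
    if not updated:
        raise ValueError("at least one non-key column is required")

    # One "(%s, %s, ...)" group per row, all groups in a single VALUES clause.
    row_placeholders = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values = ", ".join([row_placeholders] * row_count)
    # EXCLUDED refers to the row that was proposed for insertion.
    updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in updated)

    return (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES {values} "
        f"ON CONFLICT ({', '.join(key_columns)}) DO UPDATE SET {updates}"
    )


def flatten_rows(rows: Sequence[Sequence[Any]]) -> List[Any]:
    """Flatten rows into the single parameter list expected by the statement."""
    return [value for row in rows for value in row]


# Hypothetical example: two measures of the same component in one statement.
rows = [("c1", 1, 2.0), ("c1", 2, 3.5)]
sql = build_multi_row_upsert(
    "live_measures",
    ["component_uuid", "metric_id", "value"],
    ["component_uuid", "metric_id"],
    len(rows),
)
params = flatten_rows(rows)
print(sql)
print(params)
```

      Grouping by component, as in the benchmark, also keeps every row of a statement on a distinct (component, metric) pair, which matters because PostgreSQL rejects an ON CONFLICT DO UPDATE statement that would touch the same row twice.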

      When dogfooding the fix on next.sonarqube.com (PostgreSQL 10.5), the CE analysis of the master branch of sonar-enterprise went from ~6 minutes to ~3.5 minutes (see attached screenshot). The duration of pull request analyses is unchanged (as fast as before!).

      This improvement is specific to PostgreSQL and does not apply to Oracle, SQL Server, MySQL or H2.

            People

            • Assignee: simon.brandhof Simon Brandhof (Inactive)
            • Reporter: simon.brandhof Simon Brandhof (Inactive)