Uploaded image for project: 'SonarQube'
  1. SonarQube
  2. SONAR-11077

Logs should help better profile Compute Engine performance hotspots

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.7.5, 7.3
    • Component/s: None
    • Labels:
    • Edition:
      Community

      Description

      WHY

      When ops or services have to investigate on a potential performance issue, they can face some pains in profiling the performance on Compute Engine i.e. to understand where the processing time is spent and why some tasks take longer than others.
      The main pains:
      In a nutshell , it's not clear from the logs why an analysis takes longer than another one.

      • We are missing the information about the type of branch analyzed
      • There's no indication about the size of the analyzed report
      • and also more generally about the context in which this report is processed (project size, number of issues)
      • There's no simple way to spot or exclude DB or ES latency - no way to simply see where the time is spent and to check if this time is normal in regards of the analysis. The alternative is to activate TRACE logs and to manually compute the number of requests and the time spent in requests.
      • In cluster mode (DCE), it can be hard to identify from the logs if / why a node and a specific process has a problem.
        There's no indication whether the logs are coming from a node or another and whether the logs are linked to the actual process that takes CPU resources.

      WHAT

      The Compute Engine should provide more information in the logs to help understanding why and where the computing time is spent:

      • the nature of the analysis
      • the report which is processed
      • and the processing itself

      We expect these information to be available both in 6.7.5 LTS and 7.3 versions.
      Out of the scope, we can think, for 7.x, about providing for each analysis the full context of the analysis (in addition to the scanner context).

      • CE logs should clearly show whether the processed report corresponds to a long-lived branch, a short-lived branch or a PR.
        If this type is not known when the CE starts computing a task, the information should be dumped in an additional INFO log.
      • CE logs should display contextual information info about:
        • the report in general
          • report size (uncompressed)
          • number of folders
          • number of files
          • number of lines of code
          • number of issues
        • and, more specifically about the notification step:
          • the number of channels
          • the number of subscribers
          • the number of issues that were updated
            Ex: | subscribers=580 issues=73
      • Some CE logs are useless at DEBUG level and should either moved to the TRACE level or muted:
        Ex:
        2018.07.18 15:57:44 DEBUG ce[AWStrrL0LmHV6wsE53sW][governance] Retrieved 58 measures for ProjectView Sample Web Application (key=Alltest:sample-web-application, uuid=AV7o_b_ZcGX7j33GR2Sa) from project AV7o_b_kcGX7j33GR2Sc)
        
        2018.07.18 15:57:45 DEBUG ce[AWStrrL0LmHV6wsE53sW][o.a.f.u.ContentHandlerFactoryRegistry] Dynamically adding ContentHandlerFactory: org.apache.fop.render.afp.extensions.AFPExtensionHandlerFactory
        

        and other logs from o.a.f.u.ContentHandlerFactoryRegistry, o.a.f.r.RendererFactory, o.a.f.r.XMLHandlerRegistry, o.a.f.r.ImageHandlerRegistry, o.a.x.i.l.s.ImageImplRegistry

      • Related: Some Web logs also generate noise and should be muted
        Ex:
        web[][c.h.i.p.o.PartitionStateOperation] [10.0.2.11]:9021 [SonarQube] [3.8.6] Applied new partition state. Version: 4548, caller: [10.0.2.11]:9052
        
        web[][c.h.i.c.ClusterService] [10.0.2.11]:9021 [SonarQube] [3.8.6] Setting cluster time diff to -2ms.
        web[][c.h.i.c.i.ClusterHeartbeatManager] [10.0.2.11]:9021 [SonarQube] [3.8.6] Received heartbeat from Member [10.0.2.11]:9051 - 804d64f3-362f-43f0-acfe-eb165ca6efb9 (now: 2018-07-31 12:32:17.013, timestamp: 2018-07-31 12:32:17.013)
        

        and other logs from c.h.i.p.o.PartitionStateOperation, c.h.r.i.o.ReplicationOperation, c.h.i.p.InternalPartitionService, c.h.r.i.o.RequestMapDataOperation, c.h.r.i.o.SyncReplicatedMapDataOperation, c.h.i.c.i.ClusterHeartbeatManager, c.h.i.c.ClusterService

      • CE steps and corresponding timing info are currently written in DEBUG level. It would helpful, starting in 7.3, to constantly get this info in production to get a better idea of what is currently happening for a task in progress.
        With SonarQube 7.3, CE steps logs should be moved at INFO level.
      Out of this ticket
      • CE logs should display detailed timing information info about the steps:
        • the split between the time spent in DB requests and ES requests (the time spent in pure CE processing is the difference with the total duration)
        • the number fo requests
          Ex: | time=170ms db_time=150ms db_req=27 es_time=10ms es_req=3
      • In cluster mode, CE logs should display the node name.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                simon.brandhof Simon Brandhof
                Reporter:
                christophe.levis Christophe Levis
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: