  1. Product Roadmaps
  2. MMF-993

Provide a WS to monitor the status of a SQ instance or cluster

    Details

    • Type: MMF
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed

      Description

      The Problem

      As an ops, when I install and run either a standalone SonarQube instance or a SonarQube cluster, I have no way to make sure that SQ is fully operational. At any point in time, I should be able to monitor the global SQ status through a WS call:

      • Green: Fully working
      • Yellow: Working, but something must be fixed to make SQ fully operational
      • Red: Not working

      The Solution

      Health

      A new dedicated /api/system/health Web Service should provide the health of SonarQube:

      • In case of a standalone SonarQube instance, the health is:
        • Green when:
          • the instance is up (Web, CE and ES processes are up)
          • and the instance is connected to the DB
          • and ES is green
        • Yellow when ES is yellow
        • Red in the other cases
      • In case of a SonarQube cluster:
        • Green when:
          • there are at least 2 application nodes
          • and there is an odd number of search nodes, with at least 3
          • and all nodes are green
          • and ES cluster is green
        • Yellow when:
          • there is only 1 application node
          • or there is an even number of search nodes, or fewer than 3
          • or 1+ application nodes are yellow/red but at least 1 application node is yellow/green
          • or 1+ search nodes are yellow/red but at least 2 search nodes are yellow/green
          • or ES cluster is yellow
        • Red in the other cases
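
      The rules above can be sketched as two small functions. This is only an illustration, not the actual SonarQube implementation: the names Health, standalone_health and cluster_health are hypothetical, and the sketch assumes that red conditions take precedence over yellow ones (for example, a single search node is treated as red, in line with the explanations listed further down).

```python
from enum import Enum


class Health(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def standalone_health(processes_up: bool, db_connected: bool, es_status: Health) -> Health:
    """Health of a standalone instance (Web, CE and ES processes, DB, ES status)."""
    if not (processes_up and db_connected):
        return Health.RED
    # Instance up and connected to the DB: the global health follows ES.
    return es_status


def cluster_health(app_nodes: list, search_nodes: list, es_status: Health) -> Health:
    """Health of a cluster, given the health of each application and search node."""
    usable_apps = sum(1 for h in app_nodes if h is not Health.RED)
    usable_search = sum(1 for h in search_nodes if h is not Health.RED)

    # Assumption: red wins over yellow when both kinds of condition apply.
    if usable_apps < 1 or usable_search < 2 or es_status is Health.RED:
        return Health.RED

    all_green = all(h is Health.GREEN for h in app_nodes + search_nodes)
    good_topology = (
        len(app_nodes) >= 2
        and len(search_nodes) >= 3
        and len(search_nodes) % 2 == 1
    )
    if all_green and good_topology and es_status is Health.GREEN:
        return Health.GREEN
    return Health.YELLOW
```

      With these definitions, two green application nodes, three green search nodes and a green ES cluster give green, while the same cluster with a single application node gives yellow.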

      If the global health is yellow or red, we expect to get a short explanation why. The explanation can describe a combination of the following causes:

      • In case of a standalone SonarQube instance:
        • Red:
          • the instance is down, starting, or waiting for a DB migration to be completed
          • the instance is not connected to the DB
          • ES status is red
        • Yellow:
          • ES status is yellow
      • In case of a SonarQube cluster:
        • Red:
          • no application node
          • no or only 1 search node
          • all application nodes are red
          • all search nodes are red
          • ES cluster status is red
        • Yellow:
          • only 1 application node
          • only 2 search nodes, or an even number of search nodes
          • 1 application node is yellow/red but at least 1 other application node is yellow/green
          • 1 search node is yellow/red but at least 2 other search nodes are yellow/green
          • ES cluster status is yellow
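
      As a rough illustration of how the cluster explanations could be combined, the sketch below collects every applicable cause into a red list and a yellow list. The function names and message strings are hypothetical, node health is represented by the plain strings "green", "yellow" and "red", and the response format of the WS is not specified here.

```python
def _degraded_but_covered(nodes, min_others):
    """True if some node is yellow/red while at least `min_others` other nodes are yellow/green."""
    for i, health in enumerate(nodes):
        others_ok = sum(1 for j, other in enumerate(nodes) if j != i and other != "red")
        if health != "green" and others_ok >= min_others:
            return True
    return False


def cluster_causes(app_nodes, search_nodes, es_status):
    """Return {"red": [...], "yellow": [...]} with every cause that applies."""
    red, yellow = [], []

    # Red causes
    if not app_nodes:
        red.append("no application node")
    elif all(h == "red" for h in app_nodes):
        red.append("all application nodes are red")
    if len(search_nodes) <= 1:
        red.append("no or only 1 search node")
    elif all(h == "red" for h in search_nodes):
        red.append("all search nodes are red")
    if es_status == "red":
        red.append("ES cluster status is red")

    # Yellow causes
    if len(app_nodes) == 1:
        yellow.append("only 1 application node")
    if len(search_nodes) >= 2 and len(search_nodes) % 2 == 0:
        yellow.append("only 2 or an even number of search nodes")
    if _degraded_but_covered(app_nodes, 1):
        yellow.append("an application node is yellow/red")
    if _degraded_but_covered(search_nodes, 2):
        yellow.append("a search node is yellow/red")
    if es_status == "yellow":
        yellow.append("ES cluster status is yellow")

    return {"red": red, "yellow": yellow}
```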

      Also, in case of a SonarQube cluster, for each active node, we expect to get the following information:

      • IP address (sonar.cluster.node.host)
      • Port (sonar.cluster.node.port)
      • Node name if any (sonar.cluster.node.name)
      • Node type: Application or Search (sonar.cluster.node.type)
      • Time of the last start
      • Node health:
        • For an Application node:
          • Green when:
            • the node is up (Web and CE processes are up)
            • and the node is connected to the DB
          • Red in the other cases
        • For a Search node:
          • Green when the node is up (ES process is up)
          • Red in the other cases
      • A short explanation if the node health is red. The explanation can describe a combination of the following issues:
        • the node is down, starting, or waiting for a DB migration to be completed
        • the node is not connected to the DB

      Since, in case of a SonarQube cluster, this WS provides information about the topology, it needs to be secured.
      But since this WS is not a functional one, will be called very often, and must not depend on the good collaboration between all the SonarQube components (DB, ES, Application nodes), an ops should be able to access it through a very light, dedicated authentication mechanism. For that purpose, this WS will accept the header X-Sonar-Passcode. This passcode should match the one provided in each sonar.properties configuration file through the sonar.web.systemPasscode parameter.
      Still, since it could be useful for an ops to be able to use it while being logged in to SonarQube, this WS will also support standard SonarQube authentication. The WS requires a user to have system administration rights on a standalone SonarQube and to be root on a SonarQube cluster.
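
      From the ops side, a call using the passcode could look like the sketch below. The base URL, the passcode value and the function names are assumptions; the response body is returned as raw text because its format is not defined in this description.

```python
import urllib.request


def build_health_request(base_url: str, passcode: str) -> urllib.request.Request:
    """Build a GET request to /api/system/health carrying the X-Sonar-Passcode header."""
    url = base_url.rstrip("/") + "/api/system/health"
    return urllib.request.Request(url, headers={"X-Sonar-Passcode": passcode})


def fetch_health(base_url: str, passcode: str, timeout: float = 5.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_health_request(base_url, passcode)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

      For example, fetch_health("http://localhost:9000", "s3cret") would query a local instance, assuming sonar.web.systemPasscode is set to the same (hypothetical) value in sonar.properties.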

      About The Design

      The idea is to be able to get, at any point in time, the active nodes and their status from any Web Server, without depending on synchronous requests to each node. We will rely on Hazelcast as a shared memory:

      • Each node (sonar-application) stores its status every 10 seconds in Hazelcast (with time, IP/port and last start time).
      • To know the status of the cluster, the server which is queried gets from Hazelcast the list of active nodes with their status. The status of a node is not considered valid if it hasn't been updated in the past 20 seconds.
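
      The heartbeat mechanism described above can be sketched as follows. This is not Hazelcast code: a plain in-memory dict stands in for the shared map, and the class, method and constant names are hypothetical. Only the 10-second publication period and the 20-second validity window come from the design.

```python
import time

HEARTBEAT_PERIOD_S = 10   # each node is expected to call publish() this often
VALIDITY_WINDOW_S = 20    # entries older than this are ignored


class NodeRegistry:
    """In-memory stand-in for the shared map (assumption: a plain dict)."""

    def __init__(self, clock=time.time):
        self._entries = {}
        self._clock = clock

    def publish(self, node_id, status, host, port, started_at):
        """Store the node's current status together with the time of the update."""
        self._entries[node_id] = {
            "status": status,
            "host": host,
            "port": port,
            "started_at": started_at,
            "updated_at": self._clock(),
        }

    def active_nodes(self):
        """Return only the entries updated within the validity window."""
        now = self._clock()
        return {
            node_id: entry
            for node_id, entry in self._entries.items()
            if now - entry["updated_at"] <= VALIDITY_WINDOW_S
        }
```

      Because the validity window is twice the publication period, a node can miss one update without disappearing from the list of active nodes.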

      Other

      To remain backward compatible, the WS api/system/status will keep providing:

      • in case of a standalone SonarQube instance: the start-up status of the instance, with one of the following values: starting / up / down / restarting / db_migration_needed / db_migration_running.
      • in case of a SonarQube cluster: the start-up status of the application node which is queried.

              People

              Assignee:
              christophe.levis Christophe Levis
              Reporter:
              freddy.mallet Freddy Mallet (Inactive)