If a task is put back into pending status, the Compute Engine will attempt to process it again.
One reason a task is put back into pending status is that processing it crashes the Compute Engine JVM (for example with an OutOfMemoryError) or never completes. Since the worker which processed this task is gone, the task is reset to pending.
In both cases, chances are very high that the next Compute Engine worker attempting to process this task will suffer the same problem.
In several use cases, the retry mechanism has unwanted side effects.
A task causes an OOM in the Compute Engine and the JVM stops (call it CE1). In cluster mode, with the resilience mechanism, the other CE JVM (call it CE2) will reset this task to pending. If CE2 then attempts to process this task again, it will also end up in OOM, and no Compute Engine will be left to process any task.
A task takes too long to process and the pending tasks stack up. In cluster mode, ops can kill the Compute Engine JVM blocked processing this task. With the resilience mechanism, the same or another CE will process the task again and also get stuck on it.
Side effects also happen in standalone mode:
In the two cases described above, when SonarQube restarts, the Compute Engine will fail or get stuck again while processing the task.
The retry feature is limited to two attempts (hardcoded value), but this limit is not enough to prevent these side effects.
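To make the point concrete, here is a minimal sketch of the contamination effect. It is a toy model, not SonarQube code: the class and method names are hypothetical, and it assumes that a crashed Compute Engine JVM is not restarted and that a task which crashes one worker crashes every worker that picks it up. The limit is expressed as a number of re-executions after the first run, which is one possible reading of "two attempts".

```java
// Hypothetical toy model of the contamination effect; not SonarQube code.
class ContaminationSketch {

  // Returns how many workers are still alive once the offending task
  // has used up its first execution plus up to maxRetries re-executions.
  static int survivingWorkers(int workers, int maxRetries) {
    int alive = workers;
    int executions = 0;
    // The task runs once, then again for each allowed retry,
    // as long as a worker is left to pick it up.
    while (alive > 0 && executions <= maxRetries) {
      executions++;
      alive--; // the worker processing the offending task crashes (e.g. OOM)
    }
    return alive;
  }

  public static void main(String[] args) {
    System.out.println(survivingWorkers(2, 2)); // prints 0
    System.out.println(survivingWorkers(2, 1)); // prints 0
    System.out.println(survivingWorkers(2, 0)); // prints 1
  }
}
```

Under these assumptions, a two-node cluster loses both Compute Engines whether the limit is read as two retries or as two executions in total. Only a limit of zero retries leaves one worker alive.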
No task should be processed more than once, even when we are not in cluster mode. The contamination effect should be avoided at all costs.
It is accepted that some non-offending tasks may fail during planned restarts or upgrades of SonarQube.
We will therefore disable retries. (Under the hood, we will set the maximum number of retries to 0.)
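The effect of a retry limit of 0 can be sketched as follows. This is an illustrative model only: the class, the fields, and the recoverOrphans method are hypothetical names, not the actual Compute Engine implementation. It assumes that a recovery step looks at in-progress tasks whose worker is no longer alive and decides whether to reset them to pending or to fail them.

```java
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration; these are not SonarQube's real classes.
class OrphanTaskRecoverySketch {
  enum Status { PENDING, IN_PROGRESS, FAILED }

  static final class Task {
    final String id;
    final String workerId;              // worker that last picked the task up
    Status status = Status.IN_PROGRESS;
    int executionCount = 1;             // times the task has been started so far

    Task(String id, String workerId) {
      this.id = id;
      this.workerId = workerId;
    }
  }

  // Handles in-progress tasks whose worker is gone: they go back to PENDING
  // while retries remain, and are marked FAILED otherwise.
  static void recoverOrphans(List<Task> tasks, Set<String> liveWorkers, int maxRetries) {
    for (Task task : tasks) {
      boolean orphaned = task.status == Status.IN_PROGRESS
          && !liveWorkers.contains(task.workerId);
      if (!orphaned) {
        continue;
      }
      int retriesUsed = task.executionCount - 1;
      task.status = retriesUsed < maxRetries ? Status.PENDING : Status.FAILED;
    }
  }

  public static void main(String[] args) {
    Task task = new Task("T1", "CE1");
    // CE1 has crashed; only CE2 is alive. With maxRetries = 0 the task is failed.
    recoverOrphans(List.of(task), Set.of("CE2"), 0);
    System.out.println(task.id + " -> " + task.status); // prints T1 -> FAILED
  }
}
```

With the limit at 0, an orphaned task is never handed to another worker, so an offending task can take down at most one Compute Engine. The trade-off is the one accepted above: a harmless task that happens to be in progress during a restart is failed as well.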