https://openkruise.io/blog/learning-concurrent-reconciling/

# Concurrent Reconciling

When setting up a Kubernetes controller with controller-runtime, the controller can be configured to run reconciliations concurrently.

The _MaxConcurrentReconciles_ setting determines the maximum number of reconciliation processes that can run at the same time. In other words, it controls how many different instances can be reconciled in parallel. In controller-runtime it defaults to 1, so reconciliations run serially unless you raise it.

```go
func (r *CleanerReconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager,
	numOfWorker int, logger logr.Logger) error {

	return ctrl.NewControllerManagedBy(mgr).
		For(&appsv1alpha1.Cleaner{}).
		WithOptions(controller.Options{
			// numOfWorker reconcile loops may run for this controller at once
			MaxConcurrentReconciles: numOfWorker,
		}).
		Complete(r)
}
```

Imagine a Kubernetes controller responsible for managing a large number of resources. When these resources change frequently, a backlog of reconcile requests can build up. This can lead to delays in ensuring the desired state of your cluster.

Here's where concurrent reconciliation comes in. By running multiple reconcile loops simultaneously, the controller can process these requests much faster.

Now, the million-dollar question: can multiple reconcile loops work on the same resource at the same time? The answer is no. This is guaranteed by [client-go](https://github.com/kubernetes/client-go/blob/a57d0056dbf1d48baaf3cee876c123bea745591f/util/workqueue/queue.go#L65). Quoting from the documentation:

```go
// * Stingy: a single item will not be processed multiple times concurrently,
//   and if an item is added multiple times before it can be processed, it
//   will only be processed once.
// * Multiple consumers and producers. In particular, it is allowed for an
//   item to be reenqueued while it is being processed.
```

## Workqueue Implementation

The workqueue uses a `queue` and two sets: `dirty` and `processing`.

```go
// Type is a work queue (see the package comment).
type Type struct {
	// queue defines the order in which we will work on items. Every
	// element of queue should be in the dirty set and not in the
	// processing set.
	queue []t

	// dirty defines all of the items that need to be processed.
	dirty set

	// Things that are currently being processed are in the processing set.
	// These things may be simultaneously in the dirty set. When we finish
	// processing something and remove it from this set, we'll check if
	// it's in the dirty set, and if so, add it to the queue.
	processing set

	// ... (other fields omitted)
}
```

The `dirty` set plays a crucial role in deduplication. When a resource needs reconciliation, it's checked against the `dirty` set. If the resource (like our example resource _A_) is already present, it's not added again, even if it undergoes further changes in a short timeframe. This prevents redundant processing of the same resource for multiple, quick updates.

This deduplication capability is a core aspect of concurrent reconciliation. It ensures the controller focuses on the actual cluster state, obtained from the apiserver or local cache, rather than reacting to every individual event update. This streamlines the process and avoids unnecessary work.

<p align="center">
  <img alt="Queueing to Dirty Set" src="assets/queueing_to_dirty_set.png" width="600"/>
</p>

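The queue-plus-dirty-set behavior above can be sketched in a few lines of Go. This is a toy model for illustration only (the `enqueue` helper is invented here), not client-go's actual implementation:

```go
package main

import "fmt"

// enqueue adds name to the queue unless it is already marked dirty,
// mirroring how the workqueue collapses duplicate events for a resource.
func enqueue(queue []string, dirty map[string]bool, name string) []string {
	if dirty[name] {
		return queue // already waiting: drop the duplicate event
	}
	dirty[name] = true
	return append(queue, name)
}

func main() {
	queue := []string{}
	dirty := map[string]bool{}

	// Resource A changes three times in quick succession; B changes once.
	for _, name := range []string{"A", "A", "A", "B"} {
		queue = enqueue(queue, dirty, name)
	}

	// A appears only once: it will be reconciled a single time,
	// against the current state fetched from the cache or apiserver.
	fmt.Println(queue) // [A B]
}
```

Because the reconciler always reads the current state rather than the event payload, collapsing three changes of _A_ into one queue entry loses nothing.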
The `processing` set plays a vital role in preventing the same resource from being reconciled multiple times concurrently. After being added to the `dirty` set, a resource is only added to the `queue` if it's not already present in the `processing` set. The `processing` set essentially tracks resources that are currently undergoing reconciliation.

Consider this scenario: a resource named _C_ changes. If _C_ is already in the `processing` set (meaning it's being reconciled), it won't be added to the `queue` again. This prevents the controller from starting multiple reconciliations for the same resource simultaneously.

On the other hand, if a resource like _D_ changes and isn't currently being processed, it gets added to both the `dirty` set and the `queue`. This ensures _D_ gets picked up for reconciliation when a reconcile loop becomes available.

<p align="center">
  <img alt="Add to Queue" src="assets/add_to_queue.png" width="600"/>
</p>

When a reconcile loop is ready for a new task, it grabs the first item in the `queue`. The item is marked as being worked on by adding it to the `processing` set, and it's removed from the `dirty` set, indicating it's no longer waiting for reconciliation.

<p align="center">
  <img alt="Pick Next Resource to Reconcile" src="assets/pick_next_resource_to_reconcile.png" width="600"/>
</p>

In the above example, _A_ is picked for reconciliation: it is added to the `processing` set and removed from the `dirty` set.

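The whole Add/Get/Done cycle described above can be put together as a minimal, single-threaded sketch. The `toyQueue` type and its methods are invented here for illustration; the real client-go workqueue adds locking, condition variables, shutdown handling, and metrics:

```go
package main

import "fmt"

// toyQueue mirrors the structure of client-go's workqueue.Type:
// an ordered queue plus the dirty and processing sets.
type toyQueue struct {
	queue      []string
	dirty      map[string]bool
	processing map[string]bool
}

func newToyQueue() *toyQueue {
	return &toyQueue{dirty: map[string]bool{}, processing: map[string]bool{}}
}

// Add marks an item dirty and queues it, unless it is already dirty
// (deduplicated) or currently being processed (deferred until Done).
func (q *toyQueue) Add(name string) {
	if q.dirty[name] {
		return // already waiting: collapse the duplicate event
	}
	q.dirty[name] = true
	if q.processing[name] {
		return // being reconciled right now: Done will re-queue it
	}
	q.queue = append(q.queue, name)
}

// Get pops the next item, moving it from dirty to processing.
func (q *toyQueue) Get() (string, bool) {
	if len(q.queue) == 0 {
		return "", false
	}
	name := q.queue[0]
	q.queue = q.queue[1:]
	q.processing[name] = true
	delete(q.dirty, name)
	return name, true
}

// Done finishes an item; if it was re-added while being processed
// (it is dirty again), it goes back on the queue.
func (q *toyQueue) Done(name string) {
	delete(q.processing, name)
	if q.dirty[name] {
		q.queue = append(q.queue, name)
	}
}

func main() {
	q := newToyQueue()
	q.Add("A")
	q.Add("A")                // duplicate while waiting: deduplicated
	fmt.Println(len(q.queue)) // 1

	name, _ := q.Get()        // A moves to processing
	q.Add("A")                // changed mid-reconcile: not queued yet
	fmt.Println(len(q.queue)) // 0

	q.Done(name)              // finished; A was dirty again, so re-queued
	fmt.Println(len(q.queue)) // 1
}
```

Note how `Done` re-queues an item that was re-added mid-reconciliation: the change is not lost, it is simply deferred until the current pass finishes.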
## Conclusion

Concurrent reconciliation is a powerful feature of Kubernetes controllers that can significantly improve performance when managing a large number of frequently changing resources. By running multiple reconcile loops concurrently, the controller can process updates much faster and drive your resources to their desired state more quickly.

This approach avoids the delays caused by a single reconcile queue, while the workqueue's safeguards prevent conflicts: the same resource is never processed by more than one worker at a time, ensuring consistency and preventing race conditions.