import useBaseUrl from '@docusaurus/useBaseUrl';

## Nodes

### System Pool

AKS requires the configuration of a system node pool when creating a cluster. This system node pool is not like the other additional node pools: it is tightly coupled to the AKS cluster, and it is not possible to change the instance type or taints on this node pool without recreating the cluster, short of manual intervention. Additionally, the system node pool cannot scale down to zero. For AKS to work there has to be at least one instance present, because critical system pods like Tunnelfront or Konnectivity and CoreDNS will by default run on the system node pool. For more information about the AKS system node pool refer to the [official documentation](https://docs.microsoft.com/en-us/azure/aks/use-system-pools#system-and-user-node-pools).
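
Which pools are system pools can be checked directly with the Azure CLI. A minimal sketch, reusing the example cluster and resource group names from the commands further down:

```shell
# List the node pools in the cluster and show which of them are system pools
az aks nodepool list --cluster-name aks-dev-we-aks1 --resource-group rg-dev-we-aks --query "[].{name:name, mode:mode}" -o table
```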

XKS follows the Azure recommendation and runs only system-critical applications on the system node pool. Doing this protects services like CoreDNS from starvation or memory issues caused by user applications running on the same nodes. This is achieved by adding the taint `CriticalAddonsOnly` to all of the system nodes.
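
To verify that the taint is in place on the system nodes, something like the following can be used. This is a sketch that assumes the `kubernetes.azure.com/mode=system` label that AKS applies to system pool nodes:

```shell
# Print each system node together with the taint keys applied to it
kubectl get nodes -l kubernetes.azure.com/mode=system \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints[*].key}{"\n"}{end}'
```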

#### Sizing

Smaller AKS clusters can survive with a single node as the load on the system applications will be moderately low. In larger clusters and production clusters it is recommended to run at least three system nodes that may be larger in size. This section aims to describe how to properly size the system nodes.
More work has to be done in this area regarding sizing and scaling of the system node pools to achieve a standardized solution.

#### Modifying

There may come times when Terraform wants to recreate the AKS cluster after the system node pool has been updated. This happens when updating certain properties of the system node pool. It is still possible to do these updates without recreating the cluster, but it requires some manual intervention. AKS requires at least one system node pool but does not have an upper limit, which makes it possible to run a second, temporary system node pool while the original one is replaced.

> It may not be possible to create a new node pool with the current Kubernetes version if the cluster has not been updated in a while. Azure will remove minor versions as new versions are released. In that case you will need to upgrade the cluster to the latest minor version before making changes to the system pool, as AKS will not allow a node with a newer version than the control plane.

Delete the system node pool created by Terraform.

```shell
az aks nodepool delete --cluster-name aks-dev-we-aks1 --resource-group rg-dev-we-aks --name default
```

Create a new node pool with the new configuration. In this case it sets a new instance type and adds a taint.

```shell
az aks nodepool add --cluster-name aks-dev-we-aks1 --resource-group rg-dev-we-aks --name default --mode "System" --zones 1 2 3 --node-vm-size "Standard_D2as_v4" --node-taints "CriticalAddonsOnly=true:NoSchedule" --node-count 1
```

Delete the temporary pool.

```shell
az aks nodepool delete --cluster-name aks-dev-we-aks1 --resource-group rg-dev-we-aks --name temp
```

For additional information about updating the system nodes refer to [this blog post](https://pumpingco.de/blog/modify-aks-default-node-pool-in-terraform-without-redeploying-the-cluster/).

### Worker Pool

Worker node pools are all the other node pools in the cluster. The main purpose of the worker node pools is to run application workloads; they do not run any system-critical Pods. They will however run system Pods that are deployed from a DaemonSet, which includes applications like kube-proxy and CSI drivers.

All node pools created within XKF will have autoscaling enabled and be set to scale across all availability zones in the region. These settings cannot be changed; it is however possible to set a static number of instances by specifying the same value for the min and max count. XKF exposes few settings to configure the node instances, the main ones being the instance type, the min and max count, and the Kubernetes version. Other non-default node pool settings will not be exposed, as XKF is an opinionated solution. This also means that default settings may change in the future.

## Disk Type

XKF makes an opinionated choice with regards to the disk type. AKS has the option of using either managed disks or ephemeral storage. Managed disks offer the simplest solution: they can be sized according to requirements and are persisted across the node's whole life cycle. The downside of managed disks is that performance is limited, as the disks are not local to the VM; disk performance is instead based on the size of the disk. The standard size used by AKS for the managed OS disk is 128 GB, which makes it a [P10](https://azure.microsoft.com/en-us/pricing/details/managed-disks/) disk that will max out at 500 IOPS. It is important to remember that the OS disk is used by all processes: pulled OCI images, container logs, and ephemeral Kubernetes volumes all share the same disk performance. An application that, for example, serves a large number of requests and logs every HTTP request can consume large amounts of IOPS, as logs written to STDOUT will be written to disk. Another smaller downside with managed disks is that the disks are billed per GB on top of the VM cost, although this represents a very small percentage of the total AKS cost.

Ephemeral storage on the other hand offers higher IOPS out of the box, at the cost of not persisting data and an increased dependency on the VM type. This storage type uses the cache disk on the VM as storage for the OS and other kubelet-related resources. The size of the cache will vary based on the VM type and size, meaning that different node pools may have different amounts of storage available for, for example, ephemeral volumes. A general rule is however that the [cache disk has to be at least 30GB](https://docs.microsoft.com/en-us/azure/aks/cluster-configuration#use-ephemeral-os-on-existing-clusters), which removes some of the smallest VM sizes from the pool of possibilities. Remember that a cache disk of 30GB does not mean 30GB of free space, as the OS will consume some of that space. It may be wise to lean towards fewer, larger VMs instead of more, smaller VMs to increase the amount of disk available.

> On top of the cache disk, VMs also come with a temporary disk. This is an additional disk that is also local to the VM and shares IOPS with the cache. A preview feature in AKS is to use the temporary disk as the storage volume for the kubelet. This feature can be enabled with [kubelet_disk_type](http://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster_node_pool#kubelet_disk_type) and will most likely be used as soon as it is out of preview in AKS.

Instance type availability is currently not properly documented, partly because the feature is relatively new. Regional differences have been observed where ephemeral VMs may be available in one region but not in another for the same VM type and size. There is currently no proper way to determine which regions are available; instead this has to be done through trial and error. The same can be said about the cache disk size: some instance types have the cache size documented, others do not but will still work. Check the [VM sizes](https://docs.microsoft.com/en-us/azure/virtual-machines/sizes) documentation for availability information first. The cache size is given as the value in the parentheses in the "Max cached and temp storage throughput" column.

The following VM sizes have been verified to work with ephemeral disks in the West Europe region. Observe that this may not be true in other regions.

| VM | Cache Size |
| --- | --- |
| Standard_D4ds_v4 | 100GB |
| Standard_E2ds_v4 | 50GB |
| Standard_E4ds_v4 | 100GB |
| Standard_F8s_v2 | 128GB |

Being aware of the cache size is important because the OS disk size has to be specified for each node pool. The default value of 128 GB may be larger than the available cache, in which case the VM creation will fail. The OS disk size should be set to the same value as the cache size, as there is no other use for the cache disk than the OS. An alternative method of figuring out the max cache size is to use [this solution](https://www.danielstechblog.io/identify-the-max-capacity-of-ephemeral-os-disks-for-azure-vm-sizes/), which adds an API to query. Some testing of this API has however shown that the data is not valid for all VM types, and some VM types that do support ephemeral disks do not show up.
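
Another option is to query the resource SKUs directly with the Azure CLI. This is a sketch that assumes the `CachedDiskBytes` and `EphemeralOSDiskSupported` capabilities are reported for the VM size in question:

```shell
# Show the cache size in bytes and whether ephemeral OS disks are supported for a given VM size
az vm list-skus --location westeurope --size Standard_D4ds_v4 --resource-type virtualMachines \
  --query "[].{name:name, capabilities:capabilities[?name=='CachedDiskBytes' || name=='EphemeralOSDiskSupported']}" -o json
```
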
### Sizing

Choosing a starting point for the worker node pool can be difficult. There are a lot of factors that affect the choice of instance type, and they are not limited to memory or CPU consumption. An optimal setup may even include multiple node pools of different types to serve all needs. Unless there is prior analysis, the best starting point is a single node pool with a general-purpose instance type.

```hcl
additional_node_pools = [
  {
    name                 = "standard1"
    orchestrator_version = "<kubernetes-version>"
    vm_size              = "Standard_D2ds_v4"
    min_count            = 1
    max_count            = 3
    node_labels          = {}
    node_taints          = []
    os_disk_type         = "Ephemeral"
    os_disk_size_gb      = 50
    spot_enabled         = false
    spot_max_price       = null
  },
]
```

### Modifying

### Spot Instances
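
Spot node pools are configured through the same `additional_node_pools` schema shown under Sizing. The following is a rough sketch; verify the exact behaviour against the XKF module documentation before relying on it:

```hcl
additional_node_pools = [
  {
    name                 = "spot1"
    orchestrator_version = "<kubernetes-version>"
    vm_size              = "Standard_D2ds_v4"
    min_count            = 1
    max_count            = 3
    node_labels          = {}
    # AKS taints spot nodes with kubernetes.azure.com/scalesetpriority=spot:NoSchedule,
    # so only workloads with a matching toleration will be scheduled on this pool.
    node_taints          = ["kubernetes.azure.com/scalesetpriority=spot:NoSchedule"]
    os_disk_type         = "Ephemeral"
    os_disk_size_gb      = 50
    spot_enabled         = true
    # -1 caps the price at the current on-demand price
    spot_max_price       = -1
  },
]
```
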
## FAQ

### How can I watch my resources when patching an AKS cluster or upgrading nodes?

When patching an AKS cluster or just upgrading nodes it can be useful to watch your resources in Kubernetes.

```shell
# Show the node versions and watch the nodes during the upgrade
watch kubectl get nodes
# Check the status of all pods in the cluster
kubectl get pods -A
```

### What AKS versions can I pick in this Azure location?

```shell
az aks get-versions --location $AZURE_LOCATION -o table
```