K3s startup sequencing is inefficient #11896

Open
brandond opened this issue Mar 6, 2025 · 0 comments
brandond commented Mar 6, 2025

Some K3s components prefer to wait until other components are up and ready before running. For example:

  • The kubelet does not start until the container runtime is up.
    When using the K3s embedded executor, the kubelet also does not start until the apiserver is up; RKE2 sequences this differently in order to host the control plane as static pods.
  • The cloud-controller and controller-manager do not start until the apiserver is up.
  • The scheduler does not start until the apiserver is up and there is a schedulable node.
  • The agent doesn't start polling for endpoint updates until the apiserver is up.

In cases where there is a hard dependency, this is required to keep things from crashing; in other cases it is done simply to avoid excessive log spew as components retry.

The current way these waits are done is inefficient and could be improved:

  • We have a helper to wait on apiserver readiness, but we call it from four separate places instead of having a single readiness channel that things can block on (a sketch of the channel-based approach follows this list):
    if err := util.WaitForAPIServerReady(ctx, e.nodeConfig.AgentConfig.KubeConfigKubelet, util.DefaultAPIServerReadyTimeout); err != nil {
        logrus.Fatalf("Kubelet failed to wait for apiserver ready: %v", err)
    }

    if err := util.WaitForAPIServerReady(ctx, config.AgentConfig.KubeConfigKubelet, util.DefaultAPIServerReadyTimeout); err != nil {
        logrus.Fatalf("Tunnel watches failed to wait for apiserver ready: %v", err)
    }

    k3s/pkg/agent/run.go

    Lines 177 to 179 in d694dd1

    if err := util.WaitForAPIServerReady(ctx, nodeConfig.AgentConfig.KubeConfigKubelet, util.DefaultAPIServerReadyTimeout); err != nil {
        return pkgerrors.WithMessage(err, "failed to wait for apiserver ready")
    }

    case err := <-promise(func() error { return util.WaitForAPIServerReady(ctx, runtime.KubeConfigSupervisor, 30*time.Second) }):
        if err != nil {
            logrus.Infof("Waiting for API server to become available")
            continue
        }
  • The apiserver readiness helper doesn't currently provide any feedback on why the apiserver is not ready yet, due to a misunderstanding about how error handling works within the RestClient library (see the /readyz sketch after this list):

    k3s/pkg/util/api.go

    Lines 86 to 89 in d694dd1

    if rerr := result.Error(); rerr != nil {
        lastErr = pkgerrors.WithMessage(rerr, "failed to get apiserver /readyz status")
        return false, nil
    }
  • The server tries to wait on container runtime readiness before starting etcd (for RKE2 reasons), but this is currently done by passing a channel from the Server config struct to the Agent config struct; the agent closes the channel, if it is non-nil, to signal that the runtime is ready (the Executor sketch at the end of this issue would replace this):

    agentConfig.ContainerRuntimeReady = containerRuntimeReady

    k3s/pkg/agent/run.go

    Lines 162 to 168 in fb870ad

    // the container runtime is ready to host workloads when containerd is up and the airgap
    // images have finished loading, as that portion of startup may block for an arbitrary
    // amount of time depending on how long it takes to import whatever the user has placed
    // in the images directory.
    if cfg.ContainerRuntimeReady != nil {
        close(cfg.ContainerRuntimeReady)
    }

    k3s/pkg/etcd/etcd.go

    Lines 499 to 500 in fb870ad

    logrus.Infof("Waiting for container runtime to become ready before joining etcd cluster")
    case <-e.config.Runtime.ContainerRuntimeReady:
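One way to address the first point is to run the existing wait exactly once and hand every consumer the same channel to block on. This is only a minimal sketch, assuming the existing util.WaitForAPIServerReady helper; the APIServerReadyChan function and the package-level variables here are hypothetical, not current K3s code:

    import (
        "context"
        "sync"

        "github.com/k3s-io/k3s/pkg/util"
        "github.com/sirupsen/logrus"
    )

    var (
        apiserverReady     = make(chan struct{})
        apiserverReadyOnce sync.Once
    )

    // APIServerReadyChan starts the readiness wait on first use and returns a
    // channel that is closed once the apiserver is ready. Callers block on the
    // channel instead of each invoking util.WaitForAPIServerReady themselves.
    func APIServerReadyChan(ctx context.Context, kubeConfig string) <-chan struct{} {
        apiserverReadyOnce.Do(func() {
            go func() {
                if err := util.WaitForAPIServerReady(ctx, kubeConfig, util.DefaultAPIServerReadyTimeout); err != nil {
                    logrus.Fatalf("Failed to wait for apiserver ready: %v", err)
                }
                close(apiserverReady)
            }()
        })
        return apiserverReady
    }

With something like this in place, each of the four call sites above collapses to a single <-APIServerReadyChan(ctx, kubeConfig) receive, or a case in an existing select.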
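For the second point, /readyz returns a plain-text body listing the individual failing checks when queried with the verbose parameter, but that body is discarded if only result.Error() is inspected. A rough sketch of surfacing it, assuming a client-go rest.Interface; the checkReadyz name is hypothetical:

    import (
        "context"

        pkgerrors "github.com/pkg/errors"
        "k8s.io/client-go/rest"
    )

    // checkReadyz includes the verbose /readyz body in the returned error, so
    // the log shows which readiness checks are failing rather than just a
    // generic status error.
    func checkReadyz(ctx context.Context, restClient rest.Interface) error {
        // Result.Raw() returns the response body even when the request failed
        // with a non-2xx status; Result.Error() alone loses that detail.
        body, err := restClient.Get().AbsPath("/readyz").Param("verbose", "true").Do(ctx).Raw()
        if err != nil {
            return pkgerrors.WithMessagef(err, "apiserver /readyz reported not ready: %s", string(body))
        }
        return nil
    }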

All of this readiness signaling should be moved into the Executor interface. The Executor is responsible for running things; it should also provide functions that return channels which can be read to determine when the various components it starts are up.

With this done, several places that currently block startup of other functions, due to the difficulty of passing wait channels around, can be made async and simply block on readiness where necessary.
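As a rough illustration of the shape this could take (the method names are assumptions, not a finalized interface), the Executor would grow channel-returning accessors next to its existing run methods:

    // Hypothetical additions to the Executor interface.
    type Executor interface {
        // ...existing methods that start the kubelet, apiserver, etc...

        // APIServerReadyChan returns a channel that is closed once the
        // apiserver started by this Executor is passing /readyz.
        APIServerReadyChan() <-chan struct{}

        // CRIReadyChan returns a channel that is closed once the container
        // runtime is up and any airgap images have finished importing.
        CRIReadyChan() <-chan struct{}
    }

A consumer such as etcd could then block on runtime readiness without a channel being threaded through the Server and Agent config structs:

    select {
    case <-executor.CRIReadyChan():
    case <-ctx.Done():
        return ctx.Err()
    }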
