--goaway-chance
: The Parameter You Should Know to Keep Your Kubernetes API Server Stable π
Symptoms β οΈ
- The most recently started
kube-apiserver
receives very little traffic. - The number of active goroutines on that server is significantly lower than on the others.
- During upgrades or restarts, one server can become overloaded, taking over all the connections from a previously stopped instance, sometimes resulting in OOMKilled errors.π₯
Root Cause π§
The kube-apiserver
uses HTTP/2 and maintains persistent connections. Once established, these connections are never rotated or rebalanced, regardless of the load balancing algorithm being used (round-robin, IP hash, etc.).
As a result, if you have three API servers and one of them restarts, it will not receive any new connections automatically. The already established client connections remain bound to the older servers, leaving the newly restarted one idle. π€
The Solution: --goaway-chance
π οΈ
The --goaway-chance
parameter in kube-apiserver
is a powerful but often overlooked option that can dramatically improve control plane stability in large clusters. π‘
Official Description π
To prevent HTTP/2 clients from getting stuck on a single API server, this flag allows the server to randomly close a connection by sending a
GOAWAY
signal.
Ongoing requests will not be interrupted, and the client will reconnect, often hitting a different API server through the load balancer. π
- Range: from
0
(disabled) to0.02
(2% of requests). - Recommended starting point:
0.001
(0.1%, or 1 in every 1000 requests). - Warning: Do not enable this if you have a single API server or no load balancer in front. β οΈ
Configuration Example π§ͺ
Add the following flag to your kube-apiserver
startup configuration:
--goaway-chance=0.001
Expected Result π
Once enabled, you'll likely observe a balanced load across all your API servers. Here's a simple before/after illustration:
Before:
- server 1: 70% load
- server 2: 30%
- server 3: 0%
After enabling --goaway-chance:
- server 1: 33%
- server 2: 33%
- server 3: 33%
This helps prevent overloads and ensures better resource utilization across your control plane.
On the graph below, we can see the difference before and after applying the parametermage_caption
Conclusion π―
If you're running multiple kube-apiserver
instances behind a load balancer, the --goaway-chance
parameter is an essential tool to ensure load distribution and cluster stability. π
Itβs easy to configure, has minimal impact, and can prevent costly downtimes or crashes during maintenance operations. β±οΈ
Donβt overlook it. πͺ