Cluster | KEDA

Guidance & requirements for running KEDA in your cluster

Warning

You are currently viewing v2.4 of the documentation and it is not the latest. For the most recent documentation, kindly click here.

Requirements

Kubernetes

KEDA is community-supported. For all support options, see the Support page.

KEDA v2.4 has been tested on Kubernetes v1.16.0 or above.

Cluster Capacity

The KEDA runtime require the following resources in a production-ready setup:

Deployment	CPU	Memory
Metrics Server	Limit: 1, Request: 100m	Limit: 1000Mi, Request: 100Mi
Operator	Limit: 1, Request: 100m	Limit: 1000Mi, Request: 100Mi

These are used by default when deploying through YAML.

💡 For more info on CPU and Memory resource units and their meaning, see this link.

Firewall

KEDA requires to be accessible inside the cluster to be able to autoscale.

Here is an overview of the required ports that need to be accessible for KEDA to work:

Port	Why?	Remarks
`443`	Used by Kubernetes API server to get metrics	Required for all platforms because it uses Control Plane → port 443 on the Service IP range communication. This is not applicable for Google Cloud.
`6443`	Used by Kubernetes API server to get metrics	Only required for Google Cloud because it uses Control Plane → port 6443 on the Pod IP range for communication

High Availability

KEDA does not provide full support for high-availability due to upstream limitations.

Here is an overview of all KEDA deployments and the HA notes:

Deployment	Support Replicas	Note
Metrics Server	1	You can run multiple replicas of our metrics sever, and it is recommended to add the `--enable-aggregator-routing=true` CLI flag to the kube-apiserver so that requests sent to our metrics servers are load balanced. However, you can only run one active metric server in a Kubernetes cluster serving external.metrics.k8s.io which has to be the KEDA metric server.
Operator	2	While you can run multiple replicas of our operator, only one operator instance will be active. The rest will be standing by, which may reduce downtime during a failure. Multiple replicas will not improve the performance of KEDA, it could only reduce a downtime during a failover.This is only supported as of KEDA v2.6 if you are using our Helm chart.

HTTP Timeouts

Some scalers issue HTTP requests to external servers (i.e. cloud services). Each applicable scaler uses its own dedicated HTTP client with its own connection pool, and by default each client is set to time out any HTTP request after 3 seconds.

You can override this default by setting the KEDA_HTTP_DEFAULT_TIMEOUT environment variable to your desired timeout in milliseconds. For example, on Linux/Mac/Windows WSL2 operating systems, you’d use this command to set to 1 second:

export KEDA_HTTP_DEFAULT_TIMEOUT=1000

And on Windows Powershell, you’d use this command:

$env:KEDA_HTTP_DEFAULT_TIMEOUT=1000

All applicable scalers will use this timeout. Setting a per-scaler timeout is currently unsupported.

Kubernetes Client Parameters

The Kubernetes client config used within KEDA Metrics Adapter can be adjusted by passing the following command-line flags to the binary:

Adapter Flag	Client Config Setting	Default Value	Description
kube-api-qps	cfg.QPS	20.0	Set the QPS rate for throttling requests sent to the apiserver
kube-api-burst	cfg.Burst	30	Set the burst for throttling requests sent to the apiserver