Warning
You are viewing v0.15 of the documentation, which is not the latest version.
Configure Scaling Metrics
Autoscaling metric configuration on an InterceptorRoute
The scalingMetric field on an InterceptorRoute determines what metric drives autoscaling.
You can scale based on concurrent request count, request rate, or both.
At least one metric must be set.
Scale based on the number of in-flight requests per replica:
apiVersion: http.keda.sh/v1beta1
kind: InterceptorRoute
metadata:
  name: my-app
spec:
  target:
    service: <your-service>
    port: <your-port>
  scalingMetric:
    concurrency:
      targetValue: 100
The add-on targets targetValue concurrent requests per replica.
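When the total number of concurrent requests across all replicas exceeds replicas * targetValue, KEDA scales up.
For example, with 3 replicas and a targetValue of 100, the threshold is 300 concurrent requests.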
| Field | Required | Description |
|---|---|---|
| targetValue | Yes | Target concurrent request count per replica. |
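The relationship between total in-flight requests and replica count can be sketched as a ceiling division. This is a minimal illustration, assuming the usual average-value style of calculation; the function name `desired_replicas` is hypothetical and is not part of the add-on's API.

```python
def desired_replicas(total_in_flight: int, target_value: int) -> int:
    """Smallest replica count that keeps per-replica concurrency at or below target_value.

    Illustrative assumption: replicas = ceil(total_in_flight / target_value).
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    if total_in_flight < 0:
        raise ValueError("total_in_flight must not be negative")
    # Integer ceiling division avoids floating-point rounding.
    return (total_in_flight + target_value - 1) // target_value


if __name__ == "__main__":
    # 250 concurrent requests with targetValue: 100 need three replicas.
    print(desired_replicas(250, 100))
```

The result is not yet bounded by any minimum or maximum replica count; those limits are applied separately.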
Scale based on requests per second, averaged over a sliding window:
apiVersion: http.keda.sh/v1beta1
kind: InterceptorRoute
metadata:
  name: my-app
spec:
  target:
    service: <your-service>
    port: <your-port>
  scalingMetric:
    requestRate:
      targetValue: 100
      window: 1m
      granularity: 1s
| Field | Required | Default | Description |
|---|---|---|---|
| targetValue | Yes | - | Target requests per second per replica. |
| window | No | 1m | Sliding time window over which the average request rate is calculated. |
| granularity | No | 1s | Bucket size within the window. A smaller granularity gives more responsive scaling at the cost of higher sensitivity to bursts. |
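To make the interaction between window and granularity concrete, here is a small sketch of a bucketed sliding-window rate counter. It is an illustration under stated assumptions, not the add-on's implementation: the class `SlidingWindowRate` and its methods are hypothetical, timestamps are passed in explicitly as seconds, and they are assumed to be non-decreasing.

```python
from collections import deque


class SlidingWindowRate:
    """Average requests per second over a sliding window of fixed-size buckets."""

    def __init__(self, window_s: float = 60.0, granularity_s: float = 1.0) -> None:
        if granularity_s <= 0 or window_s < granularity_s:
            raise ValueError("need 0 < granularity_s <= window_s")
        self.window_s = window_s
        self.granularity_s = granularity_s
        # Number of buckets that make up one full window.
        self.num_buckets = max(1, round(window_s / granularity_s))
        # Each entry is [bucket_index, request_count], oldest first.
        self._buckets = deque()

    def _bucket_index(self, now_s: float) -> int:
        return int(now_s // self.granularity_s)

    def _evict(self, now_s: float) -> None:
        # Drop buckets that have slid out of the window.
        oldest_allowed = self._bucket_index(now_s) - self.num_buckets + 1
        while self._buckets and self._buckets[0][0] < oldest_allowed:
            self._buckets.popleft()

    def record(self, now_s: float, count: int = 1) -> None:
        """Add `count` requests observed at time `now_s`."""
        self._evict(now_s)
        idx = self._bucket_index(now_s)
        if self._buckets and self._buckets[-1][0] == idx:
            self._buckets[-1][1] += count
        else:
            self._buckets.append([idx, count])

    def rate(self, now_s: float) -> float:
        """Average requests per second over the window ending at `now_s`."""
        self._evict(now_s)
        return sum(count for _, count in self._buckets) / self.window_s
```

In this sketch the rate only changes when a whole bucket enters or leaves the window, so a smaller granularity makes the reported rate move in finer steps.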
An InterceptorRoute can set both concurrency and requestRate.
KEDA scales to whichever metric demands more replicas.
apiVersion: http.keda.sh/v1beta1
kind: InterceptorRoute
metadata:
  name: my-app
spec:
  target:
    service: <your-service>
    port: <your-port>
  scalingMetric:
    concurrency:
      targetValue: 50
    requestRate:
      targetValue: 200
This is useful when you want to handle both sustained throughput (rate) and bursty traffic (concurrency).
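The "whichever metric demands more replicas" rule can be sketched as taking the maximum of two per-metric results. The helper names `replicas_for` and `combined_replicas` are hypothetical, and the ceiling-based calculation is an assumption made for illustration.

```python
import math


def replicas_for(metric_value: float, target_value: float) -> int:
    """Smallest replica count that keeps the per-replica value at or below the target."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    return math.ceil(metric_value / target_value)


def combined_replicas(
    in_flight: float,
    concurrency_target: float,
    requests_per_second: float,
    rate_target: float,
) -> int:
    """Take the larger of the two per-metric replica counts."""
    by_concurrency = replicas_for(in_flight, concurrency_target)
    by_rate = replicas_for(requests_per_second, rate_target)
    return max(by_concurrency, by_rate)


if __name__ == "__main__":
    # Targets from the example above: concurrency 50, request rate 200.
    # 120 in-flight requests ask for 3 replicas, 900 req/s ask for 5.
    print(combined_replicas(120, 50, 900, 200))
```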
Minimum and maximum replica counts and cooldown are set on the KEDA ScaledObject, not the InterceptorRoute:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app
spec:
  scaleTargetRef:
    name: <your-deployment>
  minReplicaCount: 0 # 0 enables scale-to-zero
  maxReplicaCount: 10
  cooldownPeriod: 300 # seconds before scaling to zero after traffic stops
Setting minReplicaCount: 0 enables scale-to-zero.
The cooldownPeriod controls how long KEDA waits after the last request before scaling the workload down to zero replicas.
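As a rough mental model, the replica bounds clamp whatever the metrics ask for, and the cooldown gates the final step down to zero. The sketch below assumes exactly that simplified behavior; `bounded_replicas` and `may_scale_to_zero` are hypothetical helpers, and the defaults mirror the example values above.

```python
def bounded_replicas(
    desired: int, min_replica_count: int = 0, max_replica_count: int = 10
) -> int:
    """Clamp a metric-driven replica count to the configured bounds."""
    if min_replica_count > max_replica_count:
        raise ValueError("min_replica_count must not exceed max_replica_count")
    return max(min_replica_count, min(max_replica_count, desired))


def may_scale_to_zero(
    seconds_since_last_request: float,
    cooldown_period: float = 300.0,
    min_replica_count: int = 0,
) -> bool:
    """True once the cooldown has elapsed and scale-to-zero is enabled."""
    if min_replica_count > 0:
        # A non-zero minimum keeps at least that many replicas running.
        return False
    return seconds_since_last_request >= cooldown_period
```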