Introducing PredictKube - an AI-based predictive autoscaler for KEDA made by Dysnix
Daniel Yavorovych (Dysnix), Yuriy Khoma (Dysnix), Zbynek Roubalik (KEDA), Tom Kerkhove (KEDA)
February 14, 2022
Dysnix has been working with high-traffic backend systems for a long time,
and its team faces the demand for efficient scaling every day.
The engineers learned that handling traffic fluctuations and infrastructure preparation manually is
inefficient, because you need to deploy extra resources before traffic increases,
not at the moment it happens. Reactive scaling is problematic for two reasons: first, it is often too late to scale once traffic has already arrived; and second, resources sit overprovisioned and idle while that traffic isn't present.
When deciding how to package this solution, Dysnix chose to build on KEDA, as it is the most
universal and widely applicable component for application autoscaling in Kubernetes.
KEDA acts as the client-side component of PredictKube, responsible for transferring requests
and scaling replicas.
Dysnix’s PredictKube integrates with KEDA
Dysnix has built PredictKube, a solution that can be used as a KEDA scaler responsible
for resource balancing, backed by an AI model that has learned to react proactively to patterns of traffic activity.
Together they help with both in-time scaling and the problem of overprovisioning.
Predictive autoscaling is possible thanks to an AI model that observes a project's requests-per-second (RPS)
or CPU values over a period of time and then forecasts the trend for up to six hours ahead.
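Conceptually, the forecast-then-scale idea can be sketched in a few lines of Python. This is a naive linear-trend stand-in for PredictKube's actual ML model (which is not published), just to show how a forecast of RPS turns into a proactive replica count; the per-pod capacity is an illustrative assumption:

```python
import math

def forecast_rps(history, horizon):
    """Naive linear-trend forecast: fit a least-squares line to the
    observed RPS history and extrapolate `horizon` steps ahead.
    (A conceptual stand-in for PredictKube's ML model.)"""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n - 1 + h) for h in range(1, horizon + 1)]

def replicas_needed(predicted_rps, rps_per_pod):
    """Scale proactively: provision enough pods for the forecast peak."""
    return max(1, math.ceil(max(predicted_rps) / rps_per_pod))

history = [100, 120, 140, 160, 180, 200]      # RPS samples, rising trend
forecast = forecast_rps(history, horizon=3)    # next 3 intervals: 220, 240, 260
print(replicas_needed(forecast, rps_per_pod=50))  # → 6
```

A real predictor replaces the straight line with a model that captures daily and weekly seasonality, which is why more history (two or more weeks) improves the forecast.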
PredictKube was trained on customer data and open data sources (such as the NASA HTTP logs data set) so the model would be specific to cloud data and traffic trends.
With this tool, Dysnix wants to decrease project costs, analyze traffic data more efficiently,
use cloud resources more responsibly, and build infrastructures that are "greener" and more performant
(with fewer downtimes and delays) than others.
How does PredictKube work?
PredictKube works in two parts:
On the KEDA side
The interface connects via API to the data sources that describe your traffic.
PredictKube uses Prometheus—the industry standard for storing metrics.
The traffic data is anonymized on the client's side before being sent to the API,
so the model works only with completely impersonal information.
On the AI model side
Next, the prediction mechanism is linked in: the AI model starts receiving data about what happens in
your cloud project. Unlike standard rule-based algorithms such as the Horizontal Pod Autoscaler (HPA),
PredictKube uses machine-learning models to predict time-series data such as CPU or RPS metrics.
The more data you can provide from the start, the more precise the prediction will be; two or more weeks of data is enough to begin with.
The rest is up to you! You can visualize the prediction trend in, for example, Grafana.
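For reference, a ScaledObject that uses the PredictKube scaler might look like the following sketch. The deployment name, Prometheus address, query, and threshold are illustrative placeholders; check the KEDA scaler documentation for the authoritative field list:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example
spec:
  scaleTargetRef:
    name: example-app            # the Deployment to scale (illustrative)
  pollingInterval: 30
  cooldownPeriod: 7200           # a 2-hour cooldown period
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: predictkube
      metadata:
        predictHorizon: "2h"                        # how far ahead to forecast
        historyTimeWindow: "7d"                     # history fed to the model
        prometheusAddress: http://prometheus.monitoring:9090
        query: sum(rate(http_requests_total[2m]))   # RPS metric to predict
        queryStep: "2m"
        threshold: "100"                            # target RPS per replica
      authenticationRef:
        name: keda-trigger-auth-predictkube         # holds the PredictKube API key
```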
To check the configuration and status of the scaling created in the previous step, use the following command:
$ kubectl get scaledobject example
To see the stats the scaling is based on, inspect the underlying HPA with the following command:
$ kubectl get hpa example
Now you can see how scaling works on a graph in your visualization tool.
This is an example of a graph Dysnix got in one of their projects after enabling PredictKube:
The graph shows stats for an environment with a 2-hour cooldown period.
The green trend shows the predicted number of replicas, the yellow one the replicas ready at a given moment,
and the blue trend the ideal—the smallest number of replicas that covers the RPS trend.
If you need a template of such a dashboard to make your own, feel free to contact Daniel to get one.
After everything is connected and deployed, you’ll be able to change the time frame you’re observing or just monitor the data as it comes.
With this release, Dysnix has created the first milestone of predictive autoscaling for Kubernetes workloads.
The team hopes you'll find it interesting and will help test and improve it.
If you have any questions about the core functionality of PredictKube,
you can contact the developers' team here.
For all KEDA-related issues, share your feedback via GitHub.
In the future, PredictKube plans to add integrations with more data sources so autoscaling
can be driven by other aspects of a project's configuration. There is also an idea to implement
event-based predictive scaling, so it can react not only to a trend but also to the appearance of an event.
You can contact the Dysnix team with any questions concerning the mechanics of PredictKube.
The following people will be happy to help: