How to scale Rasa X to handle more concurrent users

Rasa X: 0.39.2, Rasa: 2.6.0-full

Server:

4 cores, 16 GiB RAM, 150 GB SSD

During testing we saw one of the Rasa X components become CPU-bound, which limits performance. I do not recall which component it was.

Our main question for the Rasa X team is: how do we scale Rasa X to handle more concurrent users?

We used Helm to install Rasa X on Kubernetes. Which Deployments or StatefulSets need to be scaled up? And does this require us to reconfigure anything else besides scaling out more pods?

Thanks in advance!

You can increase the number of replicas for your pods. In my opinion, these are the most useful pods to replicate:

| Service | Role |
| --- | --- |
| rasa-x | Running the HTTP API |
| rasa-production | Running a trained model, parsing intents, predicting actions |
| rasa-worker | Training and evaluating models |
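As a rough illustration of what "increasing the replicas" could look like in practice, here is a minimal sketch that builds `kubectl scale` commands for the services listed above. The release name `my-release`, the namespace `rasa`, and the `<release>-<service>` naming of the Deployments are assumptions for the example, not something confirmed in this thread; check the real names with `kubectl get deployments -n <namespace>` first.

```python
import shlex
from typing import Dict, List


def build_scale_commands(
    release: str, namespace: str, replicas: Dict[str, int]
) -> List[List[str]]:
    """Build one `kubectl scale` command per service.

    Assumption: each Deployment is named "<release>-<service>". This is a
    hypothetical naming scheme; verify it against your own cluster.
    """
    commands = []
    for service, count in replicas.items():
        if count < 1:
            raise ValueError(f"replica count for {service} must be >= 1")
        commands.append([
            "kubectl", "scale", "deployment", f"{release}-{service}",
            "--namespace", namespace,
            f"--replicas={count}",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical release name, namespace, and replica counts.
    wanted = {"rasa-production": 3, "rasa-worker": 2}
    for cmd in build_scale_commands("my-release", "rasa", wanted):
        # Print only; run the commands yourself after reviewing them.
        print(shlex.join(cmd))
```

The script only prints the commands so they can be reviewed before anything touches the cluster. Since the installation was done with Helm, a later `helm upgrade` will set the replica counts back to whatever the chart values specify, so a permanent change belongs in your values file rather than in ad hoc `kubectl scale` calls. If the cluster has the Metrics Server installed, `kubectl top pods -n <namespace>` can help identify which pod is actually CPU-bound before you decide what to scale.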

You can learn more in the Rasa Advanced Deployment Workshop.