Deploying RASA in production and scalability

Hi all,

I am trying to deploy RASA and RASA X in production for the first time and I would like to understand the correct way of approaching this.

  1. As I am expecting about 200 concurrent users, scalability is important and I understand the best way of doing this is by deploying using Kubernetes as described here. However, may I check if there is a difference between installing using Server Quick-Install or Helm Chart method? If yes, which one is better?

  2. If I deploy RASA using Docker Compose, is it sufficient to scale by adding more containers as and when needed? Deploying using Docker Compose can reduce my server requirements but I am worried about scalability.

  3. I am planning to get RASA Enterprise. How many chatbots can I create using 1 license? I am trying to estimate the number of licenses needed.

Any guidance would be greatly appreciated.

Thank you!

Hello Jason,

The Quick-Install method is a script that installs Kubernetes, Helm, and the Rasa Helm Chart for you. In other words, it is the Helm Chart method with the default settings applied.

To customize your Quick-Install deployment, you can either set environment variables like here, or follow the Helm Chart method and update the values.yml file like here.

So in my opinion Quick-Install is the better choice: it is easy to install and test, especially if the defaults suit you, and since it is still a Helm Chart underneath, it is just as flexible as the Helm Chart method.

To handle more load, you can increase the number of replicas for your pods, such as the rasa-x and rasa-production pods.

| Service | Role |
| --- | --- |
| rasa-x | Running the HTTP API |
| rasa-production | Running a trained model, parsing intents, predicting actions |
| rasa-worker | Training and evaluating models |
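
As a rough illustration of increasing replicas, here is a minimal Python sketch that only builds the `kubectl scale` commands and prints them, so you can review them before running anything. The deployment names, the `rasa` namespace, and the replica counts are assumptions for the example; in a real Helm release the deployments are usually prefixed with the release name, so check yours with `kubectl get deployments` first.

```python
import shlex


def scale_command(deployment: str, replicas: int, namespace: str = "rasa") -> list[str]:
    """Build (but do not run) a kubectl command that sets a deployment's replica count."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return [
        "kubectl", "scale", f"deployment/{deployment}",
        f"--replicas={replicas}",
        "--namespace", namespace,
    ]


# Hypothetical deployment names and replica counts, not taken from a real cluster.
targets = {"rasa-x": 2, "rasa-production": 3}

if __name__ == "__main__":
    for name, count in targets.items():
        # Print a copy-pasteable shell command for each deployment.
        print(shlex.join(scale_command(name, count)))
```

Keep in mind that a manual `kubectl scale` is typically reset by the next `helm upgrade`, so for a lasting change the replica count should also go into your values.yml.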

You can learn more in the Rasa Advanced Deployment Workshop.

I don’t know much about the Docker Compose method, but the same logic as for the pods/services above probably applies. If a container is under heavy load, try replicating it.

I haven’t used Rasa Enterprise, but as far as I know you get a custom quote depending on your needs. You can contact their sales team and ask them :slight_smile:

Docker Compose and Quick-Install (which uses Rancher’s k3s, a lightweight Kubernetes distribution) both run on a single host, regardless of how many container replicas you configure.

If a single host/VM does not have enough capacity for all your concurrent users and you need to scale up or down quickly, I would go for a Kubernetes cluster running across several hosts, installed with the Helm chart, and start by modifying the values.yml file to increase the container replicas, as @ChrisRahme mentioned.
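
To get a feel for whether one host is enough, a back-of-the-envelope estimate like the one below can help. It is only a sketch: the per-replica capacity (50 users), the 25% headroom, and the 2 replicas per host are made-up placeholder numbers, not Rasa benchmarks, so replace them with figures from your own load tests.

```python
import math


def replicas_needed(concurrent_users: int, users_per_replica: int, headroom: float = 0.25) -> int:
    """Replicas required to serve the users, with extra headroom for traffic spikes."""
    if users_per_replica <= 0:
        raise ValueError("users_per_replica must be positive")
    return math.ceil(concurrent_users * (1 + headroom) / users_per_replica)


def hosts_needed(replicas: int, replicas_per_host: int) -> int:
    """Hosts required if each host can run at most replicas_per_host replicas."""
    if replicas_per_host <= 0:
        raise ValueError("replicas_per_host must be positive")
    return math.ceil(replicas / replicas_per_host)


if __name__ == "__main__":
    # 200 concurrent users comes from the question; the other numbers are assumptions.
    replicas = replicas_needed(concurrent_users=200, users_per_replica=50)
    hosts = hosts_needed(replicas, replicas_per_host=2)
    print(f"replicas: {replicas}, hosts: {hosts}")
```

If the estimate comes out at more than one host, a single-host setup (Docker Compose or Quick-Install) will not be enough and a multi-host cluster is the safer choice.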