I would like to know what extra tools you would recommend using together with Rasa to support more than 200,000 requests per month.
So far I have only built projects combining Rasa + Flask + ngrok, but that setup does not seem ideal for a large project.
I have heard about Amazon Elastic Beanstalk. Can you share some combinations that make Rasa scalable, and mention any bottlenecks I should worry about?
Hi @kaleming, have you taken a look at Rasa X yet? It comes with a number of production-ready deployment options. For a scalable deployment, I would suggest the Kubernetes option. That will handle 200,000 requests per month with no problem, and you can always scale it up if you get more requests.
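For a rough sense of scale, 200,000 requests per month is a low average rate, so peak concurrency usually matters more than the monthly total. Here is a back-of-the-envelope sketch. The 30-day month, the 8 active hours per day, and the 5x burst factor are illustrative assumptions, not figures from this thread:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rps(requests_per_month: int, days: int = 30) -> float:
    """Average requests per second if traffic were spread evenly over the month."""
    return requests_per_month / (days * SECONDS_PER_DAY)


def peak_rps(
    requests_per_month: int,
    days: int = 30,
    active_hours_per_day: float = 8.0,  # assumption: traffic only arrives in these hours
    burst_factor: float = 5.0,          # assumption: peaks are 5x the active-hours average
) -> float:
    """Rough peak requests per second under the stated assumptions."""
    active_seconds = days * active_hours_per_day * 60 * 60
    return burst_factor * requests_per_month / active_seconds


if __name__ == "__main__":
    print(f"average: {average_rps(200_000):.3f} req/s")
    print(f"rough peak: {peak_rps(200_000):.3f} req/s")
```

Even with generous burst assumptions this stays around one request per second, so it is worth estimating your own peak from real traffic patterns rather than sizing from the monthly number.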
I have successfully deployed Rasa X to Kubernetes and am running into scalability issues. The bot works fine with a couple of users, but with more than 10-20 concurrent users it really struggles, with response times of 20s or more. I have checked the pods and their CPU utilization and cannot pinpoint the bottleneck. The only concerns I have seen mentioned online relate to the tracker store, but I assumed the Kubernetes setup would be scalable out of the box. For reference, I am using the AWS EKS setup. Any ideas would be much appreciated.
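One way to narrow this down is to measure how response time changes as the number of simultaneous users grows. Below is a minimal sketch using only the Python standard library. It assumes the REST channel is enabled and reachable at `/webhooks/rest/webhook` on your deployment. The helper names (`make_rest_sender`, `measure_latencies`, `summarize`) and the sender IDs are made up for this example:

```python
import json
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def make_rest_sender(base_url: str, timeout: float = 60.0):
    """Return a send(sender_id, text) function that posts to the REST channel.

    Assumption: the REST channel is enabled and exposed at this path.
    """
    url = base_url.rstrip("/") + "/webhooks/rest/webhook"

    def send(sender_id: str, text: str) -> None:
        body = json.dumps({"sender": sender_id, "message": text}).encode("utf-8")
        request = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()  # wait for the full bot reply

    return send


def measure_latencies(send, n_users: int, messages_per_user: int = 3, text: str = "hello"):
    """Simulate n_users concurrent conversations and return per-message latencies in seconds."""

    def one_user(index: int):
        timings = []
        for _ in range(messages_per_user):
            start = time.perf_counter()
            send(f"load-test-user-{index}", text)
            timings.append(time.perf_counter() - start)
        return timings

    # One thread per simulated user, so all conversations run at the same time.
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        per_user = list(pool.map(one_user, range(n_users)))
    return [t for timings in per_user for t in timings]


def summarize(latencies):
    """Return count, median, approximate 95th percentile, and max latency."""
    ordered = sorted(latencies)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "count": len(ordered),
        "median_s": statistics.median(ordered),
        "p95_s": ordered[p95_index],
        "max_s": ordered[-1],
    }
```

To use it, build a sender with `make_rest_sender("http://<your-host>")` and call `summarize(measure_latencies(send, n))` for `n` = 1, 5, and 20. If the median grows roughly in proportion to the number of users while CPU stays low, requests are probably being handled one at a time somewhere (for example a single worker, a slow custom action, or a lock or tracker store wait) rather than the cluster being out of capacity. This only locates the symptom; it does not tell you which component is responsible.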