Is the Rasa server running in a multi-threaded way?

@YoungXu06, I can pretty much guarantee you’re never going to reach 400 requests per second with that type of user base. If you think about it, 20 requests per second equates to roughly 250 concurrent users (making some assumptions about how frequently they’re sending messages). We have customers deployed to millions of people, and they’re nowhere near needing to handle 400 requests per second.
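To make that back-of-the-envelope math concrete, here is a rough sketch in Python. The 12.5-second gap between messages is purely my assumption, picked so the numbers line up with the 20 requests per second and 250 users above; plug in whatever interval you actually observe. The function name is just illustrative.

```python
def concurrent_users(requests_per_second: float, seconds_between_messages: float) -> float:
    """Estimate how many active users produce a given request rate.

    Assumes each active user sends one message every
    `seconds_between_messages` seconds (an assumed figure, not measured).
    """
    return requests_per_second * seconds_between_messages


# Assumed interval: one message per user every 12.5 seconds.
print(concurrent_users(20, 12.5))   # 250.0
print(concurrent_users(400, 12.5))  # 5000.0
```

Under that assumption, 400 requests per second would mean around 5,000 people typing to the bot at the same moment, which is why it's such an unlikely number to hit.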

I’d suggest deploying the bot to part of your user base first to gauge how much volume you’ll be getting, and then deciding how many Rasa pods you need from there. Autoscaling in Kubernetes might also be something to consider.
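Once you have a measured peak from that partial rollout, sizing is mostly simple arithmetic. A minimal sketch, assuming you have load-tested a single pod yourself: the per-pod throughput and the 1.5x headroom factor below are hypothetical placeholders, not Rasa benchmarks, and `pods_needed` is just a name I made up.

```python
import math


def pods_needed(peak_rps: float, per_pod_rps: float, headroom: float = 1.5) -> int:
    """Rough pod count for a measured peak request rate.

    peak_rps:    peak requests per second seen during the partial rollout
    per_pod_rps: requests per second one pod handled in your own load test
    headroom:    safety multiplier for spikes (assumed value, tune to taste)
    """
    if per_pod_rps <= 0:
        raise ValueError("per_pod_rps must be positive")
    # Always keep at least one pod running, even at zero traffic.
    return max(1, math.ceil(peak_rps * headroom / per_pod_rps))


# Hypothetical numbers: 20 req/s peak, 10 req/s per pod from a load test.
print(pods_needed(20, 10))  # 3
```

If you go the autoscaling route, a number like this is a reasonable starting point for the replica bounds rather than a fixed deployment size.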
