Rasa is not able to handle more than 4/5 concurrent requests

In a thread of mine I’ve looked at profiling the performance of Rasa: Performance of a Production bot

The performance does depend on which tracker store you choose, as well as the size of the current conversation. So what tracker store are you using? Also, what resources are you allocating to the various components?

4/5 does seem very low, although that is a count of concurrent requests rather than a throughput figure. What is the response time and throughput that you’re seeing before things start to slow down?
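If you don't have those numbers yet, something like the following could give a rough idea. It is only a minimal sketch: it assumes the REST channel is enabled and reachable at the default local address, and `send_message`, `run_load_test` and `RASA_URL` are names I made up for illustration. Each request uses a different sender ID, so every call starts a fresh conversation; longer conversations may behave differently.

```python
import json
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: REST channel enabled on a default local Rasa server.
RASA_URL = "http://localhost:5005/webhooks/rest/webhook"


def send_message(sender_id, text="hello", url=RASA_URL):
    """POST one message to the REST channel and return the latency in seconds."""
    payload = json.dumps({"sender": sender_id, "message": text}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=30) as response:
        response.read()  # wait for the full reply before stopping the clock
    return time.perf_counter() - start


def run_load_test(concurrency, total_requests, send=send_message):
    """Send total_requests messages using `concurrency` worker threads."""
    sender_ids = [f"load-test-user-{i}" for i in range(total_requests)]
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(send, sender_ids))
    elapsed = time.perf_counter() - started
    return {
        "concurrency": concurrency,
        "requests": total_requests,
        "throughput_rps": total_requests / elapsed,
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }
```

Calling `run_load_test(concurrency=c, total_requests=50)` for a few values of `c` (say 1, 2, 5 and 10) and comparing the median latency and throughput should show where things start to degrade.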

Are you only running one instance of Rasa? A possible solution could be to run multiple instances, possibly across multiple machines, with a load balancer in front of them.

Another place to look might be the action server. If you have any custom actions, it might be worth profiling them to see whether that is where the bottleneck is.
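A cheap first step before reaching for a full profiler is to log how long the slow-looking parts of an action take. Here is a rough sketch of a timing decorator for async functions. The names `timed` and `lookup_order` are hypothetical, and the sleep just stands in for whatever backend call your action makes. It should also work on an action's `run` method, assuming that method is declared `async`.

```python
import asyncio
import functools
import logging
import time

logger = logging.getLogger(__name__)


def timed(func):
    """Log the wall-clock duration of an async callable each time it runs."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return await func(*args, **kwargs)
        finally:
            # Runs on success and on error, so failures are timed too.
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s took %.1f ms", func.__qualname__, elapsed_ms)

    return wrapper


# Hypothetical stand-in for the slow part of a custom action.
@timed
async def lookup_order(order_id):
    await asyncio.sleep(0.05)  # simulates a slow database or API call
    return {"order_id": order_id, "status": "shipped"}
```

If the logged times add up to most of the response time you measure at the Rasa endpoint, the bottleneck is probably in the actions rather than in Rasa itself.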