We are seeing something strange during load testing: even with just 50 concurrent requests, the average response time is 2 s (without any action service or DB operation), and with 100 concurrent requests it rises to 5 s. Is this normal behaviour in Rasa, or do we have a concurrency issue here? I know Rasa suggests Kubernetes to address scalability, but a cluster setup pays off for a huge number of users, not for just 50 or 100 concurrent users (only 30% of the server's CPU is utilized). My current setup is as follows:
- Using a custom connector (a load test against the plain REST endpoint gives the same response time, so the connector cannot be the bottleneck).
- Using MongoDB as the tracker store (I also tried Redis, which gives the same response time).

I'm attaching the load test results: load-test-results.xlsx (8.8 KB)

I'd like to know whether this is the expected behaviour in Rasa, or whether there is any other way to improve the response time for concurrent users.
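For anyone who wants to reproduce the measurement, here is a minimal sketch of the kind of test described above, not the exact tool behind the attached results. It assumes a Rasa server with the REST channel enabled at `http://localhost:5005` (the URL, the `hello` message, and all function names are illustrative assumptions). It fires N requests at once from a thread pool and reports the mean, 95th percentile, and maximum latency.

```python
import json
import math
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: Rasa is running locally with the REST channel enabled.
RASA_URL = "http://localhost:5005/webhooks/rest/webhook"


def send_message(sender_id, text="hello"):
    """POST one user message to the REST channel and read the full reply."""
    payload = json.dumps({"sender": sender_id, "message": text}).encode("utf-8")
    request = urllib.request.Request(
        RASA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        response.read()


def timed_call(send, sender_id):
    """Return the wall-clock seconds one call to `send` takes."""
    start = time.perf_counter()
    send(sender_id)
    return time.perf_counter() - start


def run_load_test(send, concurrency):
    """Issue `concurrency` requests in parallel, one distinct sender each."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [
            pool.submit(timed_call, send, f"user-{i}") for i in range(concurrency)
        ]
        return [future.result() for future in futures]


def summarize(latencies):
    """Compute count, mean, nearest-rank 95th percentile, and max latency."""
    ordered = sorted(latencies)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def main():
    # The two concurrency levels mentioned in the post.
    for concurrency in (50, 100):
        print(concurrency, summarize(run_load_test(send_message, concurrency)))
```

Calling `main()` runs both concurrency levels against the assumed local server. Using a distinct sender ID per request keeps each request on its own conversation, which seems closest to real concurrent users.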