I started a Rasa service on my computer and stress-tested it with JMeter. It looks like the Rasa service is executing requests sequentially rather than concurrently. With 1 simulated user, a request takes about 50 milliseconds; with 5 simulated users it takes about 260 milliseconds, and with 20 users about 1 second. The response time grows in proportion to the number of users, which should not happen if requests were processed concurrently. But in the source code I can see that the asyncio asynchronous framework is used, so in theory there should be some degree of concurrency. I don't know where the problem is. Does anyone know? Thank you.
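The numbers above (about 50 ms for 1 user, 260 ms for 5, 1 second for 20) are what one would see if each request held the event loop for its whole duration. The sketch below is not Rasa code and does not confirm that this is the cause here; it only shows, under that assumption, that an asyncio handler which never awaits gets serialized, while one that awaits does not. `blocking_handler`, `yielding_handler`, and `measure` are hypothetical names, and `time.sleep` stands in for CPU-bound work.

```python
import asyncio
import time

SERVICE_TIME = 0.05  # 50 ms per request, the single-user figure from the post


async def blocking_handler() -> None:
    # Assumption: stands in for work that never yields to the event loop
    # (for example CPU-bound inference). time.sleep blocks the loop the same way.
    time.sleep(SERVICE_TIME)


async def yielding_handler() -> None:
    # Stands in for work that awaits, so other requests can run meanwhile.
    await asyncio.sleep(SERVICE_TIME)


async def slowest_response(handler, users: int) -> float:
    """Start `users` requests at once; return seconds until the last one finishes."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(users)))
    return time.perf_counter() - start


def measure(handler, users: int) -> float:
    # Run one burst of simultaneous requests on a fresh event loop.
    return asyncio.run(slowest_response(handler, users))


if __name__ == "__main__":
    for users in (1, 5, 20):
        blocked = measure(blocking_handler, users)
        yielded = measure(yielding_handler, users)
        print(f"{users:>2} users: blocking {blocked:.3f}s, yielding {yielded:.3f}s")
```

With the blocking handler the time for the last response grows roughly linearly with the number of users, as in the JMeter results; with the yielding handler it stays close to a single request's time.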
Hi, have you fixed this problem or found a solution? I am facing the same problem.
Rasa's concurrency is not very good. Officially, each Rasa service is said to handle 20-30 requests per second, but in actual testing it does not seem to reach that. You can deploy multiple processes or containers and then use Nginx for load balancing across them.
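To get a rough feel for what running several instances behind a load balancer buys you, here is a small back-of-the-envelope sketch. It is an idealized model, not a measurement: it assumes each instance handles one request at a time in 50 ms (the single-user figure from the question), that requests are distributed round-robin, and it ignores proxy overhead and contention for CPU cores. `worst_case_latency_ms` is a hypothetical helper.

```python
from collections import Counter
from itertools import cycle, islice

SERVICE_MS = 50  # assumed per-request service time, taken from the question


def worst_case_latency_ms(users: int, instances: int) -> int:
    """Latency of the slowest request when `users` arrive at the same moment."""
    # Round-robin: request i is sent to instance i % instances.
    assigned = Counter(islice(cycle(range(instances)), users))
    # Each instance works through its own queue one request at a time,
    # so the slowest request waits for the longest queue.
    return max(assigned.values(), default=0) * SERVICE_MS


if __name__ == "__main__":
    for instances in (1, 2, 4):
        print(instances, "instance(s):", worst_case_latency_ms(20, instances), "ms")
```

Under these assumptions, 20 simultaneous users on 1 instance give 1000 ms for the slowest request, which matches the figure reported in the question, while 4 instances bring it down to 250 ms. Real gains depend on how many cores are available and on what each request is actually waiting for.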