I am using Rasa 2.1.2 to run my model. When I send multiple simultaneous HTTP requests to http://localhost:5005/webhooks/rest/webhook, I receive None as the response. However, when I send the same requests one after another, I get the expected responses. Is there any configuration that allows simultaneous connections to my Rasa server? Is there a maximum number of concurrent requests, or is any number allowed?
Also, is there any configuration that allows getting multiple responses from an agent simultaneously using the SDK?
What I want to achieve here: 1000 (or more if possible) concurrent users using a bot.
I am using Rasa Open Source and have deployed a trained Rasa model locally (http://localhost:5005/webhooks/rest/webhook). I am load testing it to find out how many concurrent users the server can handle. I started with 5 users and went up to 50, with 12 sequential requests per user. When the number of users is large, some requests get no response from the server. The failures are random: with 10 users, some runs have failures and others have none, and there are more failures as the number of users grows. The response time is also high (around 2 seconds for 5 users, and it increases with the number of users). No errors are printed in the logs, even when I run the server in debug mode.
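For reference, here is a minimal sketch of the kind of load test described above: each simulated user runs in its own thread and sends its requests sequentially. It is an illustration rather than my exact script. The helper names (`post_message`, `run_user`, `load_test`), the message text and the 30-second timeout are all made up for this example, and it assumes the REST channel accepts a JSON body with `sender` and `message` fields. An empty or missing reply is counted as a failure.

```python
import http.client
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

RASA_URL = "http://localhost:5005/webhooks/rest/webhook"


def post_message(sender, text, url=RASA_URL, timeout=30):
    """Send one message to the REST channel; return parsed JSON, or None on any failure."""
    body = json.dumps({"sender": sender, "message": text}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection errors, timeouts, HTTP errors and invalid JSON all count as "no response".
        return None


def run_user(user_id, n_requests, send=post_message):
    """One simulated user: n_requests sequential messages, recording success and latency."""
    results = []
    for i in range(n_requests):
        start = time.perf_counter()
        reply = send(f"user-{user_id}", f"hello {i}")
        # An empty list or None is treated as a failed request.
        results.append((bool(reply), time.perf_counter() - start))
    return results


def load_test(n_users, n_requests=12, send=post_message):
    """Run n_users simulated users concurrently (one thread each) and summarise the results."""
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        per_user = list(pool.map(lambda u: run_user(u, n_requests, send), range(n_users)))
    flat = [r for user in per_user for r in user]
    failures = sum(1 for ok, _ in flat if not ok)
    mean_latency = sum(t for _, t in flat) / len(flat)
    return {"total": len(flat), "failures": failures, "mean_latency_s": mean_latency}
```

Calling `load_test(5)`, then `load_test(10)`, and so on up to `load_test(50)` against a running server reproduces the ramp I described. The `send` parameter is only there so the harness can be tried against a stub without a server.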
I ran an experiment: I created multiple processes and shared an agent, created using the Rasa SDK, among them. With this setup I get a response within 1 second even for 100 concurrent users.
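To give an idea of what calling the agent directly under concurrency looks like, here is a simplified sketch. Note that it uses asyncio tasks inside a single process, not the multi-process setup from my experiment, and it uses a stand-in coroutine so it runs without Rasa installed. The names `stand_in_handle_text` and `run_concurrent_users` are hypothetical. With a real model, I would assume the handler is `agent.handle_text` from an agent loaded via `rasa.core.agent.Agent.load(...)`, as in Rasa 2.x.

```python
import asyncio
import time


async def stand_in_handle_text(text, sender_id):
    """Stand-in for an agent's handle_text coroutine (assumption: same call shape)."""
    await asyncio.sleep(0.01)  # simulate some processing time
    return [{"recipient_id": sender_id, "text": f"echo: {text}"}]


async def run_concurrent_users(handle_text, n_users, n_requests=12):
    """Run n_users concurrently; each user awaits its n_requests messages in order."""

    async def one_user(user_id):
        replies = []
        for i in range(n_requests):
            reply = await handle_text(f"hello {i}", sender_id=f"user-{user_id}")
            replies.append(reply)
        return replies

    start = time.perf_counter()
    per_user = await asyncio.gather(*(one_user(u) for u in range(n_users)))
    elapsed = time.perf_counter() - start
    # Count only non-empty replies as answered.
    answered = sum(1 for replies in per_user for r in replies if r)
    return answered, elapsed


# Example with the stand-in handler:
# answered, elapsed = asyncio.run(run_concurrent_users(stand_in_handle_text, 100))
```

Comparing `answered` with `n_users * n_requests` shows directly whether any message went unanswered when the HTTP layer is bypassed.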
I ran another experiment in which I started a Rasa agent inside an HTTP server and hit it with concurrent users, 12 requests per user. Again, when the number of users is large, I get no response for some of my requests. I believe some requests are dropped before the agent or server processes them. How do I know? I printed the responses produced by this server, and they match exactly the requests for which I successfully received a response. The failed requests never even reached my agent.
So, whatever this issue is, it occurs with the Rasa HTTP server too.