How does Rasa handle concurrent requests?

Hi All,

Please help me understand how Rasa handles concurrent requests. I have an observation about Rasa's processing speed: if a single request takes x sec, then sending the same query as 50 requests at the same time should ideally take 50x sec, but it is taking much more than that.

Please find an example jMeter observation below:

Message: 'hi'
Single request: response time = 0.4 sec
50 requests/sec: average time = 5.5 sec, max time = 10 sec
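
For comparison with the jMeter figures, here is a minimal sketch of what strictly serial handling would predict. It assumes that all 50 requests arrive together and that a single worker processes them one at a time, each taking the single-request time of 0.4 sec. That is a simplifying assumption for illustration, not a description of how Rasa actually schedules requests; the function name `serial_latencies` is hypothetical.

```python
def serial_latencies(service_time, n_requests):
    """Latency seen by each request when one worker handles them one at a time.

    Request k (0-based) has to wait for the k requests ahead of it,
    then needs service_time itself.
    """
    return [service_time * (k + 1) for k in range(n_requests)]


latencies = serial_latencies(0.4, 50)
avg = sum(latencies) / len(latencies)
print(f"avg = {avg:.1f} sec, max = {max(latencies):.1f} sec")
# prints: avg = 10.2 sec, max = 20.0 sec
```

Under this model the slowest request would take 50x sec and the average would be about half of that, so these two numbers are the ones to compare against the average and max times reported by jMeter.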

Kindly help me gain a better understanding of how Rasa actually handles concurrent requests.
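
For anyone who wants to reproduce this kind of measurement without jMeter, here is a rough sketch using only the Python standard library. It assumes the REST channel is enabled and reachable at `http://localhost:5005/webhooks/rest/webhook`; the URL, the sender id and the helper names (`send_message`, `timed_call`, `measure`, `main`) are placeholders to adapt to your own deployment.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: REST channel on the default port; change this for your server.
RASA_URL = "http://localhost:5005/webhooks/rest/webhook"


def send_message(text, sender="load-test"):
    """POST one message to the REST channel and return the parsed JSON reply."""
    payload = json.dumps({"sender": sender, "message": text}).encode("utf-8")
    request = urllib.request.Request(
        RASA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def timed_call(send, text):
    """Return the wall-clock seconds that one call to send(text) takes."""
    start = time.perf_counter()
    send(text)
    return time.perf_counter() - start


def measure(send, text, concurrency):
    """Fire `concurrency` calls at once from a thread pool; report avg and max latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(
            pool.map(lambda _: timed_call(send, text), range(concurrency))
        )
    return {"avg": sum(latencies) / len(latencies), "max": max(latencies)}


def main():
    """Compare a single request with 50 simultaneous ones (needs a running server)."""
    print("1 request:  ", measure(send_message, "hi", 1))
    print("50 requests:", measure(send_message, "hi", 50))
```

Calling `main()` against a running server prints the average and max latency for one request and for 50 simultaneous requests. The `send` argument is passed in separately so the timing logic can be tried with a stub function before pointing it at a real server.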

Are you using Rasa locally or is it deployed on a server?

It is deployed on a server.

You can increase the number of replicas for your pods, like the rasa-x and rasa-production pods.

Service          Role
rasa-x           Running the HTTP API
rasa-production  Running a trained model, parsing intents, predicting actions
rasa-worker      Training and evaluating models

You can learn more in the Rasa Advanced Deployment Workshop.

Could you please explain how exactly Rasa processes a request on the backend in the open source Rasa framework?

Any updates? Please help me understand this better.