Rasa HTTP Server Performance Problem

shota · July 11, 2019, 4:37pm

Hi,

We are running Rasa v1.1.5 with Duckling on AWS, with the HTTP API enabled. Under no or light load, a response takes around 200 milliseconds, but under load (say, 50 parallel requests) the response time increases dramatically, to 6 seconds on average.

I’ve read somewhere that the new version of Rasa has a new HTTP server that handles parallel requests better, but compared with my other server, which runs Rasa 0.14, I see no big difference.

Can someone advise how I can tune the Rasa HTTP server to get better performance?

P.S. I’m currently running several instances behind a load balancer, which solves the problem, but I believe serving 50-100 simultaneous requests should not be a problem for a single server either.
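
For anyone who wants to reproduce the comparison between light and heavy load, here is a rough sketch of how it could be measured. It is not the exact tool used for the numbers above. It assumes the server listens on the default port 5005 and exposes the `/model/parse` route. `RASA_URL`, `parse_once`, and `measure` are hypothetical names chosen for illustration.

```python
import json
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: default Rasa port and the /model/parse route; adjust as needed.
RASA_URL = "http://localhost:5005/model/parse"


def parse_once(text, url=RASA_URL, timeout=30):
    """POST one message to the server and return the elapsed time in seconds."""
    body = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        resp.read()  # drain the body so the full response is timed
    return time.perf_counter() - start


def measure(n_parallel, send=parse_once, text="hello"):
    """Fire n_parallel requests at once and summarise their latencies."""
    # One thread per request so all of them are in flight together.
    with ThreadPoolExecutor(max_workers=n_parallel) as pool:
        latencies = list(pool.map(send, [text] * n_parallel))
    return {
        "requests": n_parallel,
        "mean_s": statistics.mean(latencies),
        "max_s": max(latencies),
    }


# Example usage against a running server:
#   for n in (1, 10, 50):
#       print(measure(n))
```

Comparing the output for 1 and 50 parallel requests should show whether latency grows roughly linearly with concurrency, which would suggest requests are being processed one at a time.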

akelad · July 13, 2019, 5:15pm

50-100 simultaneous requests should be fine. How big is your model?

shota · July 13, 2019, 10:06pm

The trained model is around 8 MB and has ~150 intents.
