Improve response time of Rasa

I’m using the Docker Compose installation of Rasa X 0.38.1 and Rasa 2.4.3. I tested my bot’s response time with JMeter by sending multiple “help” requests in parallel with a 1-second ramp-up period. I ran this test against the bot on three different machines. You can see the results in the following three tables:
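For anyone who wants to reproduce a burst like this without JMeter, here is a rough sketch in Python. It posts to `/webhooks/rest/webhook`, Rasa’s standard REST channel endpoint; the host, port, user count, and the `run_load_test` helper name are my own assumptions for illustration:

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed host/port of the Rasa server; adjust to your deployment.
RASA_URL = "http://localhost:5005/webhooks/rest/webhook"

def send_help(sender_id: str) -> float:
    """Send one 'help' message and return the round-trip time in seconds."""
    payload = json.dumps({"sender": sender_id, "message": "help"}).encode()
    req = urllib.request.Request(
        RASA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.perf_counter() - start

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile, enough for a rough latency report."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

def run_load_test(users: int) -> None:
    """Fire `users` parallel 'help' messages and print latency percentiles."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        times = list(pool.map(send_help, (f"user-{i}" for i in range(users))))
    print(f"median={percentile(times, 50):.3f}s  p95={percentile(times, 95):.3f}s")
```

Against a live bot you would call, e.g., `run_load_test(50)` and compare the median and p95 latencies as the user count grows.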

It seems Rasa does not handle concurrent users very well: as the number of simultaneous users grows, performance degrades. I suspected the NLU component might be the bottleneck, so I ran the tests again, this time sending the “help” message directly to the rasa-worker container. You can see the results in the following three tables:
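As a side note, one way to isolate NLU from the rest of the stack is Rasa’s `/model/parse` HTTP endpoint, which runs only the NLU pipeline (no dialogue policies, no action server). A minimal sketch, assuming the worker is reachable on localhost:5005:

```python
import json
import urllib.request

# Assumed host/port of the rasa-worker container's HTTP API.
PARSE_URL = "http://localhost:5005/model/parse"

def build_parse_request(text: str) -> urllib.request.Request:
    """Build a POST to /model/parse; the body is just {"text": ...}."""
    body = json.dumps({"text": text}).encode()
    return urllib.request.Request(
        PARSE_URL, data=body, headers={"Content-Type": "application/json"}
    )

def parse_message(text: str) -> dict:
    """Send the message and return the raw NLU result (intent, entities)."""
    with urllib.request.urlopen(build_parse_request(text)) as resp:
        return json.load(resp)

# Against a live worker, parse_message("help")["intent"] would show the
# predicted intent and its confidence.
```

Timing `parse_message` alone versus the full webhook round trip would show how much of the latency is NLU and how much is everything else (channel, tracker store, policies, custom actions).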

Given these results, it seems the NLU component is not entirely to blame. What do you think the reason is? Is there a way to improve the response time?

It also seems that there is no caching of NLU results. Am I right? In all these experiments the active model was fixed and the same “help” message was sent every time. If there were a cache, NLU would only have to run on the first “help” message, and the predicted intent could be reused for the rest. Maybe it already works that way and I’m wrong.
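To make the idea concrete: the cache I have in mind would behave roughly like the sketch below, where `fake_nlu` is a stand-in for the real pipeline (this is not a Rasa API, just an illustration of the memoization I mean):

```python
from typing import Callable

def cached_nlu(run_nlu: Callable[[str], dict]) -> Callable[[str], dict]:
    """Wrap an NLU function so identical messages are parsed only once."""
    cache: dict[str, dict] = {}

    def parse(text: str) -> dict:
        key = text.strip().lower()  # normalize so "Help" and "help" share one entry
        if key not in cache:
            cache[key] = run_nlu(text)
        return cache[key]

    parse.cache = cache  # exposed so we can inspect the cache contents
    return parse

calls = 0

def fake_nlu(text: str) -> dict:
    """Pretend pipeline: counts invocations instead of running a model."""
    global calls
    calls += 1
    return {"intent": {"name": "help", "confidence": 1.0}}

parse = cached_nlu(fake_nlu)
for _ in range(100):
    parse("help")  # only the first call reaches fake_nlu
print(calls)  # → 1
```

In plain Python the same effect comes from `functools.lru_cache`; whether such a cache would be safe in Rasa depends on the pipeline, since components that use conversation context could return different results for the same text.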