How to improve the response time of a Rasa bot

I am using Rasa as a Python library and loading two agents (with different models, with around 60 intents). Both models are held in memory.

A single request takes 2-4 seconds. When I send 100 requests in parallel from JMeter with a 10-second ramp-up period, the response time ranges from 2 seconds to 120 seconds, and the average is 60 seconds.

What general solutions can be applied to reduce the response time? Are there any suggestions for training the model? Even a single request takes 2-4 seconds.

Have you done any performance analysis?
Does the network account for much of the time, or is it something else?

Yes, I ran performance tests with JMeter. For 100 requests the average Rasa response time is 60 seconds; even for 50 users it is 56 seconds.
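
One rough way to read these numbers: a latency that climbs steadily from a couple of seconds to about 120 seconds is the shape you would get if requests were queuing behind one another instead of running in parallel. The sketch below is only a toy simulation of that hypothesis, not a measurement of Rasa. The function name and the 1.2-second service time are made up for illustration; the 100 requests and 10-second ramp-up come from the test described above.

```python
def simulate_single_worker(n_requests, ramp_up_s, service_s):
    """Latency of each request when one worker serves them in arrival order.

    Assumption (hypothetical): requests arrive evenly over the ramp-up
    period and every request needs the same service time.
    """
    latencies = []
    worker_free_at = 0.0
    for i in range(n_requests):
        arrival = i * ramp_up_s / n_requests
        start = max(arrival, worker_free_at)  # wait if the worker is busy
        worker_free_at = start + service_s
        latencies.append(worker_free_at - arrival)
    return latencies


# service_s=1.2 is an invented value, chosen only to show the shape.
latencies = simulate_single_worker(n_requests=100, ramp_up_s=10.0, service_s=1.2)
average = sum(latencies) / len(latencies)
print(f"min={min(latencies):.1f}s max={max(latencies):.1f}s avg={average:.1f}s")
```

With these inputs the simulated latency grows from about 1 second to about 110 seconds, with an average in the mid 50s, which is similar in shape to the 100-request result. It does not explain the 50-user figure, so treat it as a hypothesis to confirm with a profiler rather than a diagnosis.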

You can try a profiling tool such as Pyinstrument.
I think you need to find out which steps are slow, and then improve the performance of those steps.
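
As a minimal sketch of what that looks like, here is the same idea using the standard library's cProfile instead of Pyinstrument, so it runs without installing anything. Pyinstrument gives a similar per-call breakdown. `handle_message` is a hypothetical stand-in for whatever function you call to get a reply from the agent; swap in your real call.

```python
import cProfile
import io
import pstats
import time


def handle_message(text):
    """Hypothetical stand-in for the real call into the bot."""
    time.sleep(0.01)                      # pretend to wait on something
    sum(i * i for i in range(50_000))     # pretend to do some computation
    return f"reply to: {text}"


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)  # ten most expensive entries
    return result, out.getvalue()


result, report = profile_call(handle_message, "hello")
print(report)
```

The entries at the top of the report, sorted by cumulative time, are the steps worth looking at first.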

OK, I will try that, thanks! Are there any other general solutions to improve performance?

There are many, but it depends on which step costs the time; that is the step to optimize.
For the model, you can compress it or switch to a different model.
For the computation, you can use Rust for multithreading.
etc…
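
Before rewriting anything in Rust, it may be worth checking whether the requests are being handled one at a time. Staying in Python, a worker pool is a simple way to test that. The sketch below uses a thread pool from the standard library; `handle_message`, `handle_many`, and the pool size of 8 are hypothetical names and values for illustration, not part of Rasa.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_message(text):
    """Hypothetical stand-in for one bot call; replace with the real handler."""
    time.sleep(0.05)  # pretend work
    return f"reply to: {text}"


def handle_many(texts, max_workers=8):
    """Handle several messages concurrently, keeping the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle_message, texts))


start = time.perf_counter()
replies = handle_many([f"msg {i}" for i in range(16)])
elapsed = time.perf_counter() - start
print(f"{len(replies)} replies in {elapsed:.2f}s")
```

Threads only help if the slow step waits on I/O or runs in native code that releases Python's global interpreter lock. If the profile shows pure Python computation, separate processes or several server workers are the usual alternative.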