We recently started performance testing our chatbot, and the results are not good: we are getting only about 2-4 TPS.
Here are the details:
Rasa Version: 2.3.3
Rasa SDK Version: 2.3.1
Rasa X Version: None
Python Version: 3.8.5
I have also set the following worker configuration:
ACTION_SERVER_SANIC_WORKERS=20
SANIC_WORKERS=20
The deployment has the following resource settings:
Limits: CPU: 3, Memory: 4Gi
Requests: CPU: 1, Memory: 2Gi
The backend's response time averages around 100 ms.
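As a rough sanity check on these numbers, here is a small sketch based on Little's law (throughput = requests in flight / time per request). It assumes the ~100 ms backend call is the only per-request cost, which is almost certainly too optimistic. The function names and the in-flight counts are illustrative and are not taken from Rasa or from our test setup.

```python
# Back-of-the-envelope estimate only, not a measurement.
# Assumption: each request costs exactly one ~100 ms backend call and nothing else.

def max_tps(latency_s: float, in_flight: int) -> float:
    """Little's law: throughput = requests in flight / time per request."""
    return in_flight / latency_s


def implied_time_per_request(observed_tps: float, in_flight: int) -> float:
    """Little's law rearranged: time per request = requests in flight / throughput."""
    return in_flight / observed_tps


BACKEND_LATENCY_S = 0.100  # ~100 ms average backend response time

# Hypothetical numbers of requests being processed at the same time.
for in_flight in (1, 3, 20):
    bound = max_tps(BACKEND_LATENCY_S, in_flight)
    print(f"{in_flight:>2} in flight -> at most {bound:.0f} TPS")

# What the observed 2-4 TPS would mean if only one request were in flight.
for tps in (2, 4):
    ms = implied_time_per_request(tps, 1) * 1000
    print(f"{tps} TPS, 1 in flight -> {ms:.0f} ms per request")
```

Under that assumption, even one request at a time would allow about 10 TPS. So the observed 2-4 TPS suggests that most of the time per request is being spent somewhere other than the backend call.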
Also, there is a single pod with a single container running Rasa Core and the action server together.
Please let me know if I should look into anything else or update any configuration.