Benchmark Rasa Performance

Hi ChiaJun,

I ran a small test to see how the server copes with stress, and the results are not good. Here they are anyway:

[image: response time per message vs. number of messages sent]

You can see here that the response time per message increases with the number of messages the user has sent. This is probably because of the tracker store, and it looks like it scales linearly with the size of the conversation history.

For a short conversation history of 10 messages, response time was as low as 60ms, but it rose to 300ms for a history of 700+ messages.
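To put a rough number on that trend, here is a small sketch that draws a straight line through the two figures above. It assumes the growth really is linear and treats "700+" as exactly 700; the name `estimate_latency_ms` is made up for this example and is not part of Rasa.

```python
# Two data points from the test above: (history length, response time in ms).
# Treating "700+" as exactly 700 is an assumption made for illustration.
SHORT = (10, 60.0)
LONG = (700, 300.0)


def estimate_latency_ms(history_len: int) -> float:
    """Estimate response time for a history length, assuming linear growth."""
    # Extra milliseconds per additional message in the history.
    slope = (LONG[1] - SHORT[1]) / (LONG[0] - SHORT[0])
    return SHORT[1] + slope * (history_len - SHORT[0])


for n in (10, 100, 350, 700):
    print(f"{n} messages: ~{estimate_latency_ms(n):.0f} ms")
```

Under those assumptions, each message already in the history adds roughly 0.35ms to the response time.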

This test was done in JupyterLab, so the tracker store used was InMemory. You can expect worse results when a database is used. (In fact, I initially tested with a script, and the session timed out at 800 messages for 2 users with a connection timeout of 1 hour.) Also, almost all of the actions run on the action server, which may have added a little time here.

So, to answer your question, it can probably support between 3 and 15 active users, depending on their conversation length.

Here is the sample code that was run:

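import time
from rasa.core.agent import Agent

data = [
    # messages go here
    # had 848 messages here
]
# conversation lengths to test: 10, 40, 70, ..., 790
data_counts = [i * 10 for i in range(1, 80, 3)]

for message_count in data_counts:
    # create a new agent so each run starts with an empty tracker store
    agent = Agent.load("models/model.tar.gz")

    start_time = time.perf_counter()
    for message in data[:message_count]:
        try:
            # top-level await works in Jupyter; in a script, wrap this in an async function
            response = await agent.handle_message(message)
        except Exception:
            # skip messages that fail so the run continues
            pass
    end_time = time.perf_counter()

    print("message count:", message_count)
    print("total time:", end_time - start_time)
    # average over the whole run, not the latency of the last message
    print("time per message:", (end_time - start_time) / message_count)

    del agent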
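The loop above reports an average over each run, so the latency of the last few messages in a long conversation is likely higher than the printed figure. A minimal sketch of timing every message individually could look like this. `measure_latencies` and `fake_handle` are hypothetical names for this example; `fake_handle` only stands in for `agent.handle_message` so the snippet runs without a trained model.

```python
import asyncio
import time
from typing import Awaitable, Callable, List


async def measure_latencies(
    handle: Callable[[str], Awaitable[object]],
    messages: List[str],
) -> List[float]:
    """Return the wall-clock seconds taken by each message, in order."""
    latencies = []
    for message in messages:
        start = time.perf_counter()
        try:
            await handle(message)
        except Exception:
            # Failed messages are still timed, so the list stays aligned
            # with the input messages.
            pass
        latencies.append(time.perf_counter() - start)
    return latencies


async def fake_handle(message: str) -> str:
    """Hypothetical stand-in for agent.handle_message (does no real work)."""
    await asyncio.sleep(0)
    return message


if __name__ == "__main__":
    sample = [f"message {i}" for i in range(20)]
    result = asyncio.run(measure_latencies(fake_handle, sample))
    print("first message:", result[0], "last message:", result[-1])
```

In a notebook you would pass the real handler instead, for example `await measure_latencies(agent.handle_message, data)`, and plot the returned list against the message index.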

Hope that helps.

Edit: Looks like performance deteriorates by more than 10x when using the Mongo tracker store compared to the InMemory tracker store. My custom actions take ~3ms with InMemory and ~55ms with the Mongo tracker store.
