Bot Response Time Increases Over Time

I am currently testing the response time of a simple Rasa Bot.

My Rasa Model (v2.4) only contains 5 basic intents (“greet”, “goodbye”, “arrange_callback”, “cancel_callback”, “i_amabot”), an utterance and a story for each of these intents, but nothing else.

The experiments measure the response time of the Webhook Interaction API (which messages the bot) and the Tracker Predict API while 20 users are concurrently pinging the bot. I set up the experiments so that when any user sends their 20th message, their activity is finalised and a new user with a new user_id is created and starts pinging the bot instead. So at all times 20 users are pinging the bot, each for a maximum of 20 messages. Additionally, there is a 3-second pause between each user's messages, to mimic a more realistic interaction.
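For anyone who wants to reproduce this setup, here is a minimal sketch of the user-rotation logic in plain asyncio. It is not the original test harness: `run_slot`, `run_load_test` and the `send` callback are hypothetical names, and `send` is assumed to be whatever coroutine actually posts a message to the bot.

```python
import asyncio
import itertools
import time
from typing import Awaitable, Callable, List

SendFn = Callable[[str, str], Awaitable[None]]  # (sender_id, text) -> None


async def run_slot(
    send: SendFn,
    new_user_id: Callable[[], str],
    deadline: float,
    latencies: List[float],
    messages_per_user: int = 20,
    pause_s: float = 3.0,
) -> None:
    """One concurrent 'seat': when a user finishes, a fresh user replaces it."""
    while time.monotonic() < deadline:
        sender_id = new_user_id()
        for i in range(messages_per_user):
            if time.monotonic() >= deadline:
                return
            start = time.perf_counter()
            await send(sender_id, f"message {i}")
            latencies.append(time.perf_counter() - start)
            await asyncio.sleep(pause_s)  # pause between messages of one user


async def run_load_test(
    send: SendFn,
    duration_s: float,
    concurrent_users: int = 20,
    messages_per_user: int = 20,
    pause_s: float = 3.0,
) -> List[float]:
    """Run `concurrent_users` seats in parallel and return all measured latencies."""
    counter = itertools.count(1)
    latencies: List[float] = []
    deadline = time.monotonic() + duration_s

    def new_user_id() -> str:
        return f"user-{next(counter)}"

    await asyncio.gather(
        *(
            run_slot(send, new_user_id, deadline, latencies, messages_per_user, pause_s)
            for _ in range(concurrent_users)
        )
    )
    return latencies
```

In a real run, `send` would wrap an HTTP POST to the bot's webhook; passing a stub coroutine instead makes it easy to check the rotation logic without a running bot.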

In the experiment below I have used the RedisTrackerStore and RedisLockStore (also tested the InMemory and Postgres SQL Tracker and Lock Store, but they weren’t much better). Here is the response time over a period of 24 hrs:

| Time elapsed (hours) | Total messages sent to the bot | Response time (seconds) |
|---|---|---|
| 1 | 22,457 | 1.07 |
| 6 | 103,093 | 3.90 |
| 12 | 175,378 | 5.64 |
| 24 | 309,143 | 9.13 |
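One way to read these numbers is to turn the cumulative message counts into messages per hour for each interval. The short sketch below does only that arithmetic on the figures from the table; the function name is illustrative.

```python
# (elapsed hours, cumulative messages, response time in seconds) from the table
rows = [(1, 22_457, 1.07), (6, 103_093, 3.90), (12, 175_378, 5.64), (24, 309_143, 9.13)]


def interval_throughput(rows):
    """Messages per hour within each interval between consecutive measurements."""
    out = []
    prev_hours, prev_total = 0, 0
    for hours, total, _response_time in rows:
        out.append((prev_hours, hours, (total - prev_total) / (hours - prev_hours)))
        prev_hours, prev_total = hours, total
    return out


for start, end, rate in interval_throughput(rows):
    print(f"{start:>2}-{end:<2} h: {rate:,.0f} messages/hour")
```

The rate falls from about 22,500 messages/hour in the first hour to about 11,100 messages/hour between hours 12 and 24, which is consistent with each user spending longer waiting for replies.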

I do not believe memory is the issue, as I have 8 GB of memory available for this experiment and only 0.7% of it is being utilised.

As you can see the response time increases as the Bot stores more messages from the users over time. I also noticed this in a smaller 10 minute experiment:

| Time elapsed (min) | Total messages sent to the bot | Response time (seconds) |
|---|---|---|
| 1 | 600 | 0.072 |
| 2 | 1,036 | 0.064 |
| 3 | 2,030 | 0.83 |
| 4 | 3,074 | 0.103 |
| 5 | 4,092 | 0.12 |
| 10 | 9,181 | 0.20 |

I would like to overcome this issue as it isn’t sustainable. Does anyone know what causes this increase in response time? What could I do to overcome it? My target is a response time that stays constant over time. Thank you.

(@Tobias_Wochinger, I noticed you answered somewhat similar questions before. I need some help please.)


I think the reason is that there are too many events in the tracker store. You can limit the event count per sender ID, or send a /restart message from time to time.

@anne576 you can customize your Redis or Mongo tracker store and reduce max_event_history to a fixed number.

This link might help you:

P.S. You can set state = tracker.current_state(EventVerbosity.AFTER_RESTART), which helps a lot when you restart your chat.
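For context: as far as I understand it, EventVerbosity.AFTER_RESTART makes current_state return only the events logged after the most recent restart, and it affects only the returned state, not what is stored. The filtering amounts to roughly the following plain-Python sketch. The function name and the dict-shaped events are illustrative, and this is not Rasa's own implementation.

```python
from typing import Dict, List

Event = Dict[str, object]


def events_after_latest_restart(events: List[Event]) -> List[Event]:
    """Return only the events logged after the most recent 'restart' event."""
    for idx in range(len(events) - 1, -1, -1):
        if events[idx].get("event") == "restart":
            return events[idx + 1 :]
    return list(events)  # no restart seen: everything is kept


history = [
    {"event": "user", "text": "hi"},
    {"event": "bot", "text": "Hello!"},
    {"event": "restart"},
    {"event": "user", "text": "arrange a callback"},
]
print(events_after_latest_restart(history))
```

Only the last user event survives here, because everything up to and including the restart is dropped from the returned view.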

Which model configuration are you using?

> I think the reason is that there are too many events in the tracker store. You can limit the event count per sender ID, or send a /restart message from time to time.

That should already be limited by @anne576's 20-message limit :thinking:

To me it almost seems like some issue with the TrackerStore lookups but Redis should have O(1) for this operation.
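A caveat on the O(1) point: the key lookup is constant time, but the value behind the key is a serialized conversation that has to be parsed on each request, so the cost could still grow if the number of events per conversation grows. The rough check below uses plain json and a made-up payload shape (both assumptions, not Rasa's actual serializer) to show how parse time scales with event count.

```python
import json
import time


def make_serialized_tracker(n_events: int) -> str:
    """Build a JSON payload shaped loosely like a stored conversation (illustrative)."""
    events = [
        {"event": "user", "timestamp": float(i), "text": f"message {i}"}
        for i in range(n_events)
    ]
    return json.dumps({"sender_id": "user-1", "events": events})


def parse_seconds(payload: str, repeats: int = 50) -> float:
    """Average time in seconds to parse the payload once."""
    start = time.perf_counter()
    for _ in range(repeats):
        json.loads(payload)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    for n in (20, 200, 2_000, 20_000):
        payload = make_serialized_tracker(n)
        millis = parse_seconds(payload) * 1e3
        print(f"{n:>6} events: {len(payload):>9} bytes, {millis:.3f} ms per parse")
```

With a hard cap of 20 messages per user the payloads should stay small, so if this cost shows up in practice it would suggest that conversations are accumulating more events than expected.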

How about running with --debug so that we can see from the logs where this time gets spent?

Thanks for your insights :raised_hands:


Unfortunately, I could not find anything in the --debug log output that could help me fix the issue.

I have also successfully created custom Redis, InMemory and SQL tracker stores that pass max_event_history, only to notice that conversation events are then no longer stored and are lost. This isn’t an ideal solution as the user inputs will not be logged anywhere, and in my case that information is quite crucial. Does anyone know a way to counteract this?

If you want to record the user information, you can create an API that records it and call that API whenever a user message comes in.
If you create such an API, you need to store the earlier information before the function that limits the event history is called.
Limiting the maximum number of events addresses the problem of too many events slowing down next-action prediction.
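The idea of recording first and trimming afterwards can be modelled in a few lines of plain Python. This is only a sketch of the pattern, not a Rasa TrackerStore: `ArchivingEventWindow` and the `archive` callback are hypothetical names, and the archive could just as well be a database insert or an HTTP call.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional

Event = Dict[str, object]


class ArchivingEventWindow:
    """Keeps the last `max_event_history` events per sender, archiving every event first."""

    def __init__(
        self, max_event_history: Optional[int], archive: Callable[[str, Event], None]
    ) -> None:
        self.max_event_history = max_event_history
        self._archive = archive
        self._windows: Dict[str, Deque[Event]] = {}

    def append(self, sender_id: str, event: Event) -> None:
        self._archive(sender_id, event)  # record first, so trimming never loses data
        window = self._windows.setdefault(
            sender_id, deque(maxlen=self.max_event_history)
        )
        window.append(event)  # the deque drops the oldest event once it is full

    def recent(self, sender_id: str) -> List[Event]:
        """Events still inside the bounded window for this sender."""
        return list(self._windows.get(sender_id, ()))


full_log: List[tuple] = []
store = ArchivingEventWindow(
    max_event_history=3, archive=lambda sender, event: full_log.append((sender, event))
)
for i in range(5):
    store.append("user-1", {"event": "user", "text": f"message {i}"})
print(len(store.recent("user-1")), len(full_log))  # prints "3 5"
```

The bounded window keeps prediction working on a short history, while the archive still holds all five events.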

Do you happen to have any code I can see as an example please?

For the first situation, in a custom input channel:

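import inspect
import logging
from asyncio import CancelledError
from typing import Awaitable, Callable

import aiohttp
import rasa.utils.endpoints
from rasa.core.channels.channel import CollectingOutputChannel, UserMessage
from rasa.core.channels.rest import RestInput
from sanic import Blueprint, response
from sanic.request import Request
from sanic.response import HTTPResponse

logger = logging.getLogger(__name__)
RECORD_URL = ""  # the URL was missing from the original post; set it to your own logging API

async def send_record(payload: dict) -> None:
    # Hypothetical helper (the original call was garbled): POST one record via aiohttp.
    async with aiohttp.ClientSession() as session:
        await session.post(RECORD_URL, json=payload)

class CustomInput(RestInput):
    def blueprint(
        self, on_new_message: Callable[[UserMessage], Awaitable[None]]
    ) -> Blueprint:
        custom_webhook = Blueprint(
            "custom_webhook_{}".format(type(self).__name__),
            inspect.getmodule(self).__name__,
        )

        # noinspection PyUnusedLocal
        @custom_webhook.route("/", methods=["GET"])
        async def health(request: Request) -> HTTPResponse:
            return response.json({"status": "ok"})

        @custom_webhook.route("/webhook", methods=["POST"])
        async def receive(request: Request) -> HTTPResponse:
            sender_id = await self._extract_sender(request)
            text = self._extract_message(request)
            await send_record({"data": text, "type": "user"})  # record the user message
            should_use_stream = rasa.utils.endpoints.bool_arg(
                request, "stream", default=False
            )
            input_channel = self._extract_input_channel(request)
            metadata = self.get_metadata(request)
            if should_use_stream:
                return response.stream(
                    self.stream_response(
                        on_new_message, text, sender_id, input_channel, metadata
                    ),
                    content_type="text/event-stream",
                )
            collector = CollectingOutputChannel()
            # noinspection PyBroadException
            try:
                await on_new_message(
                    UserMessage(
                        text, collector, sender_id,
                        input_channel=input_channel, metadata=metadata,
                    )
                )
            except CancelledError:
                logger.error(f"Message handling timed out for user message '{text}'.")
            except Exception:
                logger.exception(f"An exception occurred while handling user message '{text}'.")
            await send_record({"data": collector.messages, "type": "bot"})  # record the bot reply
            return response.json(collector.messages)

        return custom_webhook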

Thank you for this suggestion. I do have a follow-up question though. As I mentioned before, max_event_history limits the events that are stored in the tracker_store. If I were to create and call this API, would I be able to store the slots & event information provided by the user at any point, without max_event_history interfering? Thank you for taking the time to help me with this.

Do you mind explaining what tracker.current_state(EventVerbosity.AFTER_RESTART) does? Where should I include this line of code? And if I have 20 users messaging the bot at the same time, how will the restart affect it throughout the conversation? (I’m sorry, I don’t quite understand it and haven’t found any resources that explain it.)

Sure, you can save any information that you need at any time. I only need to know when the information is generated.

@anne576 Can you tell me how you tracked the response time?
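For comparison, one possible way to get a "response time over elapsed time" table is to time each request with time.perf_counter() and average the samples per time bucket. This is only a sketch of that approach, not necessarily how the numbers above were produced; the function name and the sample format are assumptions.

```python
import statistics
from typing import Dict, Iterable, List, Tuple


def mean_latency_per_bucket(
    samples: Iterable[Tuple[float, float]], bucket_s: float = 60.0
) -> Dict[int, float]:
    """Group (seconds_since_start, latency_seconds) samples into fixed time buckets.

    Returns {bucket_index: mean latency}, where bucket 0 covers [0, bucket_s).
    """
    buckets: Dict[int, List[float]] = {}
    for elapsed, latency in samples:
        buckets.setdefault(int(elapsed // bucket_s), []).append(latency)
    return {idx: statistics.mean(vals) for idx, vals in sorted(buckets.items())}
```

Whether the latency is measured at the client (including network time) or inside the server makes a big difference to the numbers, so it is worth stating which one a table reports.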