Rasa is not able to handle more than 4/5 concurrent requests

While making socket.io calls from an Angular UI to the HTTPS URL exposed by Rasa, Rasa is not able to handle more than 4 or 5 concurrent requests. Responses are delayed, or sometimes never arrive at all, and the socket.io connection reports an error.

Both the UI and Rasa are deployed in Docker. What could be the issue here?

In a thread of mine I’ve looked at profiling the performance of Rasa: Performance of a Production bot

The performance does depend on which tracker store you choose, as well as the size of the current conversation. So what tracker store are you using? Also, what resources are you allocating to the various components?

4/5 does seem very low, although that is concurrent requests. What is the response time and throughput that you’re seeing before things start to slow down?

Are you only running one instance of Rasa? A possible solution could be to run multiple instances, possibly across multiple machines, with a load balancer in front of them.

Another place to look might be the actions server. If you have any custom actions, it might be worth profiling them to see whether that is where the bottleneck is.

Hi @rudi
I have a similar problem.
I have one instance of the Rasa server and one instance of the Rasa actions server.
Suppose there are two users, u1 and u2, who send queries q1 and q2 simultaneously.
I can find logs of both q1 and q2 in the Rasa server, but the actions server processes them sequentially: first q1, then q2, if q1 arrived first. I currently have 50 users, but that will grow to approximately 2000.
Please tell me how I can process the custom actions in parallel for separate users rather than sequentially. Thanks.

@vi.kumar What is your actions server doing?

From my experience, the actions server is async, so it should be able to process multiple users in parallel, provided that you’re not blocking the event loop.

The other option is to run multiple instances of the actions server behind a load balancer, but that should only be necessary if your actions server is using 100% CPU, provided that it’s not doing any long-running blocking operations.

Hi @rudi
I also thought the Rasa actions server should run requests in parallel.
But what I am finding is different.

Can you give me some insight into why this is happening?

@rudi
My rasa model server has these properties:
2020-08-26 17:39:46 DEBUG rasa.core.tracker_store - Attempting to connect to database via 'sqlite://:***@/rasa_server.db'.
2020-08-26 17:39:46 DEBUG rasa.core.tracker_store - Connection to SQL database 'rasa_server.db' successful.
2020-08-26 17:39:46 DEBUG rasa.core.tracker_store - Connected to SQLTrackerStore.
2020-08-26 17:39:46 DEBUG rasa.core.lock_store - Connected to lock store 'InMemoryLockStore'.

For two queries, I am giving you the logs of the Rasa server and the actions server.
Logs of the Rasa model server:
Here you can see there are two users with sender_id {sbm.kumar, vi.kumar}
and two queries: “Find Issues” and “Prediction Issues”.
sbm.kumar requests the “Prediction Issues” functionality, which arrives at 17:52:50.
vi.kumar requests the “Find Issues” functionality, which arrives at 17:52:53.

2020-08-26 17:52:50 DEBUG rasa.core.tracker_store - Recreating tracker from sender id 'sbm.kumar'
2020-08-26 17:52:50 DEBUG rasa.core.processor - Received user message '/ask_predict_issues_part'
2020-08-26 17:52:50 DEBUG rasa.core.actions.action - Calling action endpoint to run action 'action_predict_issues_part'.

2020-08-26 17:52:53 DEBUG rasa.core.tracker_store - Recreating tracker from sender id 'vi.kumar'
2020-08-26 17:52:53 DEBUG rasa.core.processor - Received user message '/ask_find_issues_part'
2020-08-26 17:52:53 DEBUG rasa.core.actions.action - Calling action endpoint to run action 'action_find_issues_part'.

Logs of the Rasa actions server:
Here, at 17:52:39, the “Prediction Issues” custom action runs, and the actions server prints nothing else until this action completes.
“api for predicting issues is called.” This API call takes time.
Until this API call completes, there is no log of the “Find Issues” custom action.
You can check this from the timestamps.
Once “action_predict_issues_part” completes, “action_find_issues_part” starts at 17:53:52.

[2020-08-26 17:52:39] "POST /webhook HTTP/1.1" 200 1842 0.019000
action_predict_issues_part
api for predicting issues is called.
action_completed

[2020-08-26 17:53:52] "POST /webhook HTTP/1.1" 200 12050 62.046102
action_find_issues_part
action_completed

My observations:
The behaviour of the Rasa model server is async and parallel,
but the Rasa actions server executes custom actions sequentially.

Kindly tell me what I can do, or whether I have left out anything you need to understand my query.

I’m assuming that this is an external API that you’re calling over HTTP, or something like that?

How are you calling this API, are you using a library?

Does your custom action start with async def, and does the call to the API have an await in it? If not, then you might be making a blocking call to the API. While that call is in progress, it blocks the Rasa actions server's event loop, so the server cannot process any other requests until the call has finished.
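To make the blocking point concrete, here is a minimal sketch that uses only the standard library and no Rasa code. The function names (blocking_action, non_blocking_action, run_two, demo) are made up for illustration, and time.sleep stands in for a synchronous API call. It runs two "actions" on one event loop and records the order in which they start and finish.

```python
import asyncio
import time


async def blocking_action(name, log):
    # time.sleep() never yields to the event loop, much like a sync HTTP call.
    log.append(f"{name} start")
    time.sleep(0.2)
    log.append(f"{name} end")


async def non_blocking_action(name, log):
    # await hands control back to the event loop while this action waits.
    log.append(f"{name} start")
    await asyncio.sleep(0.2)
    log.append(f"{name} end")


async def run_two(action):
    # Schedule the same kind of action for two users (q1, q2) at once.
    log = []
    started = time.perf_counter()
    await asyncio.gather(action("q1", log), action("q2", log))
    return log, time.perf_counter() - started


def demo():
    blocking_log, blocking_secs = asyncio.run(run_two(blocking_action))
    async_log, async_secs = asyncio.run(run_two(non_blocking_action))
    return blocking_log, blocking_secs, async_log, async_secs
```

With the blocking version, q2 cannot start until q1 has ended, which matches the sequential timestamps in the actions server log. With the awaiting version, both start before either ends, and the total time is roughly that of one action instead of two.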

Hi @rudi
Details for your question “How are you calling this API, are you using a library?”:
Format of my custom action:

class ActionPredictIssues(Action):
    def name(self):
        return 'action_predict_issues'

    def run(self, dispatcher, tracker, domain):
        print("action_predict_issues")

I have hosted a web server in Django. There I receive incoming_msg and sender_id from the front end, and then I call the REST API of Rasa.
Call to Rasa via the REST API:

def thread_rasa(incoming_msg, sender_id, rasa_responses):
    now = datetime.now()
    sender_id = str(time.time()).replace(".", "")
    d = {
        "sender": sender_id,
        "message": incoming_msg,
    }
    data = json.dumps(d)
    rasa_api_endpint = "http://localhost:5005/webhooks/rest/webhook"
    r = requests.post(url=rasa_api_endpint, data=data)
    rasa_responses = r.json()

t = threading.Thread(target=thread_rasa, args=(incoming_msg, sender_id, rasa_responses))
t.start()

kindly tell me
Thanks

I’m assuming that is in your actions server code.

So, it looks like you’re using Python requests, which is a sync library, so it will block your entire actions server until that request is finished. I would suggest using an async library instead, something like aiohttp (aio-libs/aiohttp on GitHub) or httpx (encode/httpx on GitHub).
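If replacing the sync library is not practical straight away, a different technique from the async-library route is to hand the blocking call to a worker thread so the event loop stays free. The sketch below uses only asyncio from the standard library. slow_sync_call is a hypothetical stand-in for something like a requests.post(...) call, and handle and main are illustrative names, not part of Rasa.

```python
import asyncio
import time


def slow_sync_call(query):
    # Hypothetical stand-in for a blocking call made with a sync HTTP library.
    time.sleep(0.3)
    return f"result for {query}"


async def handle(query):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor, so the
    # blocking function runs in a worker thread instead of on the event loop.
    return await loop.run_in_executor(None, slow_sync_call, query)


async def main():
    started = time.perf_counter()
    results = await asyncio.gather(handle("q1"), handle("q2"))
    return results, time.perf_counter() - started
```

Calling asyncio.run(main()) returns both results after roughly the duration of one call, because the two blocking calls overlap in separate threads. A native async client is usually the cleaner option; this is only a stopgap, and it assumes the surrounding code is already running inside an event loop.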

Then, you also mentioned that you’re using Django to make HTTP calls. I’m assuming that you’re making those calls inside of the request, and not using a background task processor like Celery. If that is the case, keep in mind that Django is a sync framework (unless you’re using the new async features), so each Django worker will only be able to process one request at a time. So you’ll need multiple workers, with load balancing between them, if you want to handle multiple requests at a time in your Django app.

@rudi
Hello,
1.) I hosted the web server in Django, and it was all synchronous initially. I read about it and changed it to async. I upgraded Django to 3.1.1 for async support:

  • made the Django views async
  • in my view.py I am using only async functions

I tested with multiple users using a non-blocking sleep statement: await asyncio.sleep(10).
And I found the logs of multiple users appearing concurrently rather than one after another.

2.) Replaced the requests module, which was sync, with requests_async:
    import requests_async
    rasa_api_endpint = "http://localhost:5005/webhooks/rest/webhook"
    coroutine_ob = requests_async.post(url=rasa_api_endpint, data=data)
    res = await coroutine_ob

After all this, I thought I would be able to achieve async behaviour in the Rasa actions server, but I still can't.

Snippet of actions.py is -

class ActionFindIssuesGroup(Action):

    def name(self):
        return 'action_find_issues_group'

    def run(self, dispatcher, tracker, domain):
        print('action_find_issues_group')
        user_input = tracker.get_slot('id')
        back = tracker.get_slot("back")
        plm_jira = tracker.get_slot("plm_jira")

So, to check whether these are async or not:
i) I tried to pause it by writing await asyncio.sleep(5) in the custom action.
It gave me an error: can't use await outside async function.

ii) I made the run and name functions async.
Then I got this in my Rasa actions server:
Found a coroutine object at action_find_issues_group
and this function was not triggered by Rasa Core on getting its intent.

My Rasa version is 1.4.3.
Kindly tell me what else I need to do or what I am missing.

In the rasa actions server, you need to make sure that the functions with async code in them are defined with async def, and that any async methods are called with await (or equivalent)

I noted that you made the name function async. I don’t think rasa expects this function to be async, so it will call it like a normal function, and then not be able to get a string as a result. That is probably why it’s not running your action, it doesn’t know what the name of that action is, so when Rasa Core asks to run action_find_issues_group, your actions server says “I don’t have any actions with that name”
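As a rough illustration of why an async name function goes wrong, here is a plain Python sketch with no Rasa code in it. The register function is a hypothetical registry, not the SDK's actual implementation. It calls name() the way an ordinary synchronous caller would, without await.

```python
import inspect


class SyncNameAction:
    def name(self):
        return "action_find_issues_group"


class AsyncNameAction:
    async def name(self):
        return "action_find_issues_group"


def register(actions):
    # Hypothetical registry: maps each action's name to the action object.
    registry = {}
    skipped = []
    for action in actions:
        result = action.name()  # called without await, like a normal function
        if inspect.iscoroutine(result):
            # An async def returns a coroutine object here, not a string.
            result.close()  # avoids a "coroutine was never awaited" warning
            skipped.append(type(action).__name__)
        else:
            registry[result] = action
    return registry, skipped
```

Registering one instance of each class leaves only the sync version in the registry under "action_find_issues_group". The async version never yields its string, so a lookup by name cannot find it.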

Then how can I achieve async behaviour in the Rasa actions server? Please explain a little more.

Hello, I went through the source code in my python37\lib\site-packages\rasa_sdk\ folder.
I have two versions of Rasa, 1.4.3 and 1.10.1:

Rasa 1.10.1 is installed on my local system,
while Rasa 1.4.3 is installed on my production server.

What I found is:
In Rasa 1.10.1, snippet of interfaces.py:
class Action:
    """Next action to be taken in response to a dialogue state."""

    def name(self) -> Text:
        raise NotImplementedError("An action must implement a name")

    async def run(
        self, dispatcher, tracker: Tracker, domain: Dict[Text, Any]
    ) -> List[Dict[Text, Any]]:
        """Execute the side effects of this action.

        Args:
            dispatcher:
            tracker:
            domain: the bot's domain
        Returns:
        """
        raise NotImplementedError("An action must implement its run method")

    def __str__(self) -> Text:
        return f"Action('{self.name()}')"

And in Rasa 1.4.3, snippet of interfaces.py:
class Action:
    """Next action to be taken in response to a dialogue state."""

    def name(self) -> Text:
        raise NotImplementedError("An action must implement a name")

    def run(
        self, dispatcher, tracker: Tracker, domain: Dict[Text, Any]
    ) -> List[Dict[Text, Any]]:
        """Execute the side effects of this action.

        Args:
            dispatcher:
            tracker:
            domain:
        Returns:
        """
        raise NotImplementedError("An action must implement its run method")

    def __str__(self) -> Text:
        return "Action('{}')".format(self.name())

Does it mean that Rasa 1.4.3 doesn't support async by default? In 1.4.3 the functions of the Action class are not written with the async keyword, unlike in Rasa 1.10.1.
I am upgrading Rasa on my production server to 1.10.1. I will try it and update you.

@vi.kumar Yes that is true, but not because of the reason you listed. The run function can be either sync or async.

In 1.4.0 (I couldn’t find a tag for 1.4.3), the rasa SDK doesn’t use await to call the executor, so there isn’t a way for it to be async: rasa-sdk/endpoint.py at 1.4.0 · RasaHQ/rasa-sdk · GitHub

In 1.10.1, the rasa SDK uses await to call the executor: rasa-sdk/endpoint.py at 1.10.1 · RasaHQ/rasa-sdk · GitHub, and then in the executor, if it’s an async function, it calls await on it, otherwise it runs it: rasa-sdk/executor.py at 1.10.1 · RasaHQ/rasa-sdk · GitHub. So if your function is defined using async def, it will be called asynchronously.
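The "await it if it is async, otherwise just call it" pattern can be sketched in plain Python as below. This is an illustration under assumptions, not the SDK's actual code: run_action, SyncAction and AsyncAction are hypothetical names, and the dispatcher, tracker and domain arguments are left as None placeholders.

```python
import asyncio
import inspect


class SyncAction:
    def run(self, dispatcher, tracker, domain):
        return ["sync done"]


class AsyncAction:
    async def run(self, dispatcher, tracker, domain):
        await asyncio.sleep(0)  # stand-in for an awaited API call
        return ["async done"]


async def run_action(action, dispatcher=None, tracker=None, domain=None):
    # Hypothetical executor: await coroutine functions, call plain ones directly.
    if inspect.iscoroutinefunction(action.run):
        return await action.run(dispatcher, tracker, domain)
    return action.run(dispatcher, tracker, domain)
```

Both asyncio.run(run_action(SyncAction())) and asyncio.run(run_action(AsyncAction())) return their event lists. Only the async def version gives the event loop a chance to serve other requests while it waits, and only if the calls inside it are themselves awaited rather than blocking.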

Yeah, I will update it to the latest version and try it. I hope I get the async behaviour I have been trying to achieve for so long. Thanks for your valuable help. I will ping you if I face any errors.