Serving Multiple Bots from a Django Backend

Hey guys,

I’m currently trying to serve multiple bots (running different models) and allow users to interact with them on a website. I’ve had a look at a couple of existing resources, but I’m still having trouble figuring out how it can be done.

Some of the solutions I’ve considered are either:

  1. Serve the bot on an HTTP server and have my website interact with the Rasa HTTP server.
  2. Build the website with the Django framework or a REST API, and run Rasa Core and NLU on the backend.

Which of these would you recommend? Are there better alternatives? And, could anyone please briefly explain how this can be done (with multiple bot models and instances running)?

Any help would be greatly appreciated!


After a bit more digging around, I’ve decided to go with the Django backend. Looks like it should be able to handle running the bots as well as connecting them to a web page.

Hello Raymond,

I have implemented this using Flask. What I did was create my own server instead of using the Rasa HTTP server. My server has an “initiateConversation” endpoint where I create a unique ID. I have another endpoint that takes that ID and a message. I then take that ID and message and simply call

agent = Agent.load()  # whatever model you want to load

agent.handle_message(text_message=message, sender_id=unique_id)

This lets you create one agent instance per model and run multiple conversations through it.
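For anyone reading along, the two endpoints described above might reduce to something like the following. This is only a sketch: `StubAgent` is a hypothetical stand-in for a loaded Rasa agent (so the snippet runs without Rasa installed), and the function names are made up for illustration. In a real server each function would sit behind a Flask route.

```python
import uuid


class StubAgent:
    """Hypothetical stand-in for a loaded Rasa agent; it just echoes the text."""

    def handle_message(self, text_message, sender_id):
        return [f"[{sender_id}] you said: {text_message}"]


agent = StubAgent()  # in the real server this would be the loaded Rasa agent


def initiate_conversation():
    """Create a unique ID that the client sends back with every message."""
    return uuid.uuid4().hex


def post_message(conversation_id, message):
    """Pass the message to the agent under the given conversation ID."""
    return agent.handle_message(text_message=message, sender_id=conversation_id)


conversation_id = initiate_conversation()
print(post_message(conversation_id, "Hello"))
```

The client keeps the ID returned by the first call and includes it in every later request, so the server itself stays stateless apart from the agent.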

To serve more than one model, simply instantiate more than one agent, each loading a different model:

agent1 = Agent.load(model1, interpreter1)

agent2 = Agent.load(model2, interpreter2)

Then have your endpoints figure out which agent should handle the message and which conversation it is on. Hope this helps.
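The "figure out which agent should handle the message" step could be as simple as a dictionary lookup. The sketch below assumes the client sends a bot ID with every request; `StubAgent`, the `AGENTS` registry, the bot IDs, and `route_message` are all hypothetical names, with the stub replacing the real Rasa agents so the snippet runs on its own.

```python
class StubAgent:
    """Hypothetical stand-in for a Rasa agent loaded from one model."""

    def __init__(self, model_name):
        self.model_name = model_name

    def handle_message(self, text_message, sender_id):
        return [f"{self.model_name} -> {sender_id}: {text_message}"]


# One long-lived agent per model, keyed by an ID the client includes in each request.
AGENTS = {
    "support": StubAgent("model1"),
    "sales": StubAgent("model2"),
}


def route_message(bot_id, sender_id, message):
    """Pick the agent for bot_id and hand it the message for this conversation."""
    agent = AGENTS.get(bot_id)
    if agent is None:
        raise KeyError(f"unknown bot: {bot_id}")
    return agent.handle_message(text_message=message, sender_id=sender_id)


print(route_message("support", "user-1", "Hi"))
print(route_message("sales", "user-1", "Hi"))
```

Keeping the bot ID and the sender ID as separate parameters means the same user can talk to several bots without their conversations mixing.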


Thanks for the reply! I’m starting to work on something similar using Django (chose Django instead of Flask mainly for the ORM database support). For anyone else searching, these answers here give an idea of how it can work:

Hey @adrianhumphrey111, just a question regarding Flask: is it able to handle multiple requests to multiple agents simultaneously? I’m asking because I’m now considering switching to Flask. Django is a bit bulky and doesn’t seem to be able to handle multiple clients simultaneously.

Hey, how have you done that?

What I did was create a simple server that takes all requests through port 5000. This can run on your localhost, in the cloud, or wherever you want to run the server. In the server file, I instantiated an agent instance.

agent = Agent.load()

Then I declared all of my app routes for flask. So every time you run


And start your server, you will create an agent instance loaded from your NLU model and your dialogue model. I then have a route on my server that takes GET requests with the params

{ "message": "Hello my name is Adrian", "sender_id": "alijf9424nn349348394n" }

Where the sender_id is a unique string. I returned the message by

return agent.handle_message(text_message=message, sender_id=sender_id)[0]

This sends the message to your agent. Because each sender_id is a unique string, the agent effectively starts a separate conversation for every sender ID, and you do not have to manage those conversations yourself. Just send each message with its sender ID, and make sure whoever is sending the message has access to that ID so they can pass it as a param.
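On the client side, the request described above might be built like this. It is a sketch under assumptions: port 5000 comes from the description, but the `/message` route name and the `build_message_url` helper are hypothetical, since the actual route is not given in this thread.

```python
import uuid
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the server listens on port 5000 (as described above) and exposes
# a hypothetical "/message" route; the real route name is not given in this thread.
BASE_URL = "http://localhost:5000/message"


def build_message_url(message, sender_id):
    """Encode the message and sender_id as query parameters for a GET request."""
    return f"{BASE_URL}?{urlencode({'message': message, 'sender_id': sender_id})}"


sender_id = uuid.uuid4().hex  # generated once per conversation, then reused
url = build_message_url("Hello my name is Adrian", sender_id)
print(url)

# Round-trip check: the server side would read the same two parameters back.
params = parse_qs(urlsplit(url).query)
print(params["message"][0], params["sender_id"][0])
```

Reusing the same sender_id for every message in a conversation is what ties the messages together; a fresh ID starts a fresh conversation.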

Hope that helps.

So far it has for me. I have tested two different conversations at the same time and it handles them nicely. Flask is very straightforward, for me at least. My logic is pretty simple as of now, and I have not tested Flask with thousands of requests.

So you are using this in your custom web channel? If I am correct, this just handles the message without any channel. In all the web channels I saw there is something like UserMessage(text_message, output_channel, sender_id). I am still figuring out what this does. Somehow it should send the message to the NLU so that it gets parsed.

Bit of a late reply, but I managed to get something working on Flask. Now just gotta see how many bots this can run simultaneously (this project is quite big and will require >100 different bots)

Hi Raymond, are you still working on the project connecting Django and a Rasa chatbot? I would be interested in working together. Thanks!

Hey, yes I’m still working on it and have moved to Flask and Flask-SocketIO. It’s nearly ready for release! Did you need help with connecting a backend with Rasa?

Hello, Happy New Year Raymond!

Yes, thanks for asking. I’m trying to connect a Rasa bot with Django by using this Rasa starter pack tutorial in combination with Django-Rasa-Bot.

Everything went OK: I copied my models and started the server with runserver along with the Socket IO server. However, I cannot type a message into the input field of the bot’s window. I tested the Rasa starter pack on its own and the bot replied to my requests, so that part was fine. Finally, I see this when going to my localhost:8000

Hmm I’m not familiar with Django-Rasa-Bot, but from that screenshot it looks like the client isn’t connecting to the server. Have you checked that the django server and bot are both running properly? Does anything come up in your browser console when you press F12?

@raymond, a few months ago you said your Flask and Socket.IO implementation was nearly ready for release. Do you mind sharing some of your implementation details? I’m stuck thinking about how to serve multiple Core models in a scalable way, and came across this thread.

Also kinda unrelated, but I’m a bit new to async/chat in python. Are you using sockets in your flask app, or just using flask as a kind of orchestration layer?

Hi Matthew,

Since then we’ve moved away from Flask-SocketIO and decided to use plain Flask and HTTP requests, since there were some issues with Socket.IO not working over 4G networks. When we were using Socket.IO, it ran inside the Flask app via the Flask-SocketIO plugin. For scalability, in production we just ran the Flask app using gunicorn on an Ubuntu server, and it has been able to serve multiple Rasa Core models simultaneously.
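For readers who have not used gunicorn, a configuration file for this kind of setup might look like the sketch below. The values are illustrative assumptions, not the settings used in the deployment described above; `bind`, `workers`, and `timeout` are standard gunicorn settings.

```python
# gunicorn.conf.py (values are illustrative assumptions, not the settings used above)

# Address and port the workers listen on.
bind = "0.0.0.0:5000"

# Each worker is a separate process, so each one loads its own copy of every
# model. Memory use grows roughly linearly with this number.
workers = 2

# Seconds a worker may spend on one request before gunicorn restarts it.
timeout = 60
```

It would be started with something like `gunicorn -c gunicorn.conf.py app:app`, where `app:app` assumes a module `app.py` containing a Flask object named `app`. Because workers do not share memory, conversation state held only in one worker's memory would not be visible to the others, which is worth checking before raising the worker count.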

As for implementation, we basically just had several instances of Agent() running, each serving a different model. Each incoming message request specified which instance it was meant for; Agent.handle_text was called on that instance, and the response was returned to the client.

Ah, so you had a fixed number of core models running, as opposed to creating and destroying them as the app ran?

Yes, we had a fixed number, all started when the Flask app started, which was good enough performance-wise for our purposes. That said, I’m thinking it might be worth looking into creating and destroying them as needed later on, to see if it helps save memory and CPU.
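One way to experiment with creating and destroying agents on demand is a small least-recently-used cache. This is a sketch of that general technique, not something described in this thread: `AgentCache`, its `loader` callable, and the string "agents" are hypothetical placeholders, and a real loader would load the model files for the given bot.

```python
from collections import OrderedDict


class AgentCache:
    """Keep at most max_loaded agents in memory, evicting the least recently used."""

    def __init__(self, loader, max_loaded):
        self._loader = loader          # callable: bot_id -> agent (hypothetical)
        self._max_loaded = max_loaded
        self._agents = OrderedDict()   # bot_id -> agent, least recently used first

    def get(self, bot_id):
        if bot_id in self._agents:
            self._agents.move_to_end(bot_id)      # mark as most recently used
            return self._agents[bot_id]
        agent = self._loader(bot_id)              # load on first use
        self._agents[bot_id] = agent
        if len(self._agents) > self._max_loaded:
            self._agents.popitem(last=False)      # drop the least recently used
        return agent

    def loaded(self):
        return list(self._agents)


# Stand-in loader; a real one would load the model files for bot_id from disk.
cache = AgentCache(loader=lambda bot_id: f"agent-for-{bot_id}", max_loaded=2)
cache.get("a")
cache.get("b")
cache.get("a")   # "a" becomes most recently used
cache.get("c")   # evicts "b"
print(cache.loaded())
```

The trade-off is that a request for an evicted bot pays the model loading time again, so this only helps when a small subset of bots receives most of the traffic.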


I was able to train and use 50 Rasa NLU models with vanilla Flask by just specifying routes. However, this requires a lot of RAM to store all the models.

@raymond Hi Raymond! I know this post is a few years old now, but I’m just starting to look into doing something similar to what you mentioned above. Just wanted to see how it worked out. Any suggestions on how I should go about this, now that you’ve done it?
