Multiple Rasa bots on one port

I have a company and I want to run 10 chatbots with different models (say one for a hotel, another for a gym, another a weather bot, and so on) at the same time, each interacting over Rasa's REST API. After a lot of searching I found these docs: Using Rasa NLU as a HTTP server — Rasa NLU 0.12.0 documentation, which solve my problem very well. But on further research I learned that projects have been removed from Rasa, and I was only able to find two solutions:

  1. Run each bot on a separate port, which I think could slow down the server
  2. Load and unload the model each time a request is made, which would make the responses very slow

I want a solution where I can run my server on one port and route requests to the various models somehow. There are many questions asking about this same problem, but nowhere could I find a straightforward answer, so please let me know whether it is possible. Thanks in advance, and please let me know if any more info is required.

Hi @shadow_ranger. Are you using Docker to run your bots? Overall, for completely separate assistants you should use different ports to make sure that messages are sent to the right assistant.

@Juste Thanks for the reply. Yes, I am using Docker, but if I have 10 bots, won't opening so many ports increase the load on the system? Is there some way to scale multiple assistants on one server using Kubernetes? I don't know much about Kubernetes, but if you say that is the way to go, I will look into it. I am looking for a solution where I can host 10 bots on one server without hurting performance. Thanks

Hi @shadow_ranger. Is there a specific reason why you want to run all of the bots on one server? Our general recommendation is one server per bot; in that case you wouldn't even have to worry about ports. One thing to keep in mind is that your server will need more resources to scale your assistant as it grows.


Hey, have you found any solution for serving multiple bots on a single port?

Hi @Juste. Is it possible now to run multiple bots on one port? Running multiple bots as separate instances increases the load and consumes a lot of resources; the main consumer is TensorFlow. I have 8 bots on my server, which consume 7.5 GB of RAM, which is not good.

Technically you can use a proxy like Traefik or nginx and route requests based on endpoints: you expose only port 80, and the routes point to the correct bot endpoints, each bot running on its own port.

If your individual ports aren't accessible from the internet, then you are only exposing one port via nginx (for example) while running multiple Rasa servers, each with its own model. All of these Rasa services can share the same virtual environment, which reduces the number of installs you need on your server.
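
For illustration, here is a rough sketch of such an nginx config, assuming three bots already running locally on ports 5005–5007 (the path prefixes and ports are placeholders of my choosing, not anything Rasa prescribes):

```nginx
# /etc/nginx/conf.d/rasa-bots.conf -- illustrative sketch only
server {
    listen 80;

    # The trailing slash on proxy_pass strips the bot prefix, so a call to
    # POST /hotel/webhooks/rest/webhook reaches the hotel bot as
    # POST /webhooks/rest/webhook (Rasa's standard REST channel endpoint).
    location /hotel/ {
        proxy_pass http://127.0.0.1:5005/;
    }
    location /gym/ {
        proxy_pass http://127.0.0.1:5006/;
    }
    location /weather/ {
        proxy_pass http://127.0.0.1:5007/;
    }
}
```

Each backend is then started separately from the shared virtual environment, e.g. `rasa run --enable-api -p 5005 -m models/hotel.tar.gz`, and only port 80 is exposed.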

Another alternative is a nice orchestration engine like Kubernetes.
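
To give an idea of what that looks like, here is a rough sketch of one bot's Kubernetes manifest, which you would repeat (or template) per bot; the names, image tag, and memory figure are illustrative assumptions, and I'm assuming the official rasa/rasa image, whose entrypoint is the `rasa` CLI. An Ingress would then do the same path-based routing as the nginx example above:

```yaml
# hotel-bot.yaml -- illustrative sketch; one Deployment + Service per bot
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hotel-bot
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hotel-bot
  template:
    metadata:
      labels:
        app: hotel-bot
    spec:
      containers:
        - name: rasa
          image: rasa/rasa:latest            # pin a real version in practice
          args: ["run", "--enable-api", "--port", "5005"]
          ports:
            - containerPort: 5005
          resources:
            requests:
              memory: "1Gi"                  # roughly one model's footprint
---
apiVersion: v1
kind: Service
metadata:
  name: hotel-bot
spec:
  selector:
    app: hotel-bot
  ports:
    - port: 80
      targetPort: 5005
```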

@souvikg10 With your approach we can use a single virtual environment for multiple bots, but in the end multiple bots will still consume a lot of RAM. I need a solution where a single Rasa instance can handle multiple bots. I have 8 bots, which consume 8 GB of RAM.

I don't think you will be able to reduce the RAM usage much whether you run one application or several, since when you unwrap the default-config trained Rasa model, it consumes close to 698 MB of memory. So with 8 of them you have already reached close to 5.6 GB; add on top the application framework that exposes an endpoint, and you will still get close to 8 GB.

Almost forgot to add that TensorFlow, which is installed by default, also has a big memory footprint. Rasa is designed to run a single bot per instance; of course you can wrap the framework in Python and make multiple bots work in a single process (see the sketch below), but that won't really change the memory footprint. Use an orchestration engine like k8s to manage multiple bots in tandem.
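
To make that concrete, here is a rough sketch of such a Python wrapper, assuming the Rasa 2.x Python API (`Agent.load` / `handle_text`) and Sanic, the web framework Rasa's own server is built on; the bot names, model paths, and route shape are placeholders of my choosing:

```python
# multi_bot.py -- rough sketch: several Rasa models served from one port.
# Assumes Rasa 2.x, whose Python API exposes Agent.load() and handle_text().
from sanic import Sanic, response
from rasa.core.agent import Agent

# Placeholder model paths -- one trained model per bot.
MODEL_PATHS = {
    "hotel": "models/hotel.tar.gz",
    "gym": "models/gym.tar.gz",
    "weather": "models/weather.tar.gz",
}

app = Sanic("multi_bot")
agents = {}

@app.listener("before_server_start")
async def load_agents(app, loop):
    # Every agent (and its TensorFlow graph) is loaded into this single
    # process, so total RAM usage stays roughly the same as running one
    # Rasa server per bot.
    for name, path in MODEL_PATHS.items():
        agents[name] = Agent.load(path)

@app.post("/<bot_name>/webhook")
async def webhook(request, bot_name):
    agent = agents.get(bot_name)
    if agent is None:
        return response.json({"error": f"unknown bot '{bot_name}'"}, status=404)
    # handle_text returns the bot's reply messages as a list of dicts.
    bot_messages = await agent.handle_text(request.json["message"])
    return response.json(bot_messages)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5005)
```

A client would then POST e.g. `{"message": "hi"}` to `/hotel/webhook`. It works, but as said above, it does nothing for the memory footprint, since all the models still sit in one process.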

Unless you find a way to compress the models to a much smaller size, this estimate won't change drastically.
