I am trying to figure out the recommended way to deploy Rasa in production.
Is it a good idea to deploy rasa_nlu and rasa_core, with their API interfaces, on two different servers, assuming I also need duckling for some entity resolution?
Or should I go with rasa_nlu and rasa_core together on one server, plus one duckling server?
Which architecture is more scalable and fail-safe? I learned that Rasa has moved from Flask to Klein; which application server is recommended for this in production?
Does your Rasa NLU serve another purpose apart from the chatbot? - If you would, say, like to do entity extraction from arbitrary text, or email or tweet classification, which are more general NLP tasks, then it is better to keep Rasa NLU on a separate server. Duckling always runs separately as a server (see the sketch after these questions).
Do you have a more diverse ecosystem of multiple chatbots? - If you are deploying more than one chatbot, I would advise containerisation and technologies such as Kubernetes to manage your deployment and resources efficiently.
Do you have different model lifecycles for your Rasa NLU and Rasa Core?
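For context on the Duckling point: Duckling exposes its own small HTTP API, which is why it naturally runs as a separate server. A minimal sketch of querying it directly, assuming a Duckling container on its default port 8000:

```python
import requests

# Ask a separately deployed Duckling server to parse entities out of text.
# Host and port assume a local Duckling container on its default port 8000.
response = requests.post(
    "http://localhost:8000/parse",
    data={"locale": "en_US", "text": "remind me tomorrow at 8am"},
)
for entity in response.json():
    print(entity["dim"], "->", entity["value"])  # e.g. time -> resolved datetime
```

Inside a Rasa NLU pipeline you normally would not call it yourself; you would point the ner_duckling_http component at the same URL.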
No, it's only used for the chatbot. I am new to Rasa, so I am trying to understand how to deploy it - basically, what the recommended way is from the community.
Could you elaborate on what you mean by different model lifecycles? Basically there should not be a dependency between NLU and Core, since the two are independently deployable.
There are two ways to train the NLU classifier and the Core classifier: together or independently. If you train them independently - meaning with two different lifecycles, e.g. one retrained daily and one ad hoc - you might need to deploy them on two separate servers.
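To make the two lifecycles concrete, here is a sketch of training each model independently with the pre-1.0 rasa_nlu / rasa_core Python APIs; file names like nlu_config.yml, data/nlu.md and data/stories.md are placeholders for your own project files:

```python
# Train the NLU model on its own schedule (e.g. daily).
from rasa_nlu import config
from rasa_nlu.model import Trainer
from rasa_nlu.training_data import load_data

nlu_data = load_data("data/nlu.md")
trainer = Trainer(config.load("nlu_config.yml"))
trainer.train(nlu_data)
trainer.persist("models/nlu")

# Train the Core dialogue model separately (e.g. ad hoc).
from rasa_core.agent import Agent
from rasa_core.policies.keras_policy import KerasPolicy
from rasa_core.policies.memoization import MemoizationPolicy

agent = Agent("domain.yml", policies=[MemoizationPolicy(), KerasPolicy()])
stories = agent.load_data("data/stories.md")
agent.train(stories)
agent.persist("models/dialogue")
```

Since each half produces its own model artifact, either one can be retrained and redeployed without touching the other.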
This mostly applies to production. In your case, you can bundle your Rasa chatbot on one server; however, if you have custom actions, they should run on another server or a lambda function behind a webhook.
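A custom action server is just a small Python service built with rasa_core_sdk. A sketch, where the action name and slot are hypothetical:

```python
from rasa_core_sdk import Action

class ActionCheckOrderStatus(Action):
    """A hypothetical custom action that reads a slot and replies."""

    def name(self):
        # Must match the action name listed in your domain file.
        return "action_check_order_status"

    def run(self, dispatcher, tracker, domain):
        order_id = tracker.get_slot("order_id")
        dispatcher.utter_message("Looking up order {}...".format(order_id))
        return []
```

You would run this with python -m rasa_core_sdk.endpoint --actions actions on its own server (or wrap it in a lambda) and point Core's action endpoint at it.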
You can also externalise your template engine to manage bot responses better.
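One way to externalise templates is to have custom actions fetch response text from a CMS or template service instead of hard-coding it in the domain; the service URL and response shape below are entirely hypothetical:

```python
import requests
from rasa_core_sdk import Action

class ActionGreet(Action):
    def name(self):
        return "action_greet"

    def run(self, dispatcher, tracker, domain):
        # Fetch the response copy from an external template service so it
        # can be edited without retraining or redeploying the bot.
        # The URL and JSON shape here are hypothetical placeholders.
        template = requests.get(
            "http://cms.internal/templates/greet",
            params={"lang": "en"},
        ).json()["text"]
        dispatcher.utter_message(template)
        return []
```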
Ideally, separate your ML inference logic from your functional code logic.
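With pre-1.0 rasa_core this separation is wired up through the agent: Core loads its dialogue model locally while delegating NLU parsing and custom actions to remote servers. A sketch, assuming an NLU server on port 5000 and an action server on port 5055 (the interpreter arguments varied a bit across rasa_core 0.x releases, so treat this as approximate):

```python
from rasa_core.agent import Agent
from rasa_core.interpreter import RasaNLUHttpInterpreter
from rasa_core.utils import EndpointConfig

# ML inference (NLU parsing) happens on a separate server...
interpreter = RasaNLUHttpInterpreter(
    model_name="current", endpoint=EndpointConfig("http://localhost:5000")
)
# ...and so does the functional (custom action) code.
action_endpoint = EndpointConfig("http://localhost:5055/webhook")

agent = Agent.load(
    "models/dialogue", interpreter=interpreter, action_endpoint=action_endpoint
)
```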