Is there a way to launch a single Rasa Core server (hosting the REST API) and access multiple dialogue/NLU models from that same server?
This is how far I have gotten from the docs:
python -m rasa_core.run --enable_api -d C:/workfolder/luma-nlp/rasa-poc/project-models/aws-vm-stop-restart-terminate/dialogue -u C:/workfolder/luma-nlp/rasa-poc/project-models/aws-vm-stop-restart-terminate/default/model_20181018-165254 --endpoints endpoints.yml
But this launches a server tied to one specific NLU and dialogue model. My requirement is to launch a single Rasa Core server and access different NLU and dialogue models through HTTP REST calls. Any help here would get me past something I have been trying to solve for a week straight. Even if you only have suggestions on how to approach the problem, please don't hesitate to point me in that direction.
rasa_core.run starts up a new server and loads the model into memory to serve predictions. If you work with large models, which is typically the case with deep learning, you will run into out-of-memory issues. Frankly, I don't agree with the multi-model approach in Rasa NLU either: with large models, typically built on a word-vector set of around 500 MB to 1 GB, loading more than one of them into the memory of a single machine will run up unnecessary cost, given how cloud infrastructure is priced. In fact, I would rather work with containers with limited memory and start as many Rasa servers as I want.
What you are looking for, I suppose, is providing a hosted service for different businesses on your own infrastructure, each with their own bot. I would address that with a containerisation approach instead: you are far more flexible working with pods than loading different models into the memory of a single server. A rough sketch of that setup is below.
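As a rough illustration only: assuming a hypothetical image called my-rasa-core with rasa_core installed, and made-up container names, model paths and ports (5005 being Rasa Core's usual default), one container per bot with a hard memory cap could look like this:

    # bot A: its own dialogue + NLU model, capped at 1 GB of RAM
    docker run -d --name bot-aws-vm --memory=1g -p 5005:5005 \
        -v /srv/models/aws-vm:/app/models \
        my-rasa-core:latest \
        python -m rasa_core.run --enable_api \
            -d /app/models/dialogue -u /app/models/nlu \
            --endpoints /app/endpoints.yml

    # bot B: same image, different model mount, different host port
    docker run -d --name bot-billing --memory=1g -p 5006:5005 \
        -v /srv/models/billing:/app/models \
        my-rasa-core:latest \
        python -m rasa_core.run --enable_api \
            -d /app/models/dialogue -u /app/models/nlu \
            --endpoints /app/endpoints.yml

Each bot gets its own process and its own memory budget, so one oversized model can't take down the others, and you can scale bots independently.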
So if I put different projects in different containers, how do I serve the whole thing on a single port of the server? I found some articles on HTTP reverse proxying with Nginx. Is that possible in my case, roughly like the sketch below?
I may have several projects and several models to serve.
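For context, this is the kind of Nginx configuration I have in mind: one public port, with path-based routing to each bot container. The upstream ports and path prefixes here are hypothetical and just follow the example containers above:

    events {}

    http {
        server {
            # single public entry point
            listen 80;

            # hypothetical bot A, running on the host at port 5005
            location /bots/aws-vm/ {
                # trailing slash strips the prefix so Rasa Core sees its normal API paths
                proxy_pass http://127.0.0.1:5005/;
            }

            # hypothetical bot B, running on the host at port 5006
            location /bots/billing/ {
                proxy_pass http://127.0.0.1:5006/;
            }
        }
    }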