I'm working on integrating Rasa into our chatbot training tool. I'm currently training models using the Python components of Rasa. As for storage, since model training is intended to run as a microservice, I don't store the model on the local file system; instead I zip the model files and store the zip in a database. Now my question is: assuming I have an independent Rasa NLU HTTP server running on a different machine, is there a way I can load / deploy a zipped model using the HTTP API?
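For reference, the training and storage step currently looks roughly like this (a simplified sketch: the config/data paths and the database call are placeholders, and I'm assuming the Rasa 1.x style `rasa.nlu` imports; older versions expose the same classes under `rasa_nlu.*`):

```python
import shutil
import tempfile

from rasa.nlu import config
from rasa.nlu.model import Trainer
from rasa.nlu.training_data import load_data


def train_and_zip(config_file: str, data_file: str) -> bytes:
    """Train an NLU model with the Rasa Python components and return it as zip bytes."""
    training_data = load_data(data_file)
    trainer = Trainer(config.load(config_file))
    trainer.train(training_data)

    with tempfile.TemporaryDirectory() as tmp_dir:
        # persist() writes the model files to disk and returns the model directory
        model_dir = trainer.persist(tmp_dir)
        # zip the persisted model directory so it can be stored as a blob
        archive = shutil.make_archive(f"{tmp_dir}/model", "zip", model_dir)
        with open(archive, "rb") as f:
            return f.read()


# the zip bytes are then written to our database, e.g. (placeholder helper):
# db.store_model(model_id, train_and_zip("config.yml", "training_data.md"))
```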
This is what we do for deploying our models:
We build a Docker image that includes the NLU parse API, its dependencies, and the model (language model + NLU model), deploy that image as a container, and launch the service with pre_load so parsing is fast as soon as the API comes up, instead of pushing a model on the fly into a running HTTP server.
You can do A/B deployments to control how much traffic is exposed to your latest model, since it is not guaranteed to perform better than the previous one, and it lets you roll back easily. I have tested this pattern and found it safer.
To answer your question, I think the load-model endpoint has been deprecated in the latest version; you can only delete models.
Thanks for the response. We intend to have really swift train-and-deploy routines to support features like live models during labelling and active learning. Building and deploying Docker images on the fly seems quite slow and clunky to me, especially if the instructions for that would have to come from within already deployed containers. Would it be an option to instead run a separate Python application that loads a model using the Rasa NLU components and hosts a REST API with the same /parse endpoint as the Rasa HTTP server? With a custom API we could add endpoints that load models from a supplied zip, or alternatively grab them directly from the database.
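To make that more concrete, here is roughly what I have in mind (only a sketch: Flask, the endpoint names and the request format are assumptions for illustration, and it assumes the zip contains the persisted model files at its top level):

```python
import tempfile
import zipfile

from flask import Flask, jsonify, request
from rasa.nlu.model import Interpreter

app = Flask(__name__)
interpreter = None  # currently loaded model


@app.route("/model", methods=["PUT"])
def load_model():
    """Accept a zipped model, unpack it and swap the active interpreter."""
    global interpreter
    model_dir = tempfile.mkdtemp()
    # request body is the raw zip produced at training time
    with tempfile.NamedTemporaryFile(suffix=".zip") as tmp_zip:
        tmp_zip.write(request.get_data())
        tmp_zip.flush()
        with zipfile.ZipFile(tmp_zip.name) as archive:
            archive.extractall(model_dir)
    interpreter = Interpreter.load(model_dir)
    return jsonify({"status": "model loaded"})


@app.route("/parse", methods=["POST"])
def parse():
    """Mimic the Rasa NLU server's /parse endpoint (payload format is an assumption)."""
    if interpreter is None:
        return jsonify({"error": "no model loaded"}), 409
    text = request.get_json(force=True).get("text", "")
    return jsonify(interpreter.parse(text))


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

The /model endpoint could just as well pull the zip straight from the database by id instead of receiving it in the request body; the point is only that the service swaps interpreters in place rather than rebuilding an image.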