For local testing it makes sense to have everything in a single Rasa NLU Python package, but for going live I don’t see how this is optimal.
In a Docker / Kubernetes setup, I would expect the images to be task-specific and around 50–100 MB each. Honestly, I was expecting a shift toward more separate packages, e.g. a Rasa Core train package (with only the dependencies needed to train a Core model), a Rasa Core run package (with only the dependencies needed to run an HTTP server and the Core classifier), etc.
Scaling up replicas of a 5 GB Docker image, or rolling out updates to them in your cloud, is very failure-prone.
Would it be possible to provide more segregated Docker images?
Very good point, and I think there could be some benefits to a “serve this model”-only container. The real question is how much of a benefit this brings compared to the additional complexity: many of the bulky dependencies (e.g. numpy, tensorflow) would need to be installed in the “serve” container as well.
Regarding the merge: the containers will keep roughly their current size (the base Rasa image is ~350 MB, which is comparable to the base images for NLU and Core; it is not as if their sizes add up, since they share most of their dependencies anyway). Most of the size comes from installed language models.
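Just to make the trade-off concrete, a serve-only image could look roughly like the sketch below. This is a hypothetical Dockerfile, not an official Rasa image; the base image, model path, and port are assumptions for illustration. Note that even here the heavy runtime dependencies get pulled in, which is the point made above:

```dockerfile
# Hypothetical minimal "serve this model" image (NOT an official Rasa image).
FROM python:3.6-slim

# Even a serve-only image still needs the bulky runtime dependencies
# (numpy, tensorflow, ...) to load and score a trained model, so the
# savings over the full image are limited.
RUN pip install --no-cache-dir rasa

WORKDIR /app
# Assumes a pre-trained model archive is available in ./models at build time.
COPY models /app/models

# Serve the pre-trained model over HTTP; no training happens in this container.
EXPOSE 5005
CMD ["rasa", "run", "--enable-api", "--model", "/app/models"]
```

The main saving would come from leaving out training-only tooling and language models, not from dropping numpy/tensorflow, which is why the size difference may be smaller than expected.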
Hi All!
How does the merge affect the Docker image:label used to train a Core model, or the command below in particular? I’m following the tutorial Building Rasa with Docker, and the image used there for training a Core model is `rasa/rasa_core:latest`.