I’ve started working with Rasa NLU for a project. My use case requires me to keep updating my training set by adding new examples of text corpus entities. However, this means that I have to retrain my model every few days, and each retraining takes longer because the training set keeps growing.
Is there a way in Rasa NLU to update an already trained model by only training it with the new training set data instead of retraining the entire model again using the entire previous training data set and the new training data set?
I’m looking for an approach where I can simply update my existing trained model every few days by training it on just the additional training data.
Every time you train a model in Rasa, we create a new .tar file in the /models folder. This is the model that gets picked up, which unfortunately means that you can only train on a batch of data, not on a stream.
A large part of this also has to do with our TensorFlow backend. The neural network layers that are offered here are designed to work on a batch and not on a stream. To my knowledge, there are also not a whole lot of machine learning models that work well on a stream. There are some linear systems (like passive-aggressive models) that allow for streaming updates, but neural networks, at least to my knowledge, cannot practically be updated this way.
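To make the streaming idea concrete, here is a minimal sketch of the passive-aggressive (PA-I) update rule for a binary text classifier, written in plain Python. It is not part of Rasa and does not use any Rasa API. The class name `PassiveAggressiveBinary`, the bag-of-words featurizer, and the example sentences are all made up for illustration. The point is only that each new example adjusts the existing weights, so the earlier data never has to be revisited.

```python
class PassiveAggressiveBinary:
    """Tiny online binary classifier using the PA-I update rule.

    Hypothetical illustration only; labels are +1 or -1.
    """

    def __init__(self, C=1.0):
        self.C = C          # upper bound on the step size of a single update
        self.weights = {}   # token -> weight, grows as new tokens are seen

    @staticmethod
    def featurize(text):
        # Bag-of-words counts; a stand-in for a real NLU featurizer.
        features = {}
        for token in text.lower().split():
            features[token] = features.get(token, 0.0) + 1.0
        return features

    def score(self, features):
        return sum(self.weights.get(t, 0.0) * v for t, v in features.items())

    def predict(self, text):
        return 1 if self.score(self.featurize(text)) >= 0 else -1

    def partial_fit(self, text, label):
        """Update the model with one example, without touching old data."""
        features = self.featurize(text)
        loss = max(0.0, 1.0 - label * self.score(features))  # hinge loss
        if loss == 0.0:
            return  # "passive": the example is already classified with margin
        norm_sq = sum(v * v for v in features.values())
        if norm_sq == 0.0:
            return  # empty text, nothing to update
        tau = min(self.C, loss / norm_sq)  # "aggressive": capped step size
        for t, v in features.items():
            self.weights[t] = self.weights.get(t, 0.0) + tau * label * v


model = PassiveAggressiveBinary()

# Initial examples (+1 = flight booking, -1 = weather question).
for text, label in [("book a flight to paris", 1),
                    ("what is the weather today", -1)]:
    model.partial_fit(text, label)

# A few days later: only the new examples are fed in.
for text, label in [("reserve a flight ticket", 1),
                    ("will it rain tomorrow", -1)]:
    model.partial_fit(text, label)

prediction_flight = model.predict("reserve a ticket")
prediction_weather = model.predict("rain today")
```

If you want to try this on real data, scikit-learn ships a `PassiveAggressiveClassifier` with a `partial_fit` method that follows the same idea. Keep in mind that this is a separate linear model, so it would not update an already trained Rasa model.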