[Ask] Can I train on md data in separate parts but end up with one single model? (hardware limitation)


(kenzy dario) #1

Greetings. Is it possible to split a large md training file into two or more files, train on them separately, and then combine the results into a single model? Right now I'm trying to train on one md file and I get a memory error. It currently contains around 25,000 usersays and 200 intents, with only a few entities. I've tried a cloud PC with 32 GB of RAM and it still returns a memory error, but when I reduce the data to around 60% it can barely handle it. Any solution? Thanks in advance.

(Adi Rizka) #2

I think Rasa can do this somehow, because if the training data becomes too large, it turns into a scalability problem for developers. @akelad @tmbo @Juste

(Juste) #3

Hey @kenzydario. As I understand it, you are running into memory issues when training the NLU model, right?

(kenzy dario) #4

Yes, that's right. Happy Valentine's Day, by the way. @Juste

(Juste) #5

@kenzydario Oh, I can see that there is a GitHub issue about this as well: "MemoryError with tensorflow_embedding on ~73k dataset with 38 intents" (Issue #1621, RasaHQ/rasa_nlu). We are looking into the possible cause and will follow up on GitHub.

A quick and very naive solution in the meantime would be to cut down the size of the training data.
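
For anyone who wants to try that workaround without trimming the file by hand, here is a minimal sketch in plain Python. It does not use any Rasa API. It assumes the training file follows the markdown layout with `## intent:<name>` headers followed by `- example` lines, and the function names (`parse_sections`, `downsample`) and parameters (`fraction`, `min_per_intent`, `seed`) are hypothetical, chosen only for this illustration. It keeps a random share of the examples for every intent, so no intent disappears, and leaves other sections such as synonyms untouched.

```python
import random
from collections import OrderedDict


def parse_sections(md_text):
    """Group the non-empty lines of the file under their '## ' header line."""
    sections = OrderedDict()
    current = None
    for line in md_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            current = stripped
            sections.setdefault(current, [])
        elif stripped and current is not None:
            sections[current].append(stripped)
    return sections


def downsample(md_text, fraction=0.6, min_per_intent=5, seed=0):
    """Keep roughly `fraction` of the examples of each intent section.

    Sections that are not intents (synonyms, regex, lookup tables) are
    copied unchanged. This is an illustrative helper, not part of Rasa.
    """
    rng = random.Random(seed)
    out = []
    for header, items in parse_sections(md_text).items():
        if header.startswith("## intent:"):
            keep = max(min_per_intent, int(len(items) * fraction))
            keep = min(keep, len(items))
            # Sample positions, then sort them to preserve the original order.
            positions = sorted(rng.sample(range(len(items)), keep))
            items = [items[i] for i in positions]
        out.append(header)
        out.extend(items)
        out.append("")  # blank line between sections
    return "\n".join(out)
```

The default `fraction=0.6` simply mirrors the roughly 60% of the data that was reported earlier in this thread to just about fit in memory; whether a model trained on the reduced file is accurate enough would still need to be checked on held-out examples.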

(kenzy dario) #6

Yeah, but is it possible to combine two different models into one? That is, train them separately and then just merge the models at the end? @Juste