DIETClassifier: Slow training (like 10 hours)

Hi all. I want to train DIETClassifier, but it trains so slowly… How can I make it faster? Is it because some intents have many examples (e.g. 110.000 examples)?

Did you write a script to generate 110,000 examples? Is that much data actually needed to get the accuracies you desire?

I’ve tried writing data generators, and found that having a ton of NLU data doesn’t help much; you can get good performance with far less data.

DIETClassifier was taking about 2 hours for me. I switched to a GPU and training time went down to 5 minutes. I know GPU time is expensive, but what’s the cost of a developer being down for 10 hours?

Well, I forgot to mention that it is 10 hours on a TITAN RTX GPU (well, one of the fastest around).

Yes, the 110.000 examples were generated by a script, using a few templates (with slots) and some values (~40 items) from 3 files that hold the values for the slots. Generating all possible permutations is really too much for Rasa to handle (the nlu.md file becomes > 5GB).
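For reference, here is a rough sketch of the alternative to writing out every permutation: count how many examples a full expansion would give, then sample a fixed number per template. The templates, slot values, and function names below are made up for illustration (the real ones are not shown in this thread); only the `[value](entity)` annotation follows the nlu.md format.

```python
import math
import random

# Hypothetical templates and slot values, for illustration only.
TEMPLATES = [
    "book a table at {time} for {people} in {city}",
    "I need a reservation in {city} at {time} for {people}",
]
SLOT_VALUES = {
    "time": [f"{h}:00" for h in range(1, 13)],
    "people": [f"{n} people" for n in range(2, 10)],
    "city": ["Athens", "Berlin", "Lisbon", "Oslo", "Vienna"],
}


def slots_in(template, slot_values):
    """Return the slot names that actually occur in a template."""
    return [s for s in slot_values if "{" + s + "}" in template]


def count_all(templates, slot_values):
    """Number of examples a full expansion of every template would produce."""
    return sum(
        math.prod(len(slot_values[s]) for s in slots_in(t, slot_values))
        for t in templates
    )


def sample_examples(templates, slot_values, per_template, seed=0):
    """Draw up to `per_template` distinct annotated examples per template."""
    rng = random.Random(seed)
    examples = []
    for template in templates:
        slots = slots_in(template, slot_values)
        # Never ask for more distinct examples than the template can yield.
        limit = min(per_template, math.prod(len(slot_values[s]) for s in slots))
        seen = set()
        while len(seen) < limit:
            # Annotate each value as [value](entity), as in nlu.md.
            filled = {s: f"[{rng.choice(slot_values[s])}]({s})" for s in slots}
            seen.add(template.format(**filled))
        examples.extend(sorted(seen))
    return examples


if __name__ == "__main__":
    print("full expansion:", count_all(TEMPLATES, SLOT_VALUES), "examples")
    print("## intent:inform")
    for line in sample_examples(TEMPLATES, SLOT_VALUES, per_template=20):
        print("- " + line)
```

With `per_template` as the only knob, the size of the training file no longer depends on how many slot values there are, so it is easy to try a few sizes and see where accuracy stops improving.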

Initially I thought it was enough to give some examples for each entity (e.g. time), and compensate with a lookup file, like this:

## lookup:time
data/lookup/time.txt

## intent:inform

But it does not seem to work. Is there a way to make Rasa combine the lexicon with the examples?