How long is too long for training?

Hi all,

My current bot trains in about 5 minutes 30 seconds. Is that long?

I have 34 stories, and LOADS of checkpoints (all stories call back to a main checkpoint where I can divert again to all stories based on user input). My NLU file is 1414 lines long covering 27 intents.

How long is too long? What’s realistic for a production-ready bot?

Thanks!

How long training takes depends heavily on the policy you choose, as well as on parameters such as --augmentation for rasa train. It also depends on your featurization (more features mean more work) and, of course, on the machine you train on. I can typically train Rasa Core on a modern MacBook Pro within two hours or so, but I also have many more stories and probably different settings than you. Choosing another policy can bring that down to 10 minutes, but the result is not as good. I have no intuition for NLU training, though.

I think the number of checkpoints should be kept low if possible, but I wouldn’t worry if training only takes 5 minutes.

Many thanks for your response @j.mosig. Since you mentioned that yours trains within 2 hours, I’m very curious to find out how you manage to continuously develop and test your Rasa bot with such a long training time.

If anyone else from the community experiences the above then I would love to hear from you!

Thanks

I am not sure whether 2 hours is close to the median here, but it’s actually not that bad. First, some changes don’t require retraining at all, e.g. changing a custom action implementation. Second, you often only need to retrain either NLU or Core. And finally, I think you normally collect all the issues first and then make all the story / NLU training data improvements in one go, which can take a lot of time in itself. Compared to that, the actual training for the next iteration of your bot is not that significant. But I would also like to hear from others about this.
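To make the "retrain only NLU or Core" point concrete, here is a minimal Python sketch that builds the matching command line. It assumes the Rasa 1.x CLI, where rasa train nlu and rasa train core are subcommands and --augmentation is only accepted when Core is trained. The helper name train_command is made up for illustration and is not part of Rasa.

```python
def train_command(part="all", augmentation=None):
    """Build a `rasa train` command line for a full or partial retrain.

    part: "all" (NLU and Core), "nlu", or "core".
    augmentation: optional value for --augmentation (Core training only).
    """
    if part not in ("all", "nlu", "core"):
        raise ValueError("part must be 'all', 'nlu' or 'core'")

    cmd = ["rasa", "train"]
    if part != "all":
        cmd.append(part)  # e.g. `rasa train nlu`

    if augmentation is not None:
        if part == "nlu":
            # Story augmentation only affects dialogue (Core) training.
            raise ValueError("--augmentation does not apply to NLU training")
        cmd += ["--augmentation", str(augmentation)]
    return cmd


# Retrain only the dialogue model, with story augmentation switched off.
print(" ".join(train_command("core", augmentation=0)))
```

The returned list can be passed to subprocess.run. Switching augmentation off usually shortens Core training, but it can also change how well the model generalizes, so it is probably best kept for quick development iterations.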

We also have a training time of about 1-2 hours, depending on the machine. One way of speeding things up during continuous improvement is to split the stories into separate files along their checkpoints. During development, you can move some of these files out of the data folder and train only on the part of the workflow you’re interested in. Once that works, you can put the other files back and do a full retrain.
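A rough Python sketch of that workflow, in case it helps. It assumes the story files live directly in the data folder and are named stories_*.md; the naming scheme, the stash folder, and both function names are hypothetical, so adjust them to your project layout.

```python
import shutil
from pathlib import Path


def stash_stories(data_dir, stash_dir, keep):
    """Move every stories_*.md file not listed in `keep` out of data_dir."""
    data_dir, stash_dir = Path(data_dir), Path(stash_dir)
    stash_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(data_dir.glob("stories_*.md")):
        if path.name not in keep:
            shutil.move(str(path), str(stash_dir / path.name))
            moved.append(path.name)
    return moved


def restore_stories(data_dir, stash_dir):
    """Move all stashed story files back into data_dir before a full retrain."""
    data_dir, stash_dir = Path(data_dir), Path(stash_dir)
    restored = []
    for path in sorted(stash_dir.glob("stories_*.md")):
        shutil.move(str(path), str(data_dir / path.name))
        restored.append(path.name)
    return restored
```

For example, stash_stories("data", "stories_stash", keep={"stories_main.md", "stories_booking.md"}) leaves only those two files for training, and restore_stories("data", "stories_stash") puts everything back. The file that defines the main checkpoint should stay in keep, otherwise the remaining stories would refer to a checkpoint that no longer exists in the training data.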

Rasa Core training might become significantly faster with our new TED policy: http://arxiv.org/abs/1910.00486 (should be the new default in the latest version)

Hi Johannes

How can I use the new TED policy? From the Rasa documentation on policies I can’t really tell what the config.yml file has to look like or how the TED policy has to be configured.

Thank you very much for your support!

Hi @luc-leduc, welcome to the Rasa forum!

TED is the new default and part of the EmbeddingPolicy. An example configuration could be:

policies:
- name: MemoizationPolicy
- name: EmbeddingPolicy
  max_history: 5
  batch_strategy: balanced
  epochs: 50
  random_seed: 1234
  evaluate_on_num_examples: 0
- name: MappingPolicy
- name: FormPolicy