NLU training takes a long time

I have about 1,000 examples and 25 intents in my NLU file; 710 of the examples contain an entity (most of them only one). Training takes about 30-40 minutes without a GPU (and about 6 minutes when I tested on Google Colab with a Tesla T4). That is quite a while. Is it because I have too much data, or because of the pipeline I chose? Here is my pipeline:

language: vi

pipeline:
  - name: "WhitespaceTokenizer"
  - name: "RegexFeaturizer"
  - name: "CRFEntityExtractor"
  - name: "LexicalSyntacticFeaturizer"
  - name: "CountVectorsFeaturizer"
  - name: "CountVectorsFeaturizer" 
    analyzer: "char_wb" 
    min_ngram: 1 
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: "EntitySynonymMapper"

policies:
  - name: TEDPolicy
    max_history: 5
    epochs: 100
    featurizer:
      - name: FullDialogueTrackerFeaturizer
    attn_shift_range: 2
    embed_dim: 20
    constrain_similarities: True
  - name: FallbackPolicy
    core_threshold: 0.5
    nlu_threshold: 0.4
  - name: FormPolicy
  - name: MappingPolicy

Does anyone know where the problem is?

UPDATE: In version 1.8.0 with the pipeline below, training is very fast, but accuracy dropped from 0.99 to 0.92.

language: vi

pipeline:
- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"

policies:
  - name: FallbackPolicy
  - name: MemoizationPolicy
  - name: FormPolicy
  - name: MappingPolicy
  - name: KerasPolicy
    featurizer:
      - name: FullDialogueTrackerFeaturizer
    attn_shift_range: 2
    embed_dim: 20
    epochs: 150
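
If you want to compare the two pipelines on the same data yourself, Rasa's cross-validation mode is one way to get comparable intent accuracy numbers. A minimal sketch (the data path is an assumption; adjust it to your project):

  # run 5-fold cross-validation on the NLU data to get intent accuracy
  rasa test nlu --nlu data/nlu.md --cross-validation --folds 5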

I think 30-40 minutes is reasonable.

It takes 20-25 minutes for me, and I also have about 1,000 examples. Try reducing the number of epochs for DIET, the Response Selector, TED, and the Keras Policy. You can use TensorBoard to find the optimal number of epochs (see the launch command after my config below).

This is my pipeline:

language: en_core_web_md
pipeline:
- name: SpacyNLP
- name: SpacyTokenizer
- name: SpacyFeaturizer
- name: RegexFeaturizer
  case_sensitive: false
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 141
  model_confidence: linear_norm
  loss_type: cross_entropy
  constrain_similarities: true
  evaluate_on_number_of_examples: 200
  evaluate_every_number_of_epochs: 5
  tensorboard_log_directory: ./.tensorboard
  tensorboard_log_level: epoch
  checkpoint_model: True
- name: RegexEntityExtractor
  case_sensitive: False
  use_lookup_tables: True
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 21
  model_confidence: linear_norm
  loss_type: cross_entropy
  constrain_similarities: true
  evaluate_on_number_of_examples: 5
  evaluate_every_number_of_epochs: 1
  tensorboard_log_directory: ./.tensorboard
  tensorboard_log_level: epoch
- name: FallbackClassifier
  threshold: 0.2
  ambiguity_threshold: 0.05

policies:
- name: AugmentedMemoizationPolicy
  max_history: 8
- name: TEDPolicy
  max_history: 8
  epochs: 41
  model_confidence: linear_norm
  loss_type: cross_entropy
  constrain_similarities: true
  evaluate_on_number_of_examples: 200
  evaluate_every_number_of_epochs: 5
  tensorboard_log_directory: ./.tensorboard
  tensorboard_log_level: epoch
- name: RulePolicy
  core_fallback_threshold: 0.2
  core_fallback_action_name: action_default_fallback
  enable_fallback_prediction: true
  restrict_rules: true
  check_for_contradictions: true

I’m on Rasa 2.3.x.
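
In case you haven't used it before: with tensorboard_log_directory set as in the config above, you can open the logged curves after (or during) training and pick the epoch where validation accuracy flattens out. A minimal sketch, assuming TensorBoard is installed in the same environment:

  # point TensorBoard at the log directory from the config above
  tensorboard --logdir ./.tensorboard

The evaluation metrics are logged every 5 epochs here because of evaluate_every_number_of_epochs: 5, so set epochs to roughly where the validation curve stops improving.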


I must admit that DIETClassifier does better than CRFEntityExtractor & EmbeddingIntentClassifier, but in my case 30 minutes was too long. Maybe I should accept using a lower version with CRFEntityExtractor & EmbeddingIntentClassifier. Thanks for your suggestion; I can accept it as a workaround.
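
If you go that route, a minimal sketch of the downgrade, assuming a pip-based install (1.8.0 is just the version mentioned above; pick whichever older release you tested with):

  pip install rasa==1.8.0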
