I have about 1000 examples and 25 intents in my NLU file; 710 of the examples contain an entity (most have only one entity). Training takes about 30-40 minutes without a GPU (and about 6 minutes when I test it on Google Colab with a Tesla T4), which is quite a while. Is it because I have too much data, or because of the pipeline I chose?
Here is my pipeline:
language: vi
pipeline:
  - name: "WhitespaceTokenizer"
  - name: "RegexFeaturizer"
  - name: "CRFEntityExtractor"
  - name: "LexicalSyntacticFeaturizer"
  - name: "CountVectorsFeaturizer"
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: "EntitySynonymMapper"
policies:
  - name: TEDPolicy
    max_history: 5
    epochs: 100
    state_featurizer:
      - name: FullDialogueTrackerFeaturizer
    attn_shift_range: 2
    embed_dim: 20
    constrain_similarities: True
  - name: FallbackPolicy
    core_threshold: 0.5
    nlu_threshold: 0.4
  - name: FormPolicy
  - name: MappingPolicy
Does anyone know where the problem is?
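One way to narrow this down is to train the NLU and Core models separately and time each part, so you can see whether DIET or TED accounts for most of the 30-40 minutes (standard Rasa CLI commands):

# train only the NLU model (tokenizer, featurizers, DIETClassifier)
rasa train nlu
# train only the dialogue policies (TEDPolicy etc.)
rasa train core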
UPDATE
In version 1.8.0 with the pipeline below, training is very fast, but the accuracy has dropped from 0.99 to 0.92.
language: vi
pipeline:
  - name: "WhitespaceTokenizer"
  - name: "RegexFeaturizer"
  - name: "CRFEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "CountVectorsFeaturizer"
  - name: "EmbeddingIntentClassifier"
policies:
  - name: FallbackPolicy
  - name: MemoizationPolicy
  - name: FormPolicy
  - name: MappingPolicy
  - name: KerasPolicy
    state_featurizer:
      - name: FullDialogueTrackerFeaturizer
    attn_shift_range: 2
    embed_dim: 20
    epochs: 150
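The accuracy numbers above can be compared on equal footing with a cross-validated NLU evaluation. This is only a sketch: the data path is an assumption, so adjust it to your project layout.

# run cross-validation on the NLU data and report intent/entity scores
rasa test nlu --nlu data/nlu.md --cross-validation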
I think 30-40 minutes is reasonable.
It takes 20-25 minutes for me, and I also have about 1000 examples. Try reducing the number of epochs for DIET, the Response Selector, TED, and the Keras Policy. You can use TensorBoard to find the optimal number of epochs.
This is my pipeline:
language: en_core_web_md
pipeline:
  - name: SpacyNLP
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: RegexFeaturizer
    case_sensitive: false
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 141
    model_confidence: linear_norm
    loss_type: cross_entropy
    constrain_similarities: true
    evaluate_on_number_of_examples: 200
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: ./.tensorboard
    tensorboard_log_level: epoch
    checkpoint_model: True
  - name: RegexEntityExtractor
    case_sensitive: False
    use_lookup_tables: True
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 21
    model_confidence: linear_norm
    loss_type: cross_entropy
    constrain_similarities: true
    evaluate_on_number_of_examples: 5
    evaluate_every_number_of_epochs: 1
    tensorboard_log_directory: ./.tensorboard
    tensorboard_log_level: epoch
  - name: FallbackClassifier
    threshold: 0.2
    ambiguity_threshold: 0.05
policies:
  - name: AugmentedMemoizationPolicy
    max_history: 8
  - name: TEDPolicy
    max_history: 8
    epochs: 41
    model_confidence: linear_norm
    loss_type: cross_entropy
    constrain_similarities: true
    evaluate_on_number_of_examples: 200
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: ./.tensorboard
    tensorboard_log_level: epoch
  - name: RulePolicy
    core_fallback_threshold: 0.2
    core_fallback_action_name: action_default_fallback
    enable_fallback_prediction: true
    restrict_rules: true
    check_for_contradictions: true
I’m on Rasa 2.3.x.
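With checkpoint_model and the TensorBoard options set as above, you can inspect the training curves and pick the epoch where validation accuracy stops improving. Point TensorBoard at the log directory from the config:

# the directory matches tensorboard_log_directory in the pipeline above
tensorboard --logdir ./.tensorboard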
I must admit that DIETClassifier performs better than CRFEntityExtractor & EmbeddingIntentClassifier, but in my case 30 minutes is too long. Maybe I should settle for an older version with CRFEntityExtractor & EmbeddingIntentClassifier.
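If downgrading is not an option, another workaround (a sketch based on the DIET migration notes, not something I have benchmarked on this data) is to keep CRFEntityExtractor for entities and run DIETClassifier without transformer layers and without entity recognition, which roughly mimics the old EmbeddingIntentClassifier and is much cheaper per epoch:

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CRFEntityExtractor
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    intent_classification: True
    entity_recognition: False       # entities are handled by CRFEntityExtractor above
    use_masked_language_model: False
    number_of_transformer_layers: 0 # drop the transformer to cut training time
    epochs: 100
  - name: EntitySynonymMapper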
Thanks for your suggestion; I can accept it as a workaround.