Rasa 1.10 DIETClassifier CPU and speed issues

Hi everyone, we recently migrated from Rasa 1.6 to Rasa 1.10.12. We are experiencing CPU issues (requiring more than 4 cores) and training speed issues (over ~30 minutes to train). In many cases the training just hangs as well. Our 1.6 pipeline never had CPU issues and finished training our model in under 10 minutes. Could this be an issue with the DIETClassifier? We are doing all our training on a CPU and don't have access to a GPU. I attached both our 1.6 pipeline and our 1.10 pipeline below. We also have a fairly large amount of training data: ~4,000 unique intent examples and 384 stories.

Rasa 1.6 pipeline

language: "en"

pipeline:
- name: "SpacyNLP"
  model: "en_core_web_md"
  case_sensitive: false
- name: "WhitespaceTokenizer"
  case_sensitive: false
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "CountVectorsFeaturizer"
  analyzer: "char_wb"
  min_ngram: 1 
  max_ngram: 4
- name: "EmbeddingIntentClassifier"
  
policies:
  - name: "FormPolicy"
  - name: "KerasPolicy"
    epochs: 150
    featurizer:
      - name: MaxHistoryTrackerFeaturizer
        max_history: 4
        state_featurizer:
          - name: BinarySingleStateFeaturizer
  - name: "MemoizationPolicy"
    max_history: 4
  - name: "FallbackPolicy"
    nlu_threshold: 0.3
    core_threshold: 0.4
    ambiguity_threshold: 0.01
    fallback_action_name: 'action_custom_fallback'

Rasa 1.10.12 pipeline

language: "en"

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    hidden_layers_sizes:
      text: [256, 128]
    number_of_transformer_layers: 0
    weight_sparsity: 0
    intent_classification: True
    entity_recognition: False
    use_masked_language_model: False
    BILOU_flag: False
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100



policies:
  - name: "FormPolicy"
  - name: "KerasPolicy"
    epochs: 150
    featurizer:
      - name: MaxHistoryTrackerFeaturizer
        max_history: 4
        state_featurizer:
          - name: BinarySingleStateFeaturizer
  - name: "MemoizationPolicy"
    max_history: 4
  - name: "FallbackPolicy"
    nlu_threshold: 0.3
    core_threshold: 0.4
    ambiguity_threshold: 0.01
    fallback_action_name: 'action_custom_fallback'

Thank you for your help!

@amittallapragada Thanks for raising your concern. I don't have a good explanation for this. We have seen a slight increase in training time in general, but a jump from under 10 minutes to over 30 minutes is unexpected. I have a couple of observations and questions.

I assume you saw in the training output that the DIETClassifier was causing this increase in training time? Did the training time for the other components stay the same?

What about the model performance itself? Does the 1.10 model perform as well as the model from Rasa 1.6?

"In many cases the training just hangs as well."

What exactly do you mean?

In your Rasa 1.6 pipeline you were using the CRFEntityExtractor, but your Rasa 1.10 pipeline no longer contains any entity extractor component. Is that on purpose? You could either add the CRFEntityExtractor back or use the entity extraction functionality of the DIETClassifier.
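For example, the second option would only require flipping entity_recognition in the DIETClassifier you already have (a sketch based on your posted config, not a tuned recommendation):

  - name: DIETClassifier
    hidden_layers_sizes:
      text: [256, 128]
    number_of_transformer_layers: 0
    weight_sparsity: 0
    intent_classification: True
    entity_recognition: True    # DIET now also extracts entities, replacing CRFEntityExtractor
    use_masked_language_model: False
    BILOU_flag: False

In that case you would also keep the EntitySynonymMapper and, as mentioned below, the LexicalSyntacticFeaturizer.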

Your Rasa 1.10 pipeline contains the featurizer LexicalSyntacticFeaturizer. This might create a lot of features that do not add any benefit. Can you please try to remove this component and train again? However, if you want to use the entity extraction functionality of DIETClassifier, it might be a good idea to keep this component.

If you are not using any entity extractor in your pipeline in Rasa 1.10, there is no need to keep the EntitySynonymMapper.

Do you have training data for the ResponseSelector? If not, you can also remove this component in the Rasa 1.10 pipeline.
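Putting these suggestions together, a trimmed-down version of your Rasa 1.10 NLU pipeline could look like this (just a sketch, assuming you have no ResponseSelector training data and do not need entity extraction for now; adjust as needed):

language: "en"

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    hidden_layers_sizes:
      text: [256, 128]
    number_of_transformer_layers: 0
    weight_sparsity: 0
    intent_classification: True
    entity_recognition: False
    use_masked_language_model: False
    BILOU_flag: False
  # LexicalSyntacticFeaturizer, EntitySynonymMapper, and ResponseSelector removed;
  # re-add an entity extractor plus the EntitySynonymMapper if you need entities.

If training is still slow with this reduced pipeline, that would point more clearly at the DIETClassifier itself rather than at the extra featurizers.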