I am having a problem with Rasa when running these commands:
rasa train --num-threads -1
rasa test nlu --cross-validation
When running either of these commands, the process randomly hangs. Sometimes it runs fine, sometimes it hangs. I have already checked and it is not a memory problem, but I haven’t found the cause yet.
Looking at htop, I see the process occupying around 100% of the CPU, but the memory is not fully used:
@dsmendes can you please share the output of rasa --version and your system configuration, such as CPU, RAM and HDD space? That’s a really interesting issue.
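For anyone collecting the system details requested here, a minimal sketch using only the Python standard library might look like the following. The function name `collect_system_info` and the dictionary keys are illustrative choices, not part of Rasa; the Rasa version itself still has to come from rasa --version.

```python
import os
import platform
import shutil


def collect_system_info(path="."):
    """Return basic CPU, RAM and disk figures for a bug report (hypothetical helper)."""
    info = {
        "platform": platform.platform(),
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
    }
    # Total physical RAM: os.sysconf only exists on Unix-like systems,
    # so fall back to None when it is missing or the names are unknown.
    try:
        total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        info["ram_gib"] = round(total_bytes / 2**30, 1)
    except (AttributeError, ValueError, OSError):
        info["ram_gib"] = None
    # Free space on the filesystem that contains `path`.
    info["disk_free_gib"] = round(shutil.disk_usage(path).free / 2**30, 1)
    return info


if __name__ == "__main__":
    for key, value in collect_system_info().items():
        print(f"{key}: {value}")
```

Pasting the printed lines next to the rasa --version output gives the CPU, RAM and disk figures in one place.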
@dsmendes Can I ask why you need cross-validation? Are you comparing NLU performance? Brief info about your use case would help, if you don’t mind sharing. Thanks.
@dsmendes are you able to train the model without problems, or does training also show error or warning messages?
@dsmendes are you using a customised pipeline in config.yml?
I customized the suggested config. It is in the .tar in the post.
When training works, everything runs correctly. When it hangs, however, the logs are:
Training NLU model...
2022-01-06 09:48:11 INFO rasa.nlu.utils.spacy_utils - Trying to load spacy model with name 'en_core_web_md'
2022-01-06 09:48:13 INFO rasa.nlu.components - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-en_core_web_md'.
2022-01-06 09:48:13 INFO rasa.shared.nlu.training_data.training_data - Training data stats:
2022-01-06 09:48:13 INFO rasa.shared.nlu.training_data.training_data - Number of intent examples: 1706 (14 distinct intents)
2022-01-06 09:48:13 INFO rasa.shared.nlu.training_data.training_data - Found intents: 'cost', 'deny', 'consumption_comparison', 'greet', 'cost_comparison', 'consumption', 'mood_great', 'goodbye', 'affirm', 'bot_challenge', 'nlu_fallback', 'tariff_comparison', 'mood_unhappy', 'tariff'
2022-01-06 09:48:13 INFO rasa.shared.nlu.training_data.training_data - Number of response examples: 0 (0 distinct responses)
2022-01-06 09:48:13 INFO rasa.shared.nlu.training_data.training_data - Number of entity examples: 32 (1 distinct entities)
2022-01-06 09:48:13 INFO rasa.shared.nlu.training_data.training_data - Found entity types: 'tariff_type'
2022-01-06 09:48:13 INFO rasa.nlu.model - Starting to train component SpacyNLP
2022-01-06 09:48:14 INFO rasa.nlu.model - Finished training component.
2022-01-06 09:48:14 INFO rasa.nlu.model - Starting to train component SpacyTokenizer
2022-01-06 09:48:14 INFO rasa.nlu.model - Finished training component.
2022-01-06 09:48:14 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2022-01-06 09:48:14 INFO rasa.nlu.model - Finished training component.
2022-01-06 09:48:14 INFO rasa.nlu.model - Starting to train component SpacyFeaturizer
2022-01-06 09:48:15 INFO rasa.nlu.model - Finished training component.
2022-01-06 09:48:15 INFO rasa.nlu.model - Starting to train component DucklingEntityExtractor
2022-01-06 09:48:15 INFO rasa.nlu.model - Finished training component.
2022-01-06 09:48:15 INFO rasa.nlu.model - Starting to train component RegexEntityExtractor
2022-01-06 09:48:15 INFO rasa.nlu.model - Finished training component.
2022-01-06 09:48:15 INFO rasa.nlu.model - Starting to train component CRFEntityExtractor
2022-01-06 09:48:15 INFO rasa.nlu.model - Finished training component.
2022-01-06 09:48:15 INFO rasa.nlu.model - Starting to train component EntitySynonymMapper
2022-01-06 09:48:15 INFO rasa.nlu.model - Finished training component.
2022-01-06 09:48:15 INFO rasa.nlu.model - Starting to train component SklearnIntentClassifier
Fitting 2 folds for each of 6 candidates, totalling 12 fits
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en
pipeline:
# No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# If you'd like to customize it, uncomment and adjust the pipeline.
# See https://rasa.com/docs/rasa/tuning-your-model for more information.
  - name: SpacyNLP
    model: "en_core_web_md"
    case_sensitive: False
  - name: SpacyTokenizer
  - name: "RegexFeaturizer"
    "case_sensitive": False
    "use_word_boundaries": True
  - name: SpacyFeaturizer
  - name: DucklingEntityExtractor
    url: "http://duckling:8000"
    dimensions: [ "time", "duration"]
    locale: "en_GB"
    timezone: "Europe/London"
    timeout: 3
  - name: RegexEntityExtractor
    case_sensitive: False
    use_lookup_tables: True
    use_regexes: True
    "use_word_boundaries": True
  - name: CRFEntityExtractor
    "BILOU_flag": True
    "max_iterations": 50
    "L1_c": 0.1
    "L2_c": 0.1
    "featurizers": [ ]
  - name: EntitySynonymMapper
  - name: SklearnIntentClassifier
    C: [ 1, 2, 5, 10, 20, 100 ]
    kernels: [ "linear" ]
    "gamma": [ 0.1 ]
    "max_cross_validation_folds": 5
    "scoring_function": "f1_weighted"
  - name: FallbackClassifier
    threshold: 0.5
# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
# No configuration for policies was provided. The following default policies were used to train your model.
# If you'd like to customize them, uncomment and adjust the policies.
# See https://rasa.com/docs/rasa/policies for more information.
  - name: RulePolicy
    enable_fallback_prediction: true
    core_fallback_action_name: action_default_fallback
    core_fallback_threshold: 0.3
@dsmendes do you really need the SVM hyperparameter grid search?
@dsmendes Please confirm whether you are getting any error message while SklearnIntentClassifier is running. In my experience, hyperparameter tuning with grid search takes a lot of time to select the best parameters, and even 5-fold cross-validation is a lot for an SVM.
Fitting 2 folds for each of 6 candidates, totalling 12 fits
Judging by this message, it is working and it is on 2 folds, so it needs to run 3 more folds, and then it will show the Finished training component message.
To cross-check, try setting only 2 folds and see whether it gives you a Finished message or not.
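As a side note on reading the "Fitting 2 folds for each of 6 candidates, totalling 12 fits" line: taken literally, it describes 12 fits in total, that is, 2 folds for each of the 6 values of C in the posted config.yml. The sketch below reproduces that arithmetic in plain Python. The helper `assumed_num_folds` is an assumption about how the fold count might be capped by the smallest intent (which would explain seeing 2 folds with max_cross_validation_folds set to 5); it is not confirmed anywhere in this thread, and the example counts passed to it are made up.

```python
from itertools import product

# Hyperparameter grid copied from the SklearnIntentClassifier entry in config.yml.
param_grid = {
    "C": [1, 2, 5, 10, 20, 100],
    "kernel": ["linear"],
    "gamma": [0.1],
}


def count_candidates(grid):
    """Number of distinct hyperparameter combinations in the grid."""
    return len(list(product(*grid.values())))


def assumed_num_folds(examples_per_intent, max_folds=5):
    """ASSUMPTION: folds are limited by the smallest intent and never drop below 2."""
    return max(2, min(max_folds, min(examples_per_intent) // 5))


candidates = count_candidates(param_grid)
folds = 2  # value reported in the log line
print(
    f"Fitting {folds} folds for each of {candidates} candidates, "
    f"totalling {folds * candidates} fits"
)

# Hypothetical per-intent example counts, for illustration only.
print(assumed_num_folds([12, 300]))   # smallest intent has 12 examples
print(assumed_num_folds([100, 300]))  # smallest intent has 100 examples
```

Under that assumption, an intent with fewer than 15 examples would keep the search at 2 folds regardless of the configured maximum.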
The strange thing is that sometimes it finishes the process in a few seconds. That is why I do not understand the behaviour.
Please see the logs below. In this run it worked fine.
Training NLU model...
2022-01-06 14:14:53 INFO rasa.nlu.utils.spacy_utils - Trying to load spacy model with name 'en_core_web_md'
2022-01-06 14:14:54 INFO rasa.nlu.components - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-en_core_web_md'.
2022-01-06 14:14:54 INFO rasa.shared.nlu.training_data.training_data - Training data stats:
2022-01-06 14:14:54 INFO rasa.shared.nlu.training_data.training_data - Number of intent examples: 938 (7 distinct intents)
2022-01-06 14:14:54 INFO rasa.shared.nlu.training_data.training_data - Found intents: 'mood_unhappy', 'bot_challenge', 'goodbye', 'mood_great', 'deny', 'greet', 'affirm'
2022-01-06 14:14:54 INFO rasa.shared.nlu.training_data.training_data - Number of response examples: 0 (0 distinct responses)
2022-01-06 14:14:54 INFO rasa.shared.nlu.training_data.training_data - Number of entity examples: 0 (0 distinct entities)
2022-01-06 14:14:54 INFO rasa.nlu.model - Starting to train component SpacyNLP
2022-01-06 14:14:55 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:55 INFO rasa.nlu.model - Starting to train component SpacyTokenizer
2022-01-06 14:14:55 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:55 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2022-01-06 14:14:55 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:55 INFO rasa.nlu.model - Starting to train component SpacyFeaturizer
2022-01-06 14:14:55 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:55 INFO rasa.nlu.model - Starting to train component DucklingEntityExtractor
2022-01-06 14:14:55 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:55 INFO rasa.nlu.model - Starting to train component RegexEntityExtractor
2022-01-06 14:14:55 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:55 INFO rasa.nlu.model - Starting to train component CRFEntityExtractor
2022-01-06 14:14:55 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:55 INFO rasa.nlu.model - Starting to train component EntitySynonymMapper
2022-01-06 14:14:55 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:55 INFO rasa.nlu.model - Starting to train component SklearnIntentClassifier
Fitting 2 folds for each of 6 candidates, totalling 12 fits
2022-01-06 14:14:56 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:56 INFO rasa.nlu.model - Starting to train component FallbackClassifier
2022-01-06 14:14:56 INFO rasa.nlu.model - Finished training component.
2022-01-06 14:14:56 INFO rasa.nlu.model - Successfully saved model into '/tmp/tmpat9nhj6s/nlu'
NLU model training completed.
2022-01-06 14:14:58 INFO rasa.nlu.components - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-en_core_web_md'.
Training Core model...
Processed story blocks: 100%|█████| 3/3 [00:00<00:00, 2818.75it/s, # trackers=1]
Processed story blocks: 100%|█████| 3/3 [00:00<00:00, 1303.79it/s, # trackers=3]
Processed story blocks: 100%|█████| 3/3 [00:00<00:00, 304.89it/s, # trackers=12]
Processed story blocks: 100%|██████| 3/3 [00:00<00:00, 89.93it/s, # trackers=39]
Processed rules: 100%|████████████| 4/4 [00:00<00:00, 3918.08it/s, # trackers=1]
Processed trackers: 100%|███████████| 4/4 [00:00<00:00, 4039.78it/s, # action=9]
Processed actions: 9it [00:00, 15847.50it/s, # examples=8]
Processed trackers: 100%|██████████| 3/3 [00:00<00:00, 2001.74it/s, # action=12]
Processed trackers: 100%|███████████████████████| 4/4 [00:00<00:00, 2895.12it/s]
Processed trackers: 100%|███████████████████████| 7/7 [00:00<00:00, 1912.21it/s]
2022-01-06 14:14:59 INFO rasa.core.agent - Persisted model to '/tmp/tmpat9nhj6s/core'
Core model training completed.
Your Rasa model is trained and saved at '/home/models/20220106-141501.tar.gz'.
@dsmendes Right, it’s strange behaviour, and I can see you have a good amount of RAM for processing. Try clearing the system cache, deleting the older trained model, and re-training, this time with 3 folds.
My replies may be delayed as I’m facing technical issues on the forum.
@nik202 I am not able to train reliably with 2 folds either. The behaviour is the same with 2, 3 or 5 folds. Sometimes it finishes, sometimes it hangs. There is no pattern here.
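Since the hang is intermittent, one way to put numbers on it is to run the same command several times under a timeout and count the outcomes. The following is only a sketch: `run_with_timeout` and `tally_runs` are hypothetical helper names, the timeout and attempt count are arbitrary, and the demo uses a stand-in command so that it runs anywhere.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_s):
    """Run `command` once and classify the outcome as 'ok', 'failed' or 'hung'."""
    try:
        result = subprocess.run(
            command,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising this exception.
        return "hung"
    return "ok" if result.returncode == 0 else "failed"


def tally_runs(command, attempts, timeout_s):
    """Repeat the command and count how often each outcome occurs."""
    counts = {"ok": 0, "failed": 0, "hung": 0}
    for _ in range(attempts):
        counts[run_with_timeout(command, timeout_s)] += 1
    return counts


if __name__ == "__main__":
    # Stand-in command; replace it with the training command from the first post,
    # e.g. ["rasa", "train", "--num-threads", "-1"], and a generous timeout.
    demo_command = [sys.executable, "-c", "print('done')"]
    print(tally_runs(demo_command, attempts=3, timeout_s=30))
```

Repeating the tally with different thread or fold settings would show whether the hang rate changes with them. Note that on timeout only the direct child process is killed, so any worker processes it spawned may need to be cleaned up separately.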