It's weird: in training mode, Rasa predicts intents and entities correctly. In shell mode, however, the accuracy is bad.
My configuration:
language: fr
pipeline:
  - name: WhitespaceTokenizer
    case_sensitive: false
  - name: CRFEntityExtractor
    BILOU_flag: true
    features:
      - - low
        - title
        - upper
      - - bias
        - low
        - prefix5
        - prefix2
        - suffix5
        - suffix3
        - suffix2
        - upper
        - title
        - digit
        - pattern
      - - low
        - title
        - upper
  - name: EntitySynonymMapper
  - name: CountVectorsFeaturizer
    intent_tokenization_flag: true
    intent_split_symbol: +
  - name: EmbeddingIntentClassifier
  - name: RegexFeaturizer
  - name: "DucklingHTTPExtractor"
    url: "http://localhost:8000"
    dimensions: ["time", "number", "amount-of-money", "distance"]
    locale: "fr_FR"
    timezone: "Europe/Paris"
    timeout: 3
policies:
  - name: KerasPolicy
    epochs: 700
    batch_size: 100
    featurizer:
      - name: MaxHistoryTrackerFeaturizer
        max_history: 5
        state_featurizer:
          - name: BinarySingleStateFeaturizer
  - name: MemoizationPolicy
    max_history: 5
  - name: FallbackPolicy
    nlu_threshold: 0.7
    core_threshold: 0.4
    fallback_action_name: utter_oupsomethingfailed
  - name: FormPolicy
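One setting in this configuration that may make a live conversation feel worse than the test scores is the FallbackPolicy. With nlu_threshold: 0.7, a message whose top intent scores below 0.7 triggers utter_oupsomethingfailed even when that top intent is the right one, whereas the intent evaluation only compares the top intent with the expected one. The Python sketch below is a simplified, hypothetical restatement of the two threshold checks, not Rasa's implementation; the function name and the example confidences are made up for illustration.

```python
# Thresholds copied from the FallbackPolicy entry in the configuration above.
NLU_THRESHOLD = 0.7
CORE_THRESHOLD = 0.4


def would_fall_back(nlu_confidence: float, core_confidence: float) -> bool:
    """Simplified, hypothetical version of the two threshold checks.

    This is not Rasa's code: it ignores the ambiguity threshold and the
    way policy priorities are resolved in the ensemble.
    """
    return nlu_confidence < NLU_THRESHOLD or core_confidence < CORE_THRESHOLD


# Made-up confidences, for illustration only.
print(would_fall_back(0.65, 0.90))  # intent confidence below 0.7
print(would_fall_back(0.85, 0.90))  # both values above their thresholds
```

If the shell often answers with the fallback action, checking the intent confidence of those messages (for example with rasa shell nlu) would show whether this threshold is involved.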
Result of rasa test:
valentin@mbp-de-valentin archelot % rasa test
2020-01-27 20:09:25 INFO absl - Entry Point [tensor2tensor.envs.tic_tac_toe_env:TicTacToeEnv] registered with id [T2TEnv-TicTacToeEnv-v0]
2020-01-27 20:09:25 INFO rasa.core.policies.ensemble - MappingPolicy not included in policy ensemble. Default intents 'restart and back will not trigger actions 'action_restart' and 'action_back'.
Processed Story Blocks: 0%| | 0/29 [00:00<?, ?it/s, # trackers=1]/usr/local/lib/python3.7/site-packages/rasa/core/slots.py:217: UserWarning: Categorical slot 'sexe' is set to a value ('femmme') that is not specified in the domain. Value will be ignored and the slot will behave as if no value is set. Make sure to add all values a categorical slot should store to the domain.
f"Categorical slot '{self.name}' is set to a value "
Processed Story Blocks: 100%|███████████████████████████████████████████████████████| 29/29 [00:00<00:00, 629.86it/s, # trackers=1]
2020-01-27 20:09:25 INFO rasa.core.test - Evaluating 14 stories
Progress:
100%|██████████████████████████████████████████████████████████████████████████████████████████████| 14/14 [00:01<00:00, 10.88it/s]
2020-01-27 20:09:26 INFO rasa.core.test - Finished collecting predictions.
2020-01-27 20:09:26 INFO rasa.core.test - Evaluation Results on CONVERSATION level:
2020-01-27 20:09:26 INFO rasa.core.test - Correct: 12 / 14
2020-01-27 20:09:26 INFO rasa.core.test - F1-Score: 0.923
2020-01-27 20:09:26 INFO rasa.core.test - Precision: 1.000
2020-01-27 20:09:26 INFO rasa.core.test - Accuracy: 0.857
2020-01-27 20:09:26 INFO rasa.core.test - In-data fraction: 0.976
2020-01-27 20:09:26 INFO rasa.core.test - Evaluation Results on ACTION level:
2020-01-27 20:09:26 INFO rasa.core.test - Correct: 244 / 246
2020-01-27 20:09:26 INFO rasa.core.test - F1-Score: 0.992
2020-01-27 20:09:26 INFO rasa.core.test - Precision: 0.994
2020-01-27 20:09:26 INFO rasa.core.test - Accuracy: 0.992
2020-01-27 20:09:26 INFO rasa.core.test - In-data fraction: 0.976
2020-01-27 20:09:26 INFO rasa.core.test - Classification report:
                                   precision    recall  f1-score   support
utter_thanks                            1.00      1.00      1.00         6
utter_ask_precision_s                   1.00      1.00      1.00         6
utter_favoris_ask_train                 1.00      1.00      1.00        14
utter_onboarding_crush                  1.00      1.00      1.00        14
utter_onboarding_mission                1.00      1.00      1.00        14
utter_goodbye                           1.00      1.00      1.00         3
form_find_someone                       1.00      1.00      1.00         6
action_reset_slot_find_someone          1.00      1.00      1.00         6
utter_onboarding_limit                  1.00      1.00      1.00        14
utter_show_menu                         1.00      0.86      0.92        14
utter_greet                             1.00      1.00      1.00        14
utter_interest_find_someone_false       1.00      1.00      1.00         6
utter_interest_find_someone_true        1.00      1.00      1.00         8
utter_onboarding_goal                   1.00      1.00      1.00        14
action_ask_favoris_city                 1.00      1.00      1.00        14
utter_iamabot                           1.00      1.00      1.00         1
action_listen                           1.00      1.00      1.00        72
utter_resume_favoris_all                0.75      1.00      0.86         6
action_check_itineraire                 1.00      1.00      1.00         8
utter_resume_favoris_city               1.00      1.00      1.00         6
micro avg                               0.99      0.99      0.99       246
macro avg                               0.99      0.99      0.99       246
weighted avg                            0.99      0.99      0.99       246
2020-01-27 20:09:27 INFO rasa.nlu.test - Confusion matrix, without normalization:
[[14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 72 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 14 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 14 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 14 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 12 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6]]
2020-01-27 20:09:31 INFO rasa.nlu.test - Running model for predictions:
100%|████████████████████████████████████████████████████████████████████████████████████████████| 295/295 [00:03<00:00, 75.70it/s]
2020-01-27 20:09:35 INFO rasa.nlu.test - Intent evaluation results:
2020-01-27 20:09:35 INFO rasa.nlu.test - Intent Evaluation: Only considering those 295 examples that have a defined intent out of 295 examples
2020-01-27 20:09:35 INFO rasa.nlu.test - Classification report saved to results/intent_report.json.
2020-01-27 20:09:35 INFO rasa.nlu.test - Incorrect intent predictions saved to results/intent_errors.json.
2020-01-27 20:09:35 INFO rasa.nlu.test - Confusion matrix, without normalization:
[[16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 26 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 19 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 67 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 12 0 1 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 15 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 15 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 13 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 1 0 0 0 0 4 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0]
[ 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 10 0]
[ 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 19]]
2020-01-27 20:09:38 INFO rasa.nlu.test - Entity evaluation results:
2020-01-27 20:09:38 INFO rasa.nlu.test - Evaluation for entity extractor: CRFEntityExtractor
2020-01-27 20:09:38 INFO rasa.nlu.test - Classification report for 'CRFEntityExtractor' saved to 'results/CRFEntityExtractor_report.json'.
2020-01-27 20:09:38 INFO rasa.nlu.test - Incorrect entity predictions saved to results/CRFEntityExtractor_errors.json.
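The log prints the intent confusion matrix but no overall score, so here is a minimal Python sketch that rebuilds the 19x19 matrix from its non-zero cells and derives accuracy, recall and precision from it. It assumes the usual convention of rows for true intents and columns for predicted ones; the helper names are made up, and the intent labels are not shown in the log, so intents are referred to by index only.

```python
# Diagonal of the 19x19 intent confusion matrix (correct predictions per intent).
DIAGONAL = [16, 5, 26, 20, 19, 10, 10, 67, 12, 15, 15, 8, 13, 4, 6, 8, 8, 10, 19]
# Off-diagonal cells as (true_row, predicted_column) -> count, 0-indexed.
ERRORS = {(8, 10): 1, (13, 8): 1, (17, 8): 1, (18, 10): 1}


def build_matrix(diagonal, errors):
    """Rebuild the full square matrix from its non-zero cells."""
    size = len(diagonal)
    matrix = [[0] * size for _ in range(size)]
    for i, value in enumerate(diagonal):
        matrix[i][i] = value
    for (row, col), count in errors.items():
        matrix[row][col] = count
    return matrix


def accuracy(matrix):
    """Share of all examples that sit on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall(matrix, index):
    """Correct predictions divided by the row total (true examples)."""
    row_total = sum(matrix[index])
    return matrix[index][index] / row_total if row_total else 0.0


def precision(matrix, index):
    """Correct predictions divided by the column total (predictions made)."""
    col_total = sum(row[index] for row in matrix)
    return matrix[index][index] / col_total if col_total else 0.0


matrix = build_matrix(DIAGONAL, ERRORS)
print(f"examples: {sum(sum(row) for row in matrix)}")
print(f"accuracy: {accuracy(matrix):.3f}")
print(f"recall of intent 13: {recall(matrix, 13):.2f}")
print(f"precision of intent 8: {precision(matrix, 8):.2f}")
```

The cells add up to the 295 examples reported in the log, with 4 of them misclassified. If those 295 examples are the same ones the model was trained on (an assumption, the log does not say), a score this high mostly shows that the model fits its training data, not how it handles new phrasings typed in the shell.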
If the predictions are this good in the test results, why are they not when I chat with the bot? Is my setup bad? Would spaCy be better? What comparable configuration would you recommend?
Thanks for any tips.