Rasa Version: 1.10.11
Hi, I am trying to extract a person’s name from a single word or a full sentence and store it in a slot.
I want the bot to handle all 5 of the following cases.
example 1:
Bot: Please enter your name?
User: Akhil
example 2:
Bot: Please enter your name?
User: asldfkaskdfh
example 3:
Bot: Please enter your name?
User: My name is Akhil.
example 4:
Bot: Please enter your name?
User: My name is kjasdhfkkjasdf
example 5:
Bot: Please enter your name?
User: It’s asdfasdf
I’ve seen the OOV token used in the Sara bot and implemented it in a similar way, but I still cannot extract every Indian name.
The output of rasa shell nlu:
Next message:
my name is sai
{
"intent": {
"name": "inform",
"confidence": 0.9997296333312988
},
"entities": [],
"intent_ranking": [
{
"name": "inform",
"confidence": 0.9997296333312988
},
{
"name": "claim_status_enquiry",
"confidence": 0.0002703829959500581
}
],
"text": "my name is sai"
}
Next message:
my name is linga
{
"intent": {
"name": "inform",
"confidence": 0.9999778270721436
},
"entities": [],
"intent_ranking": [
{
"name": "inform",
"confidence": 0.9999778270721436
},
{
"name": "claim_status_enquiry",
"confidence": 2.2214911950868554e-05
}
],
"text": "my name is linga"
}
Next message:
I get the following logs while training:
(tensorflow) PS O:\Office\Chatbot\HealthCareChatbot\Chatbot\latest_vtest_v2> rasa train --debug
2020-09-02 14:38:38 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\core\healthcare.md' is 'unk'.
2020-09-02 14:38:38 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\nlu\healthcare.md' is 'md'.
2020-09-02 14:38:38 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\nlu\inform.md' is 'md'.
2020-09-02 14:38:38 DEBUG pykwalify.compat - Using yaml library: c:\users\akhilesh\.conda\envs\tensorflow\lib\site-packages\ruamel\yaml\__init__.py
2020-09-02 14:38:39 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\nlu\healthcare.md' is 'md'.
2020-09-02 14:38:39 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\nlu\inform.md' is 'md'.
2020-09-02 14:38:39 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\nlu\healthcare.md' is 'md'.
2020-09-02 14:38:39 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\nlu\inform.md' is 'md'.
2020-09-02 14:38:41 DEBUG rasa.model - Extracted model to 'C:\Users\Akhilesh\AppData\Local\Temp\tmpmsrdo7jr'.
2020-09-02 14:38:42 INFO rasa.model - Data (version) for Core model section changed.
2020-09-02 14:38:42 INFO rasa.model - Data (version) for NLU model section changed.
Training Core model...
2020-09-02 14:38:53 DEBUG rasa.core.nlg.generator - Instantiated NLG to 'TemplatedNaturalLanguageGenerator'.
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Generated trackers will be deduplicated based on their unique last 5 states.
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Number of augmentation rounds is 3
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Starting data generation round 0 ... (with 1 trackers)
Processed Story Blocks: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s, # trackers=1]
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Finished phase (1 training samples found).
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Data generation rounds finished.
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Found 0 unused checkpoints
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Starting augmentation round 0 ... (with 1 trackers)
Processed Story Blocks: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s, # trackers=1]
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Finished phase (2 training samples found).
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Starting augmentation round 1 ... (with 2 trackers)
Processed Story Blocks: 100%|███████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 969.33it/s, # trackers=1]
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Finished phase (4 training samples found).
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Starting augmentation round 2 ... (with 3 trackers)
Processed Story Blocks: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s, # trackers=1]
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Finished phase (6 training samples found).
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Found 6 training trackers.
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - Subsampled to 5 augmented training trackers.
2020-09-02 14:38:53 DEBUG rasa.core.training.generator - There are 1 original trackers.
2020-09-02 14:38:53 DEBUG rasa.core.agent - Agent trainer got kwargs: {}
2020-09-02 14:38:53 DEBUG rasa.core.featurizers - Creating states and action examples from collected trackers (by MaxHistoryTrackerFeaturizer(NoneType))...
Processed trackers: 100%|████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 256.41it/s, # actions=7]
2020-09-02 14:38:53 DEBUG rasa.core.featurizers - Created 7 action examples.
Processed actions: 7it [00:00, 87.17it/s, # examples=7]
2020-09-02 14:38:53 DEBUG rasa.core.policies.memoization - Memorized 7 unique examples.
2020-09-02 14:38:53 DEBUG rasa.core.featurizers - Creating states and action examples from collected trackers (by MaxHistoryTrackerFeaturizer(LabelTokenizerSingleStateFeaturizer))...
Processed trackers: 100%|████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 82.10it/s, # actions=11]
2020-09-02 14:38:53 DEBUG rasa.core.featurizers - Created 11 action examples.
2020-09-02 14:39:09 DEBUG rasa.utils.tensorflow.models - Building tensorflow train graph...
2020-09-02 14:39:27 DEBUG rasa.utils.tensorflow.models - Finished building tensorflow train graph.
Epochs: 100%|█████████████████████████████████████████████████████████████████████████| 100/100 [00:11<00:00, 8.73it/s, t_loss=0.133, loss=0.060, acc=1.000]
2020-09-02 14:39:39 INFO rasa.utils.tensorflow.models - Finished training.
2020-09-02 14:39:39 DEBUG rasa.core.featurizers - Creating states and action examples from collected trackers (by MaxHistoryTrackerFeaturizer(NoneType))...
Processed trackers: 100%|████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 462.03it/s, # actions=7]
2020-09-02 14:39:39 DEBUG rasa.core.featurizers - Created 7 action examples.
2020-09-02 14:39:39 DEBUG rasa.core.policies.memoization - Memorized 0 unique examples.
2020-09-02 14:39:40 INFO rasa.core.agent - Persisted model to 'C:\Users\Akhilesh\AppData\Local\Temp\tmpo7v9uef3\core'
Core model training completed.
Training NLU model...
2020-09-02 14:39:42 INFO rasa.nlu.utils.spacy_utils - Trying to load spacy model with name 'en'
2020-09-02 14:40:14 INFO rasa.nlu.components - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-en'.
2020-09-02 14:40:14 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\nlu\healthcare.md' is 'md'.
2020-09-02 14:40:14 DEBUG rasa.nlu.training_data.loading - Training data format of 'data\nlu\inform.md' is 'md'.
2020-09-02 14:40:14 INFO rasa.nlu.training_data.training_data - Training data stats:
2020-09-02 14:40:14 INFO rasa.nlu.training_data.training_data - Number of intent examples: 360 (2 distinct intents)
2020-09-02 14:40:14 INFO rasa.nlu.training_data.training_data - Found intents: 'inform', 'claim_status_enquiry'
2020-09-02 14:40:14 INFO rasa.nlu.training_data.training_data - Number of response examples: 0 (0 distinct responses)
2020-09-02 14:40:14 INFO rasa.nlu.training_data.training_data - Number of entity examples: 323 (3 distinct entities)
2020-09-02 14:40:14 INFO rasa.nlu.training_data.training_data - Found entity types: 'npi', 'claim_id', 'name'
2020-09-02 14:40:14 DEBUG rasa.nlu.training_data.training_data - Validating training data...
2020-09-02 14:40:14 INFO rasa.nlu.model - Starting to train component SpacyNLP
2020-09-02 14:40:15 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:40:15 INFO rasa.nlu.model - Starting to train component SpacyTokenizer
2020-09-02 14:40:15 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:40:15 INFO rasa.nlu.model - Starting to train component SpacyFeaturizer
2020-09-02 14:40:15 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:40:15 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2020-09-02 14:40:15 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:40:15 INFO rasa.nlu.model - Starting to train component LexicalSyntacticFeaturizer
2020-09-02 14:40:15 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:40:15 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2020-09-02 14:40:15 DEBUG rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - No text provided for response attribute in any messages of training data. Skipping training a CountVectorizer for it.
2020-09-02 14:40:16 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:40:16 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2020-09-02 14:40:16 DEBUG rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - No text provided for response attribute in any messages of training data. Skipping training a CountVectorizer for it.
2020-09-02 14:40:16 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:40:16 INFO rasa.nlu.model - Starting to train component DIETClassifier
2020-09-02 14:40:17 DEBUG rasa.utils.tensorflow.models - Building tensorflow train graph...
2020-09-02 14:40:45 DEBUG rasa.utils.tensorflow.models - Finished building tensorflow train graph.
Epochs: 100%|█████████████████████████████████| 100/100 [01:01<00:00, 1.63it/s, t_loss=0.784, i_loss=0.001, entity_loss=0.004, i_acc=1.000, entity_f1=0.989]
2020-09-02 14:41:47 INFO rasa.utils.tensorflow.models - Finished training.
2020-09-02 14:41:47 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:41:47 INFO rasa.nlu.model - Starting to train component EntitySynonymMapper
2020-09-02 14:41:47 INFO rasa.nlu.model - Finished training component.
2020-09-02 14:41:49 INFO rasa.nlu.model - Successfully saved model into 'C:\Users\Akhilesh\AppData\Local\Temp\tmpo7v9uef3\nlu'
NLU model training completed.
Your Rasa model is trained and saved at 'O:\Office\Chatbot\HealthCareChatbot\Chatbot\latest_vtest_v2\models\20200902-144150.tar.gz'.
My NLU training data for the inform intent:
## intent:inform
- My name is [James](name)
- my name is [Leota](name)
- Ok, it is [Minna](name)
- Its [Donette](name)
- It is [Abel](name)
- My name is oov
- my name is oov
- Ok, it is oov
- Its oov
- It is oov
- oov
- [Louis](name)
- [Josephine](name)
- [Lenna](name)
- [Mitsue](name)
- [Sage](name)
- [Kris](name)
- [Kiley](name)
- [Graciela](name)
My config.yml:
language: en
pipeline:
  - name: SpacyNLP
    case_sensitive: False
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
    OOV_token: oov
    token_pattern: (?u)\b\w+\b
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
  - name: FormPolicy
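My understanding of what OOV_token is supposed to do, written as a rough Python sketch rather than Rasa's actual implementation: at prediction time, any word that was not seen during training is replaced by the configured token, so "my name is sai" should be featurized like the "My name is oov" training example. The function name `substitute_oov` and the small vocabulary are hypothetical and only for illustration.

```python
from typing import List, Set

OOV_TOKEN = "oov"  # same value as OOV_token in my config


def substitute_oov(tokens: List[str], vocabulary: Set[str]) -> List[str]:
    """Replace every token that is not in the vocabulary with the OOV token."""
    return [t if t in vocabulary else OOV_TOKEN for t in tokens]


# Hypothetical vocabulary built from a few of my training examples.
vocabulary = {"my", "name", "is", "it", "its", "ok", "james", "leota", OOV_TOKEN}

print(substitute_oov("my name is sai".split(), vocabulary))
# ['my', 'name', 'is', 'oov']
```

If this picture is right, the intent should still be classified as inform for unseen names, which matches the shell output, but the oov examples in my training data carry no entity annotation.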
My slot mappings in actions.py:
def slot_mappings(self) -> Dict[Text, Union[Dict, List[Dict]]]:
    """A dictionary to map required slots to
    - an extracted entity
    - intent: value pairs
    - a whole message
    or a list of them, where a first match will be picked"""
    return {
        "person_name": [
            self.from_entity(entity="name"),
            self.from_text(intent="inform"),
        ]
    }
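To show what I think happens with these two mappings, here is a simplified, self-contained sketch of the "first match will be picked" rule from the docstring. It is not the FormAction code itself: `pick_first_match` and the plain dictionaries standing in for the mappings are assumptions made for illustration, and the parse result is trimmed from the shell output for "my name is sai".

```python
from typing import Any, Dict, List, Optional


def pick_first_match(
    parsed: Dict[str, Any], mappings: List[Dict[str, Any]]
) -> Optional[str]:
    """Return the value from the first mapping that yields one, else None."""
    intent = parsed["intent"]["name"]
    for mapping in mappings:
        # Skip mappings that are restricted to a different intent.
        if mapping.get("intent") not in (None, intent):
            continue
        if mapping["type"] == "from_entity":
            for entity in parsed["entities"]:
                if entity["entity"] == mapping["entity"]:
                    return entity["value"]
        elif mapping["type"] == "from_text":
            # Falls back to the whole user message.
            return parsed["text"]
    return None


# Simplified stand-ins for the two mappings of person_name.
mappings = [
    {"type": "from_entity", "entity": "name"},
    {"type": "from_text", "intent": "inform"},
]

# No entities were extracted for this message.
parsed = {"intent": {"name": "inform"}, "entities": [], "text": "my name is sai"}
print(pick_first_match(parsed, mappings))  # my name is sai
```

So when the name entity is missing, the slot ends up holding the whole sentence instead of just the name, which is the behaviour I am trying to avoid in examples 3 to 5.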