Hello, can anyone please help me? I have been using Rasa for months, and I recently started having an issue. I built a model with 5 different intents (each with a different amount of training data). The model works fine when I use the same words as in the training data, but as soon as I try a new sentence that should be out of scope, it predicts the intent 'small_talk', which is one of the 5 intents, with very high confidence. That is not logical, so can anyone tell me what I should do? Should I change the hyperparameters or change the classifier (I'm using the embedding classifier)? Thank you!
What version of Rasa Open Source and what config are you using?
I have the 1.8.2 version of Rasa with this config file:

pipeline:
- name: "WhitespaceTokenizer_omran_ar"
  case_sensitive: false
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"
  batch_strategy: sequence
- name: "Extracteur_omran"
policies:
- name: EmbeddingPolicy
  max_history: 5
  epochs: 200
  batch_size: 50
- name: "MemoizationPolicy"
  max_history: 5
- name: "FallbackPolicy"
  nlu_threshold: 0.4
  core_threshold: 0.3
- name: MappingPolicy
Given your config, the behavior is very logical. CountVectorsFeaturizer creates its vocabulary from the training data. If your input sentence consists of unseen words, it ignores them. Please take a look at Handling Out-Of-Vocabulary (OOV) words in the Components documentation, or try using some pretrained embeddings in addition (see Choosing a Pipeline).
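To illustrate why unseen words lead to this behavior, here is a minimal sketch in plain Python. It is not Rasa's actual implementation: `build_vocabulary` and `featurize` are hypothetical helpers, and the training sentences are made up. It only mimics the idea that a count-based featurizer drops words it never saw, unless an OOV token is present in the training data.

```python
from collections import Counter


def build_vocabulary(sentences):
    """Map every lowercase, whitespace-separated training token to an index."""
    tokens = sorted({tok for s in sentences for tok in s.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def featurize(sentence, vocab, oov_token=None):
    """Return bag-of-words counts for `sentence`.

    Unseen tokens are dropped, unless `oov_token` is given and is itself
    part of the vocabulary, in which case they are counted as that token.
    """
    counts = Counter()
    for tok in sentence.lower().split():
        if tok in vocab:
            counts[tok] += 1
        elif oov_token is not None and oov_token in vocab:
            counts[oov_token] += 1
    vector = [0] * len(vocab)
    for tok, n in counts.items():
        vector[vocab[tok]] = n
    return vector


# Hypothetical training sentences, for illustration only.
training = ["hello there", "book a flight"]
vocab = build_vocabulary(training)

# Every word is unseen, so nothing is counted and the vector is all zeros.
print(featurize("quantum physics lecture", vocab))

# With the OOV token in the training data, unseen words are counted as "oov".
vocab_oov = build_vocabulary(training + ["oov oov"])
print(featurize("quantum physics lecture", vocab_oov, oov_token="oov"))
```

In the first case the classifier receives an empty feature vector, so its prediction cannot depend on what was actually typed. In the second case the unseen words become a signal the classifier can learn from.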
I'm sorry, but I just read the article about handling OOV words and I still don't get it. I tried this solution: I added the word oov to the data of a new intent 'outofcontext' and added this line, as I saw in a post: - name: "intent_featurizer_count_vectors" OOV_token: oov. Did I do something wrong, or is the solution incomplete? I also don't really understand the other solution about pretrained embeddings.
Sounds correct. I would add a couple of examples with different amounts of oov words.
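One possible way to produce such examples is sketched below. This is only an assumption about how the data could be generated, not an official Rasa utility: `make_oov_examples` is a hypothetical helper, the seed sentences are invented, and the output uses the same text/intent shape as the examples in this thread.

```python
import json
import random


def make_oov_examples(sentences, intent="outofcontext", oov_token="oov", seed=0):
    """For each seed sentence, emit variants with 1..N words replaced by oov_token."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    examples = []
    for sentence in sentences:
        tokens = sentence.split()
        for n_replaced in range(1, len(tokens) + 1):
            # Pick which token positions to replace in this variant.
            positions = set(rng.sample(range(len(tokens)), n_replaced))
            text = " ".join(
                oov_token if i in positions else tok
                for i, tok in enumerate(tokens)
            )
            examples.append({"text": text, "intent": intent})
    return examples


# Hypothetical seed sentences, for illustration only.
seeds = ["peux tu faire cela", "je veux savoir"]
for example in make_oov_examples(seeds):
    print(json.dumps(example, ensure_ascii=False))
```

Each seed sentence yields one variant per possible number of replaced words, up to a sentence made only of oov tokens, so the classifier sees the token in different amounts and positions.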
I still have the same problem. I added the oov words to the data; I'll show you some. I think something is wrong, because it's still not working: a lot of meaningless sentences get associated with other intents, even though I added the 'outofcontext' intent, which contains data such as {"text": "peux tu oov", "intent": "outofcontext"}, {"text": "oov oov", "intent": "outofcontext"}. Can you please give me any advice to make it work?
Did adding oov improve the situation?
No, it didn't. It's still associating new phrases with other intents. Is there any other solution, or is it not working because I didn't put much data in the 'outofcontext' intent?
I understand it didn’t solve the problem, but do you see any improvements?
No improvements. I added more data and there is still no improvement.