OOV_token not working

Hi, there are problems with training the chatbot. When unknown words or sentences are entered, the chatbot still finds intents with a confidence higher than 0.3, and the answers for the intent with the highest confidence are chosen and displayed.

According to the manual, the OOV_token option of the CountVectorsFeaturizer can be used to recognize unknown words. The following change was made in config.yml:

```yaml
pipeline:
  - name: SpacyNLP
    case_sensitive: false
  - name: SpacyTokenizer
    use_lemma: false
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
    OOV_token: oov
    token_pattern: (?u)\b\w+\b
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
```

An intent was created with 4 NLU example sentences, but nothing changed during testing. The answers of the highest-rated intent are still displayed for unknown words or sentences, and the intents that are found still have a confidence higher than 0.3. How can this problem be solved?

The OOV_token is used to represent unknown words during prediction. It doesn't recognize them or anything similar. E.g. if you have a sentence full of unknown words, the classifier is able to see that there are many unknown words and makes its prediction based on that.
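As far as I know, the OOV_token also only has an effect if the token itself shows up in your training data (or is produced via the OOV_words option of the CountVectorsFeaturizer), otherwise it never becomes part of the vocabulary the classifier is trained on. A minimal sketch of such training examples, assuming the OOV_token: oov from your config and the Rasa 2.x YAML training data format; the intent name is only an illustration:

```yaml
nlu:
- intent: out_of_scope
  # the literal token "oov" stands in for words the bot has never seen
  examples: |
    - oov
    - oov oov oov
    - can you oov the oov for me
    - I would like to oov something
```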

I think the problem in your case is rather a lack of training data. In my opinion it could also help to add an out_of_scope intent with messages that are out of scope, so that the DIETClassifier can learn which messages shouldn't be mapped to any of the "in scope" intents.
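A minimal sketch of what that could look like, assuming the Rasa 2.x YAML format; the example sentences, the rule, and the response name utter_out_of_scope are only illustrations (the intent and the response also need to be listed in domain.yml):

```yaml
nlu:
- intent: out_of_scope
  examples: |
    - what is the weather on the moon
    - order me a pizza
    - tell me a bedtime story
    - how tall is the Eiffel Tower

rules:
# requires the RulePolicy to be part of your policies
- rule: Answer anything the bot is not built for
  steps:
  - intent: out_of_scope
  - action: utter_out_of_scope
```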

Check out the documentation here