Huge intent recognition problems

Hello,

I have an intent goodbye with some German examples for saying goodbye:

## intent:goodbye
- bis später
- tschüss
- bye

and here is my config.yml for the spaCy model:

language: "de_core_news_sm"

pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_entity_featurizer_regex"
- name: "ner_crf"
  features: [
    ["low", "title", "upper"],
    ["bias", "low", "prefix5", "prefix2", "suffix5", "suffix3", "suffix2", "upper", "title", "digit", "pattern", "pos", "pos2"],
    ["low", "title", "upper"]]
- name: "intent_featurizer_spacy"
- name: "intent_classifier_sklearn"

After training the NLU and Core models, I try to communicate with the bot, but I run into some major problems. If I send it total nonsense like syedxgvhkmijgedsfhdrwysa, it falls into the intent goodbye with a confidence of about 47%.

UserUttered(text: syedxgvhkmijgedsfhdrwysa, intent: {'name': 'goodbye', 'confidence': 0.4790237086878828}, entities: [])

So one method to prevent such problems is to create a fallback policy and configure it with a high NLU confidence threshold. But are there any other methods to prevent these problems? Maybe configurations directly in spaCy?
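For reference, this is roughly how I would set that up in my policies config (a sketch only; the thresholds and the fallback action name are just values I picked as an example):

policies:
  - name: "FallbackPolicy"
    # trigger the fallback when NLU confidence drops below this value
    nlu_threshold: 0.6
    # trigger the fallback when Core action confidence drops below this value
    core_threshold: 0.3
    fallback_action_name: "action_default_fallback"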

Another question: the string above is a very weird combination of letters, with absolutely no resemblance to my goodbye examples. Why do I get such a high confidence?

I would like to know too, since I am having the same problem.

I’m running into the same issues and was going to make a similar post.

An example of my issue: I have an intent such as “RegisterVisitor” with an utterance like “John Doe will be visiting me at 2PM”. When I test it with something like “Phil is visiting tomorrow at 4PM”, one of two things happens: either it correctly identifies the intent as RegisterVisitor but with a low confidence (0.60 or lower), or it identifies the wrong intent altogether.
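For context, my training data for that intent currently looks roughly like this (simplified, with made-up example sentences and placeholder entity names):

## intent:RegisterVisitor
- [John Doe](visitor_name) will be visiting me at [2PM](visit_time)
- [Sarah](visitor_name) is coming by tomorrow at [10AM](visit_time)
- please register [Anna](visitor_name) as a visitor for [noon](visit_time)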

What are some ways around this?

Hi,

I would suggest you try replacing de_core_news_sm with de_core_news_md. This could improve your intent recognition. You might also want to add more data to your NLU. Most of the time, intent recognition or entity extraction issues come from a lack of training data and stories, but I could be wrong.
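For example, after downloading the medium model (python -m spacy download de_core_news_md), the only change in your config.yml would be the model name (a sketch, assuming you keep the rest of the pipeline as it is):

language: "de_core_news_md"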

I don’t think that increasing the NLU confidence threshold is the right move, though.

Regards.

Hi @lindig

Following @blacknight's suggestion: imagine it like a kid you want to teach the intent “goodbye”. Three samples are simply not enough to distinguish it from other intents, especially when the examples contain only a few words.
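Just as a rough illustration (the extra German phrases below are only suggestions, pick whatever fits your domain), the intent could look more like this:

## intent:goodbye
- bis später
- tschüss
- bye
- auf wiedersehen
- bis bald
- mach's gut
- ciao
- wir sehen uns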

To give you more advice, we would need to know your other intents, so we can judge whether they are distinct enough.

If you need more help, feel free to ask!
