NLU detects random input with wrong intent and high confidence

harshitazilen · August 23, 2018, 7:21am

If I provide random input text like “…”, “???”, “-=-%$^” or “qwerty”, Rasa NLU returns an arbitrary intent with high confidence.

Such words are not part of the training data for any intent, yet NLU returns an intent with a confidence higher than the threshold.

Can you please advise how to fix this? I believe such inputs should get a lower confidence and go to the fallback intent.

souvikg10 · August 23, 2018, 12:34pm

Hi, can you please provide more info?

What is your pipeline? Do you have a garbage intent?

harshitazilen · August 23, 2018, 12:54pm

I don’t have a garbage intent. I tried the pipeline below:

pipeline:
  - name: "nlp_spacy"
  - name: "tokenizer_spacy"
  - name: "intent_featurizer_spacy"
  - name: "intent_classifier_sklearn"
  - name: "ner_crf"
  - name: "ner_synonyms"

I also recently tried the pipeline below and got the same result:

language: "en"

pipeline: "tensorflow_embedding"

amn41 · August 23, 2018, 1:20pm

Which version are you using? As of 0.13.0, tensorflow embedding will predict None for the intent if there are no in-vocab words; see the changelog: https://github.com/RasaHQ/rasa_nlu/blob/master/CHANGELOG.rst#0130---2018-08-02
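
To illustrate the behaviour described above, here is a minimal sketch. It is not the actual rasa_nlu code: it assumes tokens are runs of two or more word characters (similar to scikit-learn's default `CountVectorizer` token pattern), and `VOCAB`, `tokenize`, `has_in_vocab_word`, `parse` and the `classify` callback are hypothetical names.

```python
import re

# Hypothetical vocabulary collected from the training examples.
VOCAB = {"hello", "book", "flight", "the", "weather"}

# Assumption: a token is a run of two or more word characters,
# so punctuation-only input such as "???" or "." yields no tokens.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def has_in_vocab_word(text, vocab=VOCAB):
    """True if at least one token of the text is in the vocabulary."""
    return any(token in vocab for token in tokenize(text))


def parse(text, classify):
    """Return a None intent when no token of `text` is in the vocabulary.

    `classify` is a hypothetical callback returning (intent_name, confidence).
    """
    if not has_in_vocab_word(text):
        return {"intent": {"name": None, "confidence": 0.0}, "text": text}
    name, confidence = classify(text)
    return {"intent": {"name": name, "confidence": confidence}, "text": text}
```

Under these assumptions, a single in-vocab word is enough for the text to reach the classifier, so only input with no known word at all gets the None intent.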

harshitazilen · August 23, 2018, 1:25pm

My current Rasa NLU version is 0.13.1, yet I am still facing the issue.

datistiquo · August 23, 2018, 4:51pm

@amn41 Can you set it to a specific threshold when None is returned? If just one in-vocab word is used, maybe this makes sense.

amn41 · August 23, 2018, 7:17pm

what’s the issue exactly?

amn41 · August 23, 2018, 7:17pm

I’m not sure I understand what you mean. Do you mean defining a threshold on the number of out-of-vocab words?

datistiquo · August 23, 2018, 8:08pm

yes

harshitazilen · August 24, 2018, 5:31am

@amn41 NLU returns an intent with high confidence (higher than the threshold) for random words like “???”, “qwerty” etc. Such input should go to None or get a low confidence (lower than the threshold). We are discussing how to achieve that.
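
The desired behaviour can be sketched as a post-processing step on the parse result. This is a minimal sketch, assuming the result carries an `intent` dict with `name` and `confidence` keys as in the NLU server response; `FALLBACK_THRESHOLD`, `FALLBACK_INTENT` and `resolve_intent` are hypothetical names, not part of Rasa.

```python
# Hypothetical values; tune the threshold per project.
FALLBACK_THRESHOLD = 0.6
FALLBACK_INTENT = "fallback"


def resolve_intent(result, threshold=FALLBACK_THRESHOLD):
    """Map a parse result to an intent name, or to the fallback intent.

    Falls back when the intent is missing or None, or when its
    confidence is below the threshold.
    """
    intent = result.get("intent") or {}
    name = intent.get("name")
    confidence = intent.get("confidence") or 0.0
    if name is None or confidence < threshold:
        return FALLBACK_INTENT
    return name
```

A check like this only catches None and genuinely low scores. It does not help while the classifier itself reports an above-threshold confidence for random input.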

akelad · August 24, 2018, 12:42pm

Hmm I’ve never experienced this before, I always get the None intent for random words. @Ghostvv any ideas what’s happening?

Ghostvv · August 24, 2018, 12:58pm

Are you sure you don’t have at least one in-vocab word, like “the” for example?

harshitazilen · August 24, 2018, 1:10pm

Yes, the characters “–”, “-”, “<” and “@” on their own are not part of the training data. They may appear in training statements in combination with other words or characters.

harshitazilen · August 24, 2018, 1:11pm

All intents have training statements ending with the “.” character. If I type “.”, the bot matches it to an intent.

Ghostvv · August 24, 2018, 1:28pm

Sorry, it is hard to understand this way. Could you please put together an example script so we can reproduce this issue?

harshitazilen · August 27, 2018, 2:02pm

Hi @Ghostvv, I created a sample project with 2 intents. Please find the code links below. I tested by passing “???”, “qwerty” and “?”, and it detects a wrong intent with 0.7 - 0.8 confidence. I tested with Rasa NLU running as a server. Thanks.

Config file: https://pastebin.com/peyHYzNU
Domain file: https://pastebin.com/s8kmbsS5
Training JSON : https://pastebin.com/6UfXXzLL

Ghostvv · August 28, 2018, 9:05am

What version of rasa_nlu are you using? I tried with master, and the classifier returns None.

harshitazilen · August 28, 2018, 9:32am

I am using version 0.13.1 of Rasa NLU. Is master stable to use? My project is going into production soon.

Ghostvv · August 28, 2018, 9:53am

I just tried in a fresh virtual environment with rasa_nlu version 0.13.1, and I got:

INFO:tensorflow:Restoring parameters from projects/default/tf_model/intent_classifier_tensorflow_embedding.ckpt
{'intent': {'name': None, 'confidence': 0.0}, 'entities': [], 'intent_ranking': [], 'text': '???'}
{'intent': {'name': None, 'confidence': 0.0}, 'entities': [], 'intent_ranking': [], 'text': 'qwerty'}

Are you sure you are using 0.13.1? Could you please try creating a new virtual environment with 0.13.1?
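
A quick way to see which version the active environment actually imports is sketched below. It assumes the package exposes a `__version__` attribute (treated as an assumption for `rasa_nlu` here); `module_version` is a hypothetical helper.

```python
import importlib


def module_version(module_name):
    """Return the module's __version__ string, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed in the active environment.
        return None
    # Assumption: the package stores its version in __version__.
    return getattr(module, "__version__", None)
```

Calling `module_version("rasa_nlu")` from inside the new virtual environment reports the version that is imported there, or None if the import fails or the attribute is missing.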
