If I provide random input text like “…”, “???”, “-=-%$^” or “qwerty”, Rasa NLU returns a high confidence for some intent.
Such inputs are not part of the training data for any intent, yet NLU returns an intent with a confidence above the threshold.
Can you please guide me on how to fix this? I believe such inputs should get a lower confidence and go to the fallback intent.
souvikg10
(Souvik Ghosh)
August 23, 2018, 12:34pm
2
Hi,
Can you please provide more info?
What is your pipeline?
Do you have a garbage intent?
I don’t have a garbage intent. I tried the pipeline below:
pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_featurizer_spacy"
- name: "intent_classifier_sklearn"
- name: "ner_crf"
- name: "ner_synonyms"
I also recently tried the pipeline below and got the same result:
language: "en"
pipeline: "tensorflow_embedding"
amn41
(Alan Nichol)
August 23, 2018, 1:20pm
4
Which version are you using? As of 0.13.0, tensorflow_embedding will predict None for the intent if there are no in-vocab words; see the changelog: https://github.com/RasaHQ/rasa_nlu/blob/master/CHANGELOG.rst#0130---2018-08-02
My current Rasa NLU version is 0.13.1, yet I am still facing the issue.
@amn41 Can you set it to a specific threshold when None is returned? If just one in-vocab word is used, maybe this makes sense.
amn41
(Alan Nichol)
August 23, 2018, 7:17pm
7
what’s the issue exactly?
amn41
(Alan Nichol)
August 23, 2018, 7:17pm
8
I’m not sure I understand what you mean - you mean defining a threshold on the number of out-of-vocab words?
@amn41 NLU returns a high confidence (higher than the threshold) for some intent on random inputs like “???”, “qwerty” etc. Such inputs should map to None or get a low confidence (lower than the threshold). We are discussing how to achieve that.
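For readers following along, the threshold behaviour being asked for can be sketched on the client side. This is only an illustration, not part of Rasa NLU: the function name `resolve_intent`, the fallback intent name, and the 0.5 threshold are all assumptions. The only thing taken from this thread is the shape of the parse result (an `intent` dict with `name` and `confidence`).

```python
from typing import Any, Dict

# Hypothetical values chosen for illustration; neither comes from Rasa NLU.
FALLBACK_INTENT = "fallback"
THRESHOLD = 0.5


def resolve_intent(parse_result: Dict[str, Any],
                   threshold: float = THRESHOLD) -> str:
    """Return the predicted intent name, or the fallback intent when the
    classifier returned None or a confidence below the threshold."""
    intent = parse_result.get("intent") or {}
    name = intent.get("name")
    confidence = intent.get("confidence") or 0.0
    if name is None or confidence < threshold:
        return FALLBACK_INTENT
    return name


if __name__ == "__main__":
    result = {"intent": {"name": None, "confidence": 0.0},
              "entities": [], "intent_ranking": [], "text": "???"}
    print(resolve_intent(result))
```

Note that a check like this only helps when the classifier already returns None or a low confidence for the junk input. It does not explain or fix a 0.7 - 0.8 confidence on “???”.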
akelad
(Akela Drissner)
August 24, 2018, 12:42pm
11
Hmm, I’ve never experienced this before; I always get the None intent for random words. @Ghostvv any ideas what’s happening?
Ghostvv
(Vladimir Vlasov)
August 24, 2018, 12:58pm
12
Are you sure you don’t have at least one in-vocab word, like “the” for example?
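One rough way to check this yourself is to compare a test message against the words in your training examples. The sketch below is an approximation under stated assumptions: it uses a plain `\w+` regex for tokens, which is not necessarily how the Rasa NLU featurizer tokenizes, and the training sentences in the usage example are made up.

```python
import re
from typing import Iterable, List, Set

# Assumption: simple word tokens; the real featurizer may tokenize differently.
TOKEN_PATTERN = re.compile(r"\w+")


def build_vocabulary(examples: Iterable[str]) -> Set[str]:
    """Collect the lowercased word tokens seen in the training examples."""
    vocabulary: Set[str] = set()
    for text in examples:
        vocabulary.update(TOKEN_PATTERN.findall(text.lower()))
    return vocabulary


def in_vocab_tokens(message: str, vocabulary: Set[str]) -> List[str]:
    """Return the tokens of the message that also occur in the vocabulary."""
    return [t for t in TOKEN_PATTERN.findall(message.lower())
            if t in vocabulary]


if __name__ == "__main__":
    # Hypothetical training sentences, for illustration only.
    vocab = build_vocabulary(["What is the weather today.",
                              "Book the flight."])
    for message in ["???", "qwerty", "the qwerty"]:
        print(message, in_vocab_tokens(message, vocab))
```

If the list comes back empty for a message, that message has no overlap with the training words under this simplified tokenization.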
Yes. The words “–”, “-”, “<” and “@” on their own are not part of the training data; they may appear in training statements only in combination with other words or characters.
All intents have training statements ending with the “.” character. If I type “.”, the bot matches it to an intent.
Ghostvv
(Vladimir Vlasov)
August 24, 2018, 1:28pm
15
Sorry, it is hard to understand this way. Could you please put together an example script so we can reproduce this issue?
Hi @Ghostvv, I created a sample project with 2 intents. Please find code links below. I tested by passing “???”, “qwerty” and “?”, and it detects the wrong intent with 0.7 - 0.8 confidence. I tested with Rasa NLU running as a server.
Thanks.
Ghostvv
(Vladimir Vlasov)
August 28, 2018, 9:05am
17
What version of rasa_nlu are you using? I tried with master, and the classifier returns None.
I am using version 0.13.1 of Rasa NLU. Is master stable to use? My project will soon go into production.
Ghostvv
(Vladimir Vlasov)
August 28, 2018, 9:53am
20
I just tried in a fresh virtual environment with rasa_nlu version 0.13.1, and I got:
INFO:tensorflow:Restoring parameters from projects/default/tf_model/intent_classifier_tensorflow_embedding.ckpt
{'intent': {'name': None, 'confidence': 0.0}, 'entities': [], 'intent_ranking': [], 'text': '???'}
{'intent': {'name': None, 'confidence': 0.0}, 'entities': [], 'intent_ranking': [], 'text': 'qwerty'}
Are you sure you are using 0.13.1? Could you please try creating a new virtual environment with 0.13.1?
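To confirm which version the running interpreter actually sees, something like the sketch below may help. It assumes Python 3.8 or newer, where `importlib.metadata` is in the standard library; on older interpreters, `pip show rasa_nlu` inside the activated virtual environment gives the same information. The helper name `installed_version` is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "rasa_nlu") -> Optional[str]:
    """Return the installed version of the package in the current
    environment, or None if the package is not installed there."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version())
```

Run it with the same Python executable that starts the NLU server, so a mismatch between environments shows up.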