Unable to set threshold when using model_confidence: linear_norm instead of softmax

I am using Rasa 2.5.0 on Windows 10 and observe that, when using `model_confidence: linear_norm`, the range of confidences obtained for intent predictions shrinks as the number of intents increases. As a consequence, it becomes impossible to set a useful fallback threshold. With `softmax`, on the other hand, the NLU model seems to work just fine.
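To illustrate what I mean, here is a rough numerical sketch. This is not Rasa's exact implementation; it just assumes `softmax` exponentiates similarity scores while `linear_norm` normalizes non-negative similarities by their sum. Even with one clearly winning intent, the linearly normalized top confidence collapses toward zero as the number of intents grows:

```python
import numpy as np

# Illustrative sketch only, NOT Rasa's internal code: approximate the two
# confidence variants applied to raw similarity scores.

def softmax_conf(sims):
    e = np.exp(sims - sims.max())  # subtract max for numerical stability
    return e / e.sum()

def linear_norm_conf(sims):
    # assumption: clip negatives, then normalize scores to sum to 1
    s = np.clip(sims, 0.0, None)
    return s / s.sum()

for n_intents in (10, 50, 150):
    # one clearly winning intent (similarity 8) among near-ties (1)
    sims = np.full(n_intents, 1.0)
    sims[0] = 8.0
    print(n_intents,
          round(float(softmax_conf(sims)[0]), 3),
          round(float(linear_norm_conf(sims)[0]), 3))
```

With 150 intents the softmax top confidence stays high while the linearly normalized one drops well below any sensible fallback threshold, which matches the histograms below.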

My actual project contains 150 intents, and I obtain the following confidence histograms using linear_norm and softmax.

As I cannot share the data for this model, I created a synthetic analogue which illustrates the issue.

The data to reproduce:

config.yml (823 Bytes)

domain.yml (1.0 KB)

nlu.yml (11.1 KB)

Confusion matrix obtained with linear_norm

Confusion matrix obtained with softmax

Config:

language: en

# Rasa NLU
pipeline:
- name: SpacyNLP
  model: "en_core_web_lg"
  case_sensitive: false
- name: SpacyTokenizer
- name: SpacyFeaturizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
  analyzer: "word"
- name: CountVectorsFeaturizer
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  loss_type: cross_entropy
  model_confidence: linear_norm # softmax
  constrain_similarities: true
  epochs: 200
  intent_classification: true
  entity_recognition: false
  batch_strategy: balanced
- name: EntitySynonymMapper
# - name: FallbackClassifier
  # threshold: 0.90
  # ambiguity_threshold: 0.1

# Rasa Core
policies:
- name: MemoizationPolicy
- name: TEDPolicy
  max_history: 5
  epochs: 200
- name: RulePolicy
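For completeness, this is how I would enable the fallback once the confidence distribution makes a threshold usable, using the values commented out in the pipeline above. `utter_please_rephrase` is just a placeholder response name:

```yaml
# in config.yml, replacing the commented-out lines
- name: FallbackClassifier
  threshold: 0.90
  ambiguity_threshold: 0.1

# in rules.yml
rules:
- rule: Ask the user to rephrase on low NLU confidence
  steps:
  - intent: nlu_fallback
  - action: utter_please_rephrase
```

With linear_norm, though, no single `threshold` value separates correct predictions from fallback candidates, which is the core of my problem.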

Thanks for any tips on this problem.


Hi @gdl1, thanks for sharing your results. `linear_norm` and `softmax` are two different variants you can use for model confidences. While we found `linear_norm` to be effective on some assistants, for example rasa-demo, it's good to know that `softmax` outperforms `linear_norm` on some others.

We’ll be shipping a new loss function very soon which will use cosine similarities as model confidences underneath. If you are already curious, you can give the working branch of the linked PR a try. :slight_smile:


In your real dataset, how many examples do you have on average per intent? In the synthetic dataset that number is low; is that reflective of the number of examples in your real dataset as well? I can imagine linear_norm lagging behind softmax in low-data conditions.

Hi @dakshvar22, thanks for your reply. I'm definitely curious to see what you are preparing; I had tried the cosine-similarity model that was temporarily available a few revisions back. As for my ‘real’ dataset, here is a snapshot of the sample distribution:

What do you consider a low number of examples per intent?

Thanks

Hi @dakshvar22, I wanted to follow up on this thread and see if you could offer some comments on the question of the number of examples per intent. Thanks!