NLU Fallback is taking precedence over other intents irrespective of threshold value and their confidence score

Hi All,

I am seeing unexpected behaviour in Rasa: NLU Fallback is taking precedence over other intents irrespective of the threshold value and their confidence scores.

Please find my config.yml below, along with the rasa shell nlu output.

language: en

pipeline:
  - name: SpacyNLP
    model: "en_core_web_md"
    case_sensitive: False
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 150
    constrain_similarities: True
    model_confidence: linear_norm
    entity_recognition: False
  - name: RegexEntityExtractor
    use_lookup_tables: True
    use_regexes: True
  - name: EntitySynonymMapper
  - name: ResponseSelector
    retrieval_intent: app_cdpr_preference
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cdpr_search_site
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cdpr_show_hide
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cdpr_export
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cdpr_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_360sv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_3dcv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cbpc_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_dgf_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: FallbackClassifier
    threshold: 0.12
    ambiguity_threshold: 0.1

policies:
- name: RulePolicy
  core_fallback_threshold: 0.12
  core_fallback_action_name: "action_default_fallback"
  enable_fallback_prediction: True
- max_history: 6
  name: AugmentedMemoizationPolicy
- name: TEDPolicy
  max_history: 8
  epochs: 100
  constrain_similarities: true
  model_confidence: linear_norm

RASA shell nlu output

what is filter
{
  "text": "what is filter",
  "intent": {
    "name": "nlu_fallback",
    "confidence": 0.12
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "nlu_fallback",
      "confidence": 0.12
    },
    {
      "id": 3483215150281270894,
      "name": "app_cdpr_filter_info",
      "confidence": 0.20301519334316254
    }
what is cdpr
{
  "text": "what is cdpr",
  "intent": {
    "name": "nlu_fallback",
    "confidence": 0.12
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "nlu_fallback",
      "confidence": 0.12
    },
    {
      "id": 8082360785938773235,
      "name": "app_cdpr_page",
      "confidence": 0.1920752078294754
    }
what is preference
{
  "text": "what is preference",
  "intent": {
    "id": 6189044326756256420,
    "name": "app_cdpr_preference",
    "confidence": 1.0
  },
  "entities": [],
  "intent_ranking": [
    {
      "id": 6189044326756256420,
      "name": "app_cdpr_preference",
      "confidence": 1.0
    },
    {
      "id": -3499082440311295765,
      "name": "out_of_scope",
      "confidence": 0.0
    },
    {
      "id": -8417286893773013388,
      "name": "app_360sv_info",
      "confidence": 0.0
    }

As can be seen in the rasa shell nlu output above:

  • For the text “what is filter”, nlu_fallback (confidence 0.12) takes precedence over app_cdpr_filter_info (confidence 0.20), even though its confidence is lower.

  • For the text “what is cdpr”, nlu_fallback (confidence 0.12) takes precedence over app_cdpr_page (confidence 0.19), even though its confidence is lower.

  • For the text “what is preference”, app_cdpr_preference takes precedence over nlu_fallback, which is fine and expected.

Can someone please help me figure out what I am doing wrong?

Thanks in Advance :slightly_smiling_face:
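
One possible reading of the first two outputs above: the FallbackClassifier is documented as predicting nlu_fallback not only when the top intent's confidence is below threshold, but also when the two best intents are closer together than ambiguity_threshold. The reported nlu_fallback confidence of 0.12 also matches the configured threshold exactly, which suggests it is not a model score competing with the other intents. The sketch below is a simplified model of that rule, not Rasa's actual code, and the second-ranked confidence of 0.15 is hypothetical because the pasted ranking is cut off.

```python
from typing import Dict, List


def should_fallback(
    ranking: List[Dict], threshold: float, ambiguity_threshold: float
) -> bool:
    """Simplified sketch of the documented FallbackClassifier rule.

    `ranking` must be non-empty and sorted by confidence, highest first.
    """
    top = ranking[0]["confidence"]
    # Rule 1: the best intent is not confident enough.
    if top < threshold:
        return True
    # Rule 2: the two best intents are too close to each other.
    if len(ranking) >= 2:
        second = ranking[1]["confidence"]
        if top - second < ambiguity_threshold:
            return True
    return False


# Top score taken from the "what is filter" output; the runner-up is HYPOTHETICAL.
filter_ranking = [
    {"name": "app_cdpr_filter_info", "confidence": 0.20301519334316254},
    {"name": "some_other_intent", "confidence": 0.15},
]
# Scores taken from the "what is preference" output.
preference_ranking = [
    {"name": "app_cdpr_preference", "confidence": 1.0},
    {"name": "out_of_scope", "confidence": 0.0},
]

print(should_fallback(filter_ranking, threshold=0.12, ambiguity_threshold=0.1))
print(should_fallback(preference_ranking, threshold=0.12, ambiguity_threshold=0.1))
```

If this reading is right, the first case falls back because of the ambiguity rule even though 0.20 is above the 0.12 threshold, while the second case does not because the gap between 1.0 and 0.0 is large.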

Can someone please help me with this? It’s impacting my release.

Thanks in Advance :slightly_smiling_face:

Hi @nik202,

Can you please help me with this?

@naveensiwas First, increase the threshold as mentioned in the Rasa docs, maybe to 0.40 - 0.50; the same goes for the policies. Further, use only one fallback, i.e. either in the pipeline or in the policies, and retrain the model. Delete all the previously trained models. Do let me know the result, and please update the post with the Rasa version you are using. Good luck!

@nik202 Thank you so much for the response. As you suggested, I have increased the threshold value to 0.50, kept it inside the pipeline, and removed it from the policies. But my intent is still being classified with very low confidence (screenshot below).

[Screenshot: Capture]

I am using Rasa version 2.8.14. Should I try different pipeline components, as suggested here, so that intents get classified with higher confidence?

Well, I am trying out what you have done for retrieval_intent, and you also mentioned the ambiguity threshold. Please do check the pipeline, as your model is confused. @naveensiwas


Hi @nik202,

As of now I am checking NLU performance with cross-validation (rasa test nlu --cross-validation) using this config.yml (given below) to get clarity on the following points:

  • There might be some overlap in the training data, which is causing low intent confidence.
  • I also want to know the intent classification and entity extraction performance.
language: en
pipeline:
  - name: SpacyNLP
    model: "en_core_web_md"
    case_sensitive: False
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    batch_strategy: sequence
    epochs: 25
    ranking_length: 5
    constrain_similarities: True
    model_confidence: linear_norm
    entity_recognition: False
  - name: RegexEntityExtractor
    use_lookup_tables: True
    use_regexes: True
  - name: EntitySynonymMapper
  - name: ResponseSelector
    retrieval_intent: app_cdpr_preference
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cdpr_search_site
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cdpr_show_hide
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cdpr_export
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cdpr_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_360sv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_3dcv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_cbpc_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: ResponseSelector
    retrieval_intent: app_dgf_info
    epochs: 100
    constrain_similarities: True
    model_confidence: linear_norm
  - name: FallbackClassifier
    threshold: 0.50
    ambiguity_threshold: 0.1
policies:
- name: RulePolicy
- name: AugmentedMemoizationPolicy
- name: TEDPolicy
  max_history: 10
  epochs: 20
  batch_size:
  - 32
  - 64
  constrain_similarities: true
  model_confidence: linear_norm
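
For the first point in the list above (possible overlap in the training data), a quick check that does not need a full cross-validation run might look like the sketch below. It assumes the examples have already been loaded into a plain dict of intent name to example list; the sample data is hypothetical and only reuses intent names from this thread, and loading the real NLU YAML is left out.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_shared_examples(data: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Return examples that appear under more than one intent.

    Examples are compared case-insensitively with whitespace collapsed.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    for intent, examples in data.items():
        for example in examples:
            key = " ".join(example.lower().split())
            seen[key].add(intent)
    # Keep only the examples that are listed under two or more intents.
    return {ex: intents for ex, intents in seen.items() if len(intents) > 1}


# HYPOTHETICAL training data, for illustration only.
sample = {
    "app_cdpr_filter_info": ["what is filter", "explain filter"],
    "app_cdpr_page": ["what is cdpr", "What is  filter"],
}

for example, intents in find_shared_examples(sample).items():
    print(example, "->", sorted(intents))
```

This only catches exact duplicates after normalization. Near-duplicates such as “what is filter” versus “what is a filter” would still need the confusion matrix from the cross-validation report.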

After this report, I might test my NLU with multiple config files to find the best pipeline components for my model.

@naveensiwas Makes sense, and as expected. Start with the generic pipeline as mentioned in the Rasa docs, without using spaCy, and then proceed with other pipelines.

Do update me on the progress. Good luck!


The 5-fold cross-validation report for the config.yml from my previous post (unchanged) is here:

Intent Confusion Matrix

Intent Histogram

RegexEntityExtractor Confusion Matrix

Response Selection Confusion Matrix

Response Selection Histogram

@nik202 any suggestions based on this report?

Hi @nik202,

I have changed model_confidence: linear_norm to model_confidence: softmax, and it increased the intent confidence to around 0.8 - 0.9, which I think will help me.
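
A rough way to see why the two settings can report such different confidences for the same model output is sketched below. It assumes linear_norm clips negative similarities to zero and divides by the sum, while softmax exponentiates first; this is a simplified illustration rather than Rasa's implementation, and the similarity scores are hypothetical, not taken from the model in this thread.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Exponentiate (shifted by the max for stability) and normalize."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_norm(scores: List[float]) -> List[float]:
    """Clip negatives to zero, then divide by the sum (assumed behaviour)."""
    clipped = [max(0.0, s) for s in scores]
    total = sum(clipped)
    if total == 0:
        return clipped
    return [c / total for c in clipped]


# HYPOTHETICAL raw similarity scores for four intents.
scores = [6.0, 4.5, 4.0, -1.0]

print("softmax:    ", [round(c, 3) for c in softmax(scores)])
print("linear_norm:", [round(c, 3) for c in linear_norm(scores)])
```

For scores like these, softmax gives the top intent a noticeably larger share than linear_norm, while the winning intent stays the same. A higher number after the switch therefore does not by itself mean the classifier got more accurate, so the FallbackClassifier thresholds may need retuning against the cross-validation results.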

As of now, I am comparing NLU pipelines using 3 config files, which are given below:

Config_1.yml

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/2.x/components/
language: "en"
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    batch_strategy: sequence
    epochs: 25
    ranking_length: 5
    constrain_similarities: True
    model_confidence: softmax
    entity_recognition: False
  - name: RegexEntityExtractor
    use_lookup_tables: True
    use_regexes: True
  - name: EntitySynonymMapper
  - name: ResponseSelector
    retrieval_intent: app_cdpr_preference
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_search_site
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_show_hide
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_export
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_360sv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_3dcv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cbpc_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_dgf_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: FallbackClassifier
    threshold: 0.50
    ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/2.x/policies/
policies:
- name: RulePolicy
- name: AugmentedMemoizationPolicy
- name: TEDPolicy
  max_history: 10
  epochs: 20
  batch_size:
  - 32
  - 64
  constrain_similarities: true
  model_confidence: softmax

Config_2.yml

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/2.x/components/
language: "en"
pipeline:
  - name: SpacyNLP
    model: "en_core_web_md"
    case_sensitive: False
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    batch_strategy: sequence
    epochs: 25
    ranking_length: 5
    constrain_similarities: True
    model_confidence: softmax
    entity_recognition: False
  - name: RegexEntityExtractor
    use_lookup_tables: True
    use_regexes: True
  - name: EntitySynonymMapper
  - name: ResponseSelector
    retrieval_intent: app_cdpr_preference
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_search_site
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_show_hide
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_export
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_360sv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_3dcv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cbpc_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_dgf_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: FallbackClassifier
    threshold: 0.50
    ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/2.x/policies/
policies:
- name: RulePolicy
- name: AugmentedMemoizationPolicy
- name: TEDPolicy
  max_history: 10
  epochs: 20
  batch_size:
  - 32
  - 64
  constrain_similarities: true
  model_confidence: softmax

Config_3.yml

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/2.x/components/
language: "en"
pipeline:
  - name: ConveRTTokenizer
  - name: ConveRTFeaturizer
    model_url: "./convert_model/v1.0/model"
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    batch_strategy: sequence
    epochs: 25
    ranking_length: 5
    constrain_similarities: True
    model_confidence: softmax
    entity_recognition: False
  - name: RegexEntityExtractor
    use_lookup_tables: True
    use_regexes: True
  - name: EntitySynonymMapper
  - name: ResponseSelector
    retrieval_intent: app_cdpr_preference
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_search_site
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_show_hide
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_export
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cdpr_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_360sv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_3dcv_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_cbpc_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: ResponseSelector
    retrieval_intent: app_dgf_info
    epochs: 100
    constrain_similarities: True
    model_confidence: softmax
  - name: FallbackClassifier
    threshold: 0.50
    ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/2.x/policies/
policies:
- name: RulePolicy
- name: AugmentedMemoizationPolicy
- name: TEDPolicy
  max_history: 10
  epochs: 20
  batch_size:
  - 32
  - 64
  constrain_similarities: true
  model_confidence: softmax

When I trigger this command

rasa test nlu --nlu data/ --config config_1.yml config_2.yml config_3.yml

It will test the NLU for 3 runs with different exclusion percentages of the training data, which might take a huge amount of time depending on the size of the training data.

Is there any way to speed up this test?

I am also thinking of adding some NLU tests to my CI/CD pipeline, which will make the job take longer to complete.
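
One way to cut the runtime, sketched below, is to shrink the comparison grid with the --runs and --percentages options of rasa test nlu. As far as I know the defaults are 3 runs and exclusion percentages 0, 25, 50 and 75, i.e. 12 trainings per config, so a single run at 0% exclusion would train each config once. The helper name build_compare_command is made up for this example, and the command is only printed here, not executed.

```python
import shlex
from typing import List, Sequence


def build_compare_command(
    configs: Sequence[str],
    nlu_data: str = "data/",
    runs: int = 1,
    percentages: Sequence[int] = (0,),
) -> List[str]:
    """Build a `rasa test nlu` comparison command with a reduced grid."""
    cmd = ["rasa", "test", "nlu", "--nlu", nlu_data, "--config", *configs]
    # Fewer runs and fewer exclusion percentages mean fewer trainings.
    cmd += ["--runs", str(runs)]
    cmd += ["--percentages", *[str(p) for p in percentages]]
    return cmd


cmd = build_compare_command(["config_1.yml", "config_2.yml", "config_3.yml"])
print(shlex.join(cmd))
# In a CI job the list could be passed to subprocess.run(cmd, check=True).
```

For a CI/CD pipeline it may be enough to run this reduced comparison, or a plain rasa test nlu against a fixed test set, on every commit and keep the full 3-run comparison for occasional manual runs.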