Retrieval intents have confidence > 1

Hi, I’m new to Rasa. I’m trying to use the FallbackClassifier with a threshold of 0.7. However, I noticed that the NLU confidences predicted for my retrieval intents often exceed 1. For example, this is my tracker message:

{'intent': {'id': -3716504755430144068, 'name': 'mon_hoc', 'confidence': 14.042401313781738},
 'entities': [],
 'text': 'dfjffjffffff',
 'message_id': '382aae11db4d4138b71764337ca269d0',
 'metadata': {},
 'intent_ranking': [
   {'id': -3716504755430144068, 'name': 'mon_hoc', 'confidence': 14.042401313781738},
   {'id': -999397113265238847, 'name': 'cuoc_thi', 'confidence': 4.415267467498779},
   {'id': 820028608101012021, 'name': 'emotes', 'confidence': 1.392456769943237},
   {'id': 1440186217318679568, 'name': 'cuu_sinh_vien', 'confidence': 0.9861510992050171},
   {'id': 268768396412061774, 'name': 'diem_ren_luyen', 'confidence': 0.43205428123474104},
   {'id': 7956255788922949851, 'name': 'greet', 'confidence': 0.30536207556724504},
   {'id': 4751543998330244983, 'name': 'ignore', 'confidence': 0.10479982197284601},
   {'id': 7142148979246136277, 'name': 'bye', 'confidence': 0.10226798057556101},
   {'id': 6666707523814881234, 'name': 'send_email', 'confidence': 0.008988171815872002},
   {'id': -4397112386031884609, 'name': 'check_ielts', 'confidence': -0.09685418009757901}],
 'response_selector': {
   'all_retrieval_intents': ['mon_hoc', 'xet_tuyen', 'cuoc_thi', 'ten_chuong_trinh', 'dau_ra', 'hoc_bong', 'thay_co', 'web', 'cong_viec', 'thi_TA_dau_vao', 'tuyen_thang'],
   'default': {
     'response': {'id': 5232405809463388249, 'response_templates': [{'text': 'mon_hoc/gioi_thieu'}], 'confidence': 7.79840087890625, 'intent_response_key': 'mon_hoc/gioi_thieu', 'template_name': 'utter_mon_hoc/gioi_thieu'},
     'ranking': [
       {'id': 5232405809463388249, 'confidence': 7.79840087890625, 'intent_response_key': 'mon_hoc/gioi_thieu'},
       {'id': 7989132832101302198, 'confidence': 1.002089977264404, 'intent_response_key': 'cuoc_thi/thi_bong_da'},
       {'id': 1125150508630676428, 'confidence': 0.9613204598426811, 'intent_response_key': 'web/trang_khoa'},
       {'id': 458494519654130423, 'confidence': 0.8487851619720451, 'intent_response_key': 'mon_hoc/thoi_gian_tot_nghiep'},
       {'id': -983025790035210076, 'confidence': 0.5153569579124451, 'intent_response_key': 'mon_hoc/yeu_cau_nganh'},
       {'id': 9065120942911935007, 'confidence': 0.507363200187683, 'intent_response_key': 'ten_chuong_trinh/cac_chuong_trinh_chuyen_nganh_cua_khoa'},
       {'id': -891814048663172427, 'confidence': 0.494954466819763, 'intent_response_key': 'web/chuc_nang_chung_cua_web'},
       {'id': -5494945261469651363, 'confidence': 0.33982166647911005, 'intent_response_key': 'cong_viec/moi_truong_lam_viec'},
       {'id': -6328451685157185040, 'confidence': 0.31952518224716103, 'intent_response_key': 'tuyen_thang/ielts'},
       {'id': -2769106119634880995, 'confidence': 0.31243392825126604, 'intent_response_key': 'thi_TA_dau_vao/doi_tuong_phai_thi'}]}}}
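
To make the problem easier to spot, here is a minimal Python sketch that lists every entry of intent_ranking whose confidence falls outside [0, 1]. The function name out_of_range_confidences and the sample dict are made up for illustration; the sample is an abbreviated copy of the tracker message above with the 'id' fields dropped, and nothing here is part of the Rasa API.

```python
from typing import Any, Dict, List, Tuple


def out_of_range_confidences(parse_result: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (intent name, confidence) pairs whose confidence is outside [0, 1]."""
    ranking = parse_result.get("intent_ranking", [])
    return [
        (entry["name"], entry["confidence"])
        for entry in ranking
        if not 0.0 <= entry["confidence"] <= 1.0
    ]


# Hypothetical, abbreviated sample: three of the ten ranking entries shown above.
sample = {
    "intent": {"name": "mon_hoc", "confidence": 14.042401313781738},
    "intent_ranking": [
        {"name": "mon_hoc", "confidence": 14.042401313781738},
        {"name": "cuu_sinh_vien", "confidence": 0.9861510992050171},
        {"name": "check_ielts", "confidence": -0.09685418009757901},
    ],
}

for name, confidence in out_of_range_confidences(sample):
    print(f"{name}: {confidence}")
```

With the sample above, this flags mon_hoc and check_ielts, while cuu_sinh_vien is left alone because 0.986 is a valid confidence.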

This is my config:

language: vi

pipeline:
  - name: preprocesser.VietnamesePreprocesser
  - name: WhitespaceTokenizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: bert-base-uncased
    cache_dir: null  
  - name: RegexFeaturizer
  - name: CRFEntityExtractor
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 6
  - name: DIETClassifier
    epochs: 200
    model_confidence: linear_norm
    constrain_similarities: True      
  - name: EntitySynonymMapper
  - name: ResponseSelector    
    epochs: 400
    model_confidence: linear_norm
    constrain_similarities: True            
  - name: FallbackClassifier
    threshold: 0.6
    ambiguity_threshold: 0.1

policies:
  - name: TEDPolicy
    epochs: 100
  - name: AugmentedMemoizationPolicy
    max_history: 3
  - name: RulePolicy
    core_fallback_threshold: 0.6
    core_fallback_action_name: "action_default_fallback"
    enable_fallback_prediction: True

Have I made any mistakes here?

This might be a bug that we had in an earlier version of Rasa. Could you share your version numbers? The easiest way to retrieve them is:

rasa --version

This is unrelated to your bug, but I noticed that your pipeline trains a model for Vietnamese while the LanguageModelFeaturizer you’ve attached loads an English BERT model (bert-base-uncased). Your pipeline will likely run a lot faster (and be more accurate) without it.

In case you’re interested, I am trying to add more tools for non-English languages to Rasa via the NLU-examples repository found here. Vietnamese is one of the languages on my list for this quarter. If you have any requests for tools that should be integrated, I’m all ears. Out of curiosity, what does your VietnamesePreprocesser do? Is it a tokenizer? I was planning on adding the spaCy variant for Vietnamese.

@vapormusic Please make sure you have installed rasa>=2.3.4. Versions 2.3.0 through 2.3.3 had a bug that can cause this. Also, if you trained your assistant with any version from 2.3.0 to 2.3.3, please retrain it with the above config after upgrading to 2.3.4 or later. Thanks!
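
If you want to check this from a script, a rough Python sketch of the version range above could look like the following. The helpers parse_version and is_affected are hypothetical names, and the parsing assumes a plain X.Y.Z version string (pre-release suffixes such as "rc1" are not handled).

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain 'X.Y.Z' version string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return major, minor, patch


def is_affected(version: str) -> bool:
    """True if the version lies in the range 2.3.0 to 2.3.3 named in this thread."""
    return (2, 3, 0) <= parse_version(version) <= (2, 3, 3)


# To check the installed package on Python 3.8+, you could use:
#   from importlib.metadata import version
#   print(is_affected(version("rasa")))
for candidate in ("2.3.2", "2.3.4", "2.4.1"):
    print(candidate, "affected" if is_affected(candidate) else "ok")
```

Remember that a model trained with an affected version still needs to be retrained after the upgrade; checking the installed version alone does not cover that.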

Thanks a lot, I have updated rasa to 2.4.1 and the problem is gone!