Retrieval intent returns low confidence even when testing with a sample from the training data

nguduong · March 29, 2024, 8:47am

Hi all,

I’m using Rasa to build a chatbot that answers customer FAQs, following this guide: Chitchat and FAQs

I’ve trained the model successfully with ~300 FAQs, but when testing the NLU I always get the intent with very low confidence (10% to 20%), even when I try a sample from the training data. Sometimes it also returns the wrong intent for a sample from the training data.

My system:

Rasa Version      :         3.6.19
Minimum Compatible Version: 3.5.0
Rasa SDK Version  :         3.6.2
Python Version    :         3.10.12
Operating System  :         Linux-5.15.133.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python Path       :         /opt/venv/bin/python

Files: config.yml (1.2 KB) domain.yml (1.9 KB) nlu.yml (338.1 KB)

Could you tell me what I need to do to increase the accuracy? Also, my full FAQ set contains ~3000 FAQs. Can Rasa handle that well?

Thank you for your help.

stephens · March 30, 2024, 2:38am

I would read about NLU testing, and intent confusion in particular. There are docs on generating a confusion matrix here.

There’s a blog post on testing here.

nguduong · March 30, 2024, 8:55am

Thank you for your reply.

Here are the files generated after following the docs.

DIETClassifier_report.json (2.0 KB)

DIETClassifier_errors.json (7.6 KB) intent_errors.json (170 Bytes) intent_report.json (580 Bytes) response_selection_errors.json (66.4 KB) response_selection_report.json (55.2 KB)

It seems the model confuses a lot of intents, but I don’t know how to improve the result. Could you please give me some advice?

Thank you for your help.

stephens · March 31, 2024, 2:10am

The response_selection_errors.json file shows the problem. For example, the entry below shows two intents that are easily confused, judging by their names alone. I would expect these two to be confused because they are so similar. You could:

- combine them and answer both questions in one response
- separate them more clearly by making their example utterances more distinct
- switch to a RAG approach like Rasa Pro’s enterprise search.

  {
    "text": "chỉ định da của chị phải căng chỉ và tiêm botox mới cải thiện, làm nhiều dịch vụ như thế mặt chị có đơ không em?",
    "intent_response_key_target": "faq/ask_cang_da_bang_chi_mat_co_bi_do_khong_do_tuoi_de_cang_da_chi_la_bao_nhieu_",
    "intent_response_key_prediction": {
      "name": "faq/ask_cang_da_mat_bang_chi_co_gay_nguy_hiem_khong_",
      "confidence": 0.08167824149131775
    }
  },
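
To see which pairs of responses are confused most often, the error file can be tallied with a short script. This is only a rough sketch: it assumes the file is a JSON list of entries shaped like the one above, and the function names `load_errors` and `confused_pairs` are made up for illustration. The file path is passed on the command line because the location depends on where your test run wrote its output.

```python
import json
import sys
from collections import Counter


def load_errors(path):
    """Read the error report, assumed to be a JSON list of entries."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def confused_pairs(errors):
    """Count how often each (target, predicted) response key pair occurs."""
    pairs = Counter()
    for entry in errors:
        target = entry["intent_response_key_target"]
        predicted = entry["intent_response_key_prediction"]["name"]
        if target != predicted:
            pairs[(target, predicted)] += 1
    return pairs


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an assumption): python confused.py response_selection_errors.json
    counts = confused_pairs(load_errors(sys.argv[1]))
    for (target, predicted), count in counts.most_common(20):
        print(f"{count:4d}  {target}  ->  {predicted}")
```

Pairs near the top of the output are the first candidates for merging or for rewriting their example utterances.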

nguduong · April 1, 2024, 3:48am

Thank you for your advice.

I will rework the NLU data to make it clearer and try again.

As for my second question, my NLU data will contain ~3K FAQs like that in the end. Can Rasa handle that well?

stephens · April 1, 2024, 2:58pm

With that many FAQs, I would use a RAG approach.

nguduong · April 3, 2024, 2:45am

Thank you again.

I will clean up the current data first to see whether the results improve, and then try the RAG approach as well.
