Confidence Mismatch


(Abhishak Varshney) #1

Hi everyone, when I train my model on one system it shows "UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.", but when I train it on another system it does not show this warning. Also, when I checked it by sending some chat messages through a curl command, there is a huge difference in confidence between the two systems. Does anyone know why this happens and how I can fix it?

The commands given on both systems are the same.


(Akela Drissner) #2

Are you sure you’ve trained on the same data? This warning usually means you don’t have enough training data.
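A minimal sketch of where that warning comes from, assuming it originates in scikit-learn's metric code (which Rasa NLU relies on for intent classification); the intent names below are made up. The warning fires whenever some label never appears among the predictions, so with a sparsely represented intent it may show up on one training run and not on another:

from sklearn.metrics import f1_score

# "goodbye" appears in the true labels but is never predicted, e.g. because
# it has too few training examples to be picked in some evaluation fold.
y_true = ["greet", "book_ticket", "goodbye", "greet"]
y_pred = ["greet", "book_ticket", "greet", "greet"]

# Emits: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0
# in labels with no predicted samples.
print(f1_score(y_true, y_pred, average="weighted"))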


(Abhishak Varshney) #3

Yes, I know it means the training data is insufficient, but I trained on the same data, so the second system should also show the same warning. Apart from that, after training, the accuracy of the two models is different.


(Akela Drissner) #4

What kind of confidence difference is there? Maybe post your NLU data here; I’d guess you probably just don’t have enough examples for consistent predictions.


(Abhishak Varshney) #5

I am uploading the number of training examples in each intent. The Rasa NLU documentation mentions that approximately 20 training examples per intent should be sufficient, but I have used many more than that. After training this data on a system with an i3 processor and 8 GB RAM I got 19.xx% confidence, and on a system with an i7 processor, 8 GB RAM, and an Intel Xeon graphics card I got 37.xx% confidence on the same test example.


(Akela Drissner) #6

OK, how many different intents is that? That looks like a very large number; the 20-examples-per-intent recommendation is just a starting point for a smaller number of intents. Also, if there’s overlap in your training data, the tensorflow_embedding pipeline might work better.
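For reference, a rough sketch of what switching to that pipeline looked like with the rasa_nlu Python API of that era (roughly 0.12/0.13, i.e. pre-Rasa-1.0); the training-data path is a placeholder, not the poster's actual file:

from rasa_nlu.config import RasaNLUModelConfig
from rasa_nlu.model import Trainer
from rasa_nlu.training_data import load_data

# Placeholder training-data path; the config dict below is equivalent to a
# config file containing:
#   language: en
#   pipeline: tensorflow_embedding
training_data = load_data("data/nlu_data.json")
nlu_config = RasaNLUModelConfig({"language": "en", "pipeline": "tensorflow_embedding"})

trainer = Trainer(nlu_config)
trainer.train(training_data)
model_directory = trainer.persist("models/")  # directory to point the server at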


(Abhishak Varshney) #7

There are 86 intents in total.


(Akela Drissner) #8

Yeah, I would say you need more examples for this. Also, these confidence values, are you getting them from running the evaluation script, or how?


(Abhishak Varshney) #9

I first ran this rasa_nlu command on both servers:

python -m rasa_nlu.server -c config_spacy.json --path models/ -P 5050

and then I ran the command:

curl -XPOST localhost:5050/parse -d '{"q":"I want to purchase ticket"}'

which gives me a different confidence on each server.
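One way to compare the two machines without the HTTP server in between (a sketch assuming the same-era rasa_nlu Python API; the model directory name is a placeholder) is to load each trained model directly and print the full intent ranking rather than just the top confidence:

from rasa_nlu.model import Interpreter

# Placeholder model directory: point this at the model produced by training
# on each machine, then run the identical script on both.
interpreter = Interpreter.load("models/default/model_20180101-000000")

result = interpreter.parse("I want to purchase ticket")
print(result["intent"])                          # top intent and its confidence
for ranked in result.get("intent_ranking", []):  # full ranking across intents
    print(ranked["name"], ranked["confidence"])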


(Akela Drissner) #10

OK, since you’ve trained this with spaCy, that is to be expected. I assume there’s a lot of overlap in your intents, which spaCy can’t handle very well. I’d suggest switching to the tensorflow_embedding pipeline, and then running the evaluation script on the model to see how well it does.


(Abhishak Varshney) #11

Yeah, thanks @akelad, it helped.