I have an entity that is supposed to be extracted by ner_crf. Earlier I had about 50-60 examples in the lookup table for this entity. Everything seemed to be working as expected.
Now, I have added about 1000 entries to my lookup table. Suddenly, the intent_classifier_sklearn component gives the following warning:
UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
What is happening here? Shouldn’t the intent classifier work independently of the lookup table, which is an entity-extraction construct?
Can you share the data and the logs? I suspect that either one of the intents has too few examples or its examples are too diverse.
Have you tried calculating the confusion matrix using the evaluate module?
While these methods aren’t conclusive, they should give you a rough idea of where the problem lies.
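If you want to see what the evaluate module is reporting, you can build the same kind of matrix directly with scikit-learn. A sketch with hypothetical intents (a row summing to a nonzero count while its column is all zeros is exactly the "no predicted samples" case behind the warning):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical intents; rows are true labels, columns are predictions.
labels = ["greet", "bye", "order"]
y_true = ["greet", "greet", "bye", "bye", "order"]
y_pred = ["bye",   "order", "bye", "bye", "order"]

cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)
# The "greet" row has 2 examples but the "greet" column is all zeros:
# every "greet" example was misclassified, so its F-score is undefined.
```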
The ill-defined F-score warning comes from a lack of training examples for a particular intent or entity at evaluation time. When you use lookup tables, ner_crf turns them into a pattern feature, which still has to be learned from annotated training data. Since we don’t put all the lookup values into the training set, the evaluation can end up with labels that are never predicted, and that triggers this warning.
Isn’t the very purpose of a lookup table that we shouldn’t have to include all the values in the training set? Secondly, I know it is just a warning, but it looks like it might affect predictions. Is there a guideline on how the training data should look to avoid it?
I’m sorry, I am a bit confused. Here’s what I understand:
Lookup tables use the pattern feature from ner_crf. So, for entity extraction to work properly, we need enough (not all, but enough) of the lookup values annotated in the training data. Essentially, if I add more values from the lookup table to my training examples, performance should improve.
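That understanding matches how the feature works. Roughly, the lookup table is compiled into one big regex, and each token only gets a binary "matched a lookup entry" flag; the CRF still has to learn from annotated examples that this flag signals your entity. A simplified sketch (names here are illustrative, not Rasa’s internal API):

```python
import re

# Hypothetical lookup entries for a "city" entity.
lookup_entries = ["new york", "san francisco", "boston"]

# Longest-first alternation, as a regex-based lookup feature might build it.
pattern = re.compile(
    "|".join(re.escape(e)
             for e in sorted(lookup_entries, key=len, reverse=True))
)

def lookup_feature(text):
    # The CRF only sees this boolean flag per match; without enough
    # annotated examples it cannot learn that the flag means "city".
    return bool(pattern.search(text.lower()))

print(lookup_feature("book a flight to Boston"))   # matches lookup
print(lookup_feature("book a flight to Chicago"))  # not in lookup
```

So adding more annotated examples that contain lookup values gives the CRF evidence to associate the pattern-match flag with the entity label.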