Adding a Larger Lookup Table Causes Ill-Defined F-Scores

I have an entity that is supposed to be extracted by ner_crf. Earlier I had about 50-60 examples in the lookup table for this entity. Everything seemed to be working as expected.

Now I have added about 1000 entries to my lookup table. Suddenly, the intent_classifier_sklearn component gives the following warning: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.

What is happening here? Shouldn’t the intent classifier work independently of the lookup table, which is an entity extraction construct?

I always have this warning during the training. Also can’t figure it out…

Can you share the data and the logs? I suspect that one of the intents either has too little information or is too diverse. Have you tried calculating the confusion matrix using the evaluate module?

While these methods aren’t concrete, they should give you a rough idea of where the problem lies.
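As a rough illustration of what a confusion matrix can tell you, here is a minimal sketch in plain Python. It does not use the Rasa evaluate module, and the intent names and predictions are made up for the example. The idea is to count (true, predicted) pairs and to list any intent that is never predicted at all.

```python
from collections import Counter


def confusion_counts(true_intents, predicted_intents):
    """Count (true, predicted) pairs; pairs with differing labels are confusions."""
    return Counter(zip(true_intents, predicted_intents))


# Hypothetical labels, for illustration only.
true_intents = ["greet", "greet", "book", "book", "cancel"]
predicted_intents = ["greet", "greet", "book", "greet", "book"]

matrix = confusion_counts(true_intents, predicted_intents)

# Intents that occur in the data but are never predicted.
never_predicted = sorted(set(true_intents) - set(predicted_intents))

for (true_label, predicted_label), count in sorted(matrix.items()):
    print(f"true={true_label:<7} predicted={predicted_label:<7} count={count}")
print("never predicted:", never_predicted)
```

An intent that shows up under "never predicted" is the kind of label the warning is talking about.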

Sorry, it is a business project, so I can’t share any data.

The ill-defined F-score comes from not having enough training examples to evaluate a certain intent or entity. When you use lookup tables, they feed one of the features of ner_crf, which is then trained and evaluated against the training set. However, we don’t put all the examples in the training set, and that could be the reason for this warning.
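To make the warning text concrete, here is a minimal sketch in plain Python of a per-label F-score. It is not scikit-learn's implementation, and the labels are invented. It only shows why a label with no predicted samples has an undefined precision (a division by zero), which is then set to 0.0.

```python
def f1_per_label(y_true, y_pred):
    """Per-label F1, falling back to 0.0 wherever the score is undefined."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = sum(t == label for t in y_true)
        # precision = tp / predicted is undefined when the label is never predicted
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Hypothetical evaluation fold: "cancel" occurs once but is never predicted.
y_true = ["greet", "greet", "book", "cancel"]
y_pred = ["greet", "greet", "book", "book"]

scores = f1_per_label(y_true, y_pred)
print(scores)
```

Here "cancel" ends up with a score of 0.0 because the classifier never predicts it, which is the situation the warning describes.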

Isn’t the very purpose of a lookup table that we shouldn’t have to include all the examples in the training set? Secondly, I know it is just a warning, but it looks like it might affect predictions. Is there a guideline on how the training data should look to avoid it?

Indeed, but the lookup table uses one of the features of ner_crf: the pattern feature.

Take a look at this blog post: Entity extraction with the new lookup table feature in Rasa NLU (The Rasa Blog)

However, because the training set is still so small, you’d likely need a few hundred more examples to push this score to above 80% in practice.

That is one of the sentences from the blog.

Lookup tables are a means of improving entity extraction in ner_crf; they do not work like a phrase matcher. How much they help depends on the entity itself.
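Here is a minimal sketch of the difference, assuming the lookup entries are compiled into one regular expression and each token simply gets a "matched" flag, which is my reading of the pattern feature. The function names and the sample lookup table are hypothetical; this is not Rasa's actual code.

```python
import re


def build_lookup_pattern(entries):
    """Compile lookup entries into one case-insensitive regex with word boundaries."""
    # Longest entries first so multi-word entries win over shorter ones.
    escaped = [re.escape(e) for e in sorted(entries, key=len, reverse=True)]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def pattern_features(text, pattern):
    """Return (token, matched) pairs; matched is True inside a lookup hit."""
    spans = [m.span() for m in pattern.finditer(text)]
    features = []
    for tok in re.finditer(r"\S+", text):
        hit = any(s <= tok.start() and tok.end() <= e for s, e in spans)
        features.append((tok.group(), hit))
    return features


# Hypothetical lookup table and sentence.
lookup = ["new york", "paris", "york"]
pattern = build_lookup_pattern(lookup)
features = pattern_features("fly me to New York tomorrow", pattern)
print(features)
```

The flag is only an input to the CRF. The model still has to learn from annotated training examples that a matched token tends to be the entity, so a lookup table does not replace training data.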

I’m sorry, I am a bit confused. Here’s what I understand:

Lookup tables use the pattern feature from ner_crf. So, for entity extraction to work properly, we need enough (not all, but enough) data in the intent itself. So, essentially, if I add more values from the lookup table to my training samples, performance should improve.

Is that correct?