No_entity prediction

Dear all,

I have a question about no_entity predictions, please. In the attached DIETClassifier confusion matrix, why is the number of predicted no_entity labels so high? It shows 740 samples predicted as no_entity, but I don't have that many samples in my nlu.yml. Most of my utterances contain entities, and only about 100 utterances have no entity at all. Any ideas, please?

Best regards,

Hi @ali_ch, entities are evaluated per token, not per utterance, so the high no_entity count comes from the number of words that are not labelled with an entity, not from the number of utterances without entities. You can see this in the code here: the confusion matrix is computed in the _calculate_report function, which takes a list of merged predictions and targets as input. In contrast, intent evaluation works per utterance, so the numbers there should be much lower.
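
To make the difference concrete, here is a minimal sketch in plain Python. It is not Rasa's actual evaluation code: the toy utterances, the helper names `count_token_labels` and `count_utterances_without_entities`, and the whitespace-style tokenization are all assumptions made for illustration. It only shows how flattening labels over tokens inflates the no_entity count compared with counting utterances.

```python
from collections import Counter

# Hypothetical toy data: each utterance is a list of (token, entity_label) pairs.
# Tokens that are not part of any entity carry the label "no_entity".
utterances = [
    [("book", "no_entity"), ("a", "no_entity"), ("flight", "no_entity"),
     ("to", "no_entity"), ("Paris", "city")],
    [("I", "no_entity"), ("live", "no_entity"), ("in", "no_entity"),
     ("Berlin", "city")],
    [("hello", "no_entity"), ("there", "no_entity")],
]


def count_token_labels(utterances):
    """Flatten all utterances and count entity labels token by token."""
    return Counter(label for utterance in utterances for _, label in utterance)


def count_utterances_without_entities(utterances):
    """Count utterances in which every token is labelled no_entity."""
    return sum(
        all(label == "no_entity" for _, label in utterance)
        for utterance in utterances
    )


token_counts = count_token_labels(utterances)
n_without_entities = count_utterances_without_entities(utterances)

print("no_entity tokens:", token_counts["no_entity"])
print("utterances without entities:", n_without_entities)
```

In this toy set only one utterance has no entity, yet nine tokens are labelled no_entity, because the non-entity words in the entity-bearing utterances are counted as well. The same effect would explain a token-level count like 740 alongside roughly 100 entity-free utterances.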

Okay, thank you for your time!