NLU Test Set

serdar · April 15, 2019, 2:56pm

Hello,

Should we mark entities in the test set too, the same way as in the training set?

Regards

PatrickDS · April 15, 2019, 5:37pm

Ideally you create only one data set and do a train-test split before training your NLU model. From my personal experience, some examples should always be in your training set, and you should check that they are being classified properly (a 0.1% classification error rate sounds perfect, but not if the sentence “yes” is not classified as the intent “affirm” or whatever you have for that purpose!). So you have to take some care in managing your data, but at the beginning this is usually not a problem.
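A minimal sketch of that idea in plain Python, assuming the examples are available as (text, intent) pairs. The function name `split_with_pinned`, the `pinned_texts` parameter, and the sample sentences are made up for illustration and are not part of Rasa:

```python
import random


def split_with_pinned(examples, pinned_texts, test_fraction=0.2, seed=0):
    """Split (text, intent) pairs, keeping pinned texts in the training set.

    Hypothetical helper for illustration, not a Rasa API.
    """
    pinned = [ex for ex in examples if ex[0] in pinned_texts]
    rest = [ex for ex in examples if ex[0] not in pinned_texts]

    # Shuffle only the non-pinned examples, with a fixed seed so the
    # split is reproducible between runs.
    rng = random.Random(seed)
    rng.shuffle(rest)

    n_test = int(len(rest) * test_fraction)
    test = rest[:n_test]
    train = pinned + rest[n_test:]
    return train, test


# Made-up sample data.
examples = [
    ("yes", "affirm"),
    ("yep", "affirm"),
    ("sure thing", "affirm"),
    ("no", "deny"),
    ("nope", "deny"),
    ("not really", "deny"),
]

train, test = split_with_pinned(examples, {"yes", "no"}, test_fraction=0.5)
```

Here “yes” and “no” always end up in the training set, and only the remaining sentences are candidates for the test set.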

When performing evaluation, the only thing that is checked, as far as I know, is whether the intents are classified properly. I don’t remember seeing any evaluation of entity extraction. You’re right to suggest that it should be added as a feature; I never even worried about that myself! In practice, though, entity extraction has been extremely accurate for me, and I never had problems with it.

serdar · April 15, 2019, 5:51pm

Thank you very much, Patrick, for the clarification. I have a separate test file. I was confused about whether I should mark entities in it like in the training file or not. I tried both ways, and when I evaluated the intents with marked entities and without marked entities, I got the same result. To my understanding, marking the entities does not affect the intent evaluation result.

regards
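One plausible reason for the identical intent results is that the entity markup is removed before the sentence reaches the intent classifier, so the text is the same either way. A rough sketch, assuming the test file uses the Markdown-style `[text](entity)` annotation; the thread does not say which format was used, and `strip_entities` and the sample sentence are hypothetical:

```python
import re

# Matches "[some text](entity_name)" and captures the text in the brackets.
# Assumes the Markdown-style annotation format; adjust for other formats.
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\([^)]+\)")


def strip_entities(line):
    """Return the sentence with entity annotations replaced by their text."""
    return ENTITY_PATTERN.sub(r"\1", line)


# Made-up sample sentence.
annotated = "book a table in [Berlin](city) for [two](people)"
plain = strip_entities(annotated)
```

Under that assumption, the annotated and unannotated versions of a sentence give the same plain text, so the markup would only matter for evaluating entity extraction, not intents.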
