NLU Test Set


Should we mark entities in the test set the same way as in the training set?


Ideally you create a single data set and do a train-test split before training your NLU model. In my experience, some examples should always be in your training set, and you should check that they are classified properly (a 0.1% classification error rate sounds perfect, but not if the sentence “yes” is not classified as the intent “affirm” or whatever you use for that purpose!). So managing your data takes some care, but at the beginning this is usually not a problem.
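A minimal sketch of that idea in plain Python, not tied to any particular NLU library. The example layout (dicts with "text" and "intent" keys), the function names, and the `predict_intent` callable are all hypothetical stand-ins for whatever your own data and model look like.

```python
import random


def split_with_pinned(examples, pinned_texts, test_fraction=0.2, seed=0):
    """Split examples into (train, test), forcing pinned texts into train.

    `examples` is assumed to be a list of dicts with "text" and "intent"
    keys; `pinned_texts` is a set of texts that must never be held out.
    """
    pinned = [ex for ex in examples if ex["text"] in pinned_texts]
    rest = [ex for ex in examples if ex["text"] not in pinned_texts]

    # Shuffle only the non-pinned examples, with a fixed seed so the
    # split is reproducible between runs.
    rng = random.Random(seed)
    rng.shuffle(rest)

    n_test = int(len(rest) * test_fraction)
    test = rest[:n_test]
    train = pinned + rest[n_test:]
    return train, test


def check_pinned(predict_intent, pinned_expected):
    """Return (text, expected, predicted) for every pinned example
    that the model classifies incorrectly.

    `predict_intent` is a hypothetical callable: text -> intent name.
    """
    failures = []
    for text, expected in pinned_expected.items():
        predicted = predict_intent(text)
        if predicted != expected:
            failures.append((text, expected, predicted))
    return failures
```

After training you could call something like `check_pinned(my_model_predict, {"yes": "affirm"})` and treat a non-empty result as a failure, regardless of how good the overall error rate looks.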

When performing evaluation, the only thing that is checked is whether the intents are classified properly. I don’t remember seeing any evaluation of entity classification. You’re right to suggest that it should be added as a feature; I never even worried about that myself! In practice, though, entity classification is extremely accurate, and I never had problems with it.
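To illustrate why entity annotations in the test file make no difference to an intent-only evaluation, here is a small sketch. It is not the library's actual evaluation code: the example layout (dicts with "text", "intent", and an optional "entities" list) and the `predict_intent` callable are assumptions made for the illustration.

```python
def intent_accuracy(test_examples, predict_intent):
    """Fraction of test examples whose predicted intent matches the label.

    Only "text" and "intent" are read; an "entities" key, if present,
    is never looked at, so it cannot change the score.
    """
    if not test_examples:
        raise ValueError("test set is empty")
    correct = sum(
        1 for ex in test_examples if predict_intent(ex["text"]) == ex["intent"]
    )
    return correct / len(test_examples)


# The same test sentence, once with and once without entity annotations
# (the annotation format here is assumed for illustration only).
marked = {
    "text": "book a table in Berlin",
    "intent": "inform",
    "entities": [{"start": 16, "end": 22, "value": "Berlin", "entity": "city"}],
}
unmarked = {"text": "book a table in Berlin", "intent": "inform"}
```

Because the text and the intent label are identical in `marked` and `unmarked`, `intent_accuracy` returns the same value for both versions of the test set.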

Thank you very much, Patrick, for the clarification. I have a separate test file, and I was confused about whether I should mark entities in it as in the training file. I tried both ways: when I evaluated the intents with marked entities and without marked entities, I got the same result. As far as I understand, marking them does not affect the result.