Rasa test nlu: test if all entities are labeled correctly within a sentence

Hello everyone,

I want to try out different approaches to labeling entities within the same intent. For this I want to create 2 nlu files, one per approach, train two models and then compare their performance. The first approach uses a lot of entities, the second one tries to summarize the needed data into few entities.

I saw that with rasa test nlu you can get some nice confusion matrices to analyse which entities are confused with each other. However, I need to know if the whole sentence with multiple entities is labeled correctly. If so, this sentence should be counted as successful, and in the end I have a result that may look like this: “597 out of 1000 sentences are labeled with the correct entities.”

Is there a way to do this in Rasa? Or is there an easy way to accomplish this result? Thank you for any suggestions! :slight_smile:

Hi there!

Here are some suggestions:

  • You could “re-process” the .json files generated by rasa test nlu yourself in a custom script (by combining DIETClassifier_report.json & DIETClassifier_errors.json you could extract some metrics). However, in your case you have multiple entities in 1 sentence, and this info is unfortunately lost in the json files.
  • A second option would be to write a “test script” yourself, where you load your trained model and parse all your test utterances. You will get very detailed results per sentence (a ranking of all predicted intents with their confidences, but also the predicted entities, if any). If you do some post-processing on these results, I think you can get the insights you are referring to. If you want an example of this, Rasa has a very good blog post about how to write some custom Python code to evaluate models → Evaluating Rasa NLU Models in Jupyter
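For the post-processing step of that second option, a minimal sketch might look like the following. It assumes you have already parsed each test utterance with your loaded model and collected the gold and predicted entities per sentence; the helper names (`entity_set`, `sentence_accuracy`) and the dict keys are illustrative, though `start`, `end`, `entity`, and `value` match the fields Rasa typically returns for extracted entities:

```python
# Sketch: count sentences where ALL entities were predicted correctly.
# Assumes gold/predicted entities are dicts with "start", "end",
# "entity", "value" keys, collected per sentence beforehand
# (e.g. by parsing each test utterance with a loaded model).

def entity_set(entities):
    """Normalize a list of entity dicts into a comparable set of tuples."""
    return {(e["start"], e["end"], e["entity"], e["value"]) for e in entities}

def sentence_accuracy(examples):
    """examples: list of (gold_entities, predicted_entities) per sentence.

    A sentence only counts as correct if every entity matches exactly
    (same span, same label, same value).
    """
    correct = sum(
        1 for gold, pred in examples
        if entity_set(gold) == entity_set(pred)
    )
    return correct, len(examples)

# Toy example: first sentence fully correct, second has a wrong label.
examples = [
    ([{"start": 0, "end": 5, "entity": "city", "value": "Paris"}],
     [{"start": 0, "end": 5, "entity": "city", "value": "Paris"}]),
    ([{"start": 8, "end": 14, "entity": "city", "value": "Berlin"}],
     [{"start": 8, "end": 14, "entity": "country", "value": "Berlin"}]),
]
correct, total = sentence_accuracy(examples)
print(f"{correct} out of {total} sentences are labeled with the correct entities.")
# → 1 out of 2 sentences are labeled with the correct entities.
```

Using exact set equality is a deliberately strict choice: a sentence with one missing, extra, or mislabeled entity counts as wrong, which matches the all-or-nothing metric you described.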

Good luck! :slight_smile:

Thank you! :slight_smile: I was already thinking about writing my own test script. That Jupyter link is really helpful, my initial approach would have been quite laborious.
