How to evaluate a dialogue model

When I run rasa_core.evaluate on my dialogue model, I get two kinds of systematic errors: cases where action_listen is erroneously taken to be the true label, and cases where action_listen is erroneously predicted. Why does this happen?
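
To make the two error types concrete, here is a rough sketch of how they could be tallied from paired true/predicted action labels. It is plain Python, not rasa_core code: the function name count_listen_errors and the sample labels are hypothetical and made up for illustration.

```python
from collections import Counter


def count_listen_errors(true_actions, predicted_actions, listen="action_listen"):
    """Tally the two kinds of action_listen confusion in paired label lists."""
    if len(true_actions) != len(predicted_actions):
        raise ValueError("label lists must have the same length")
    errors = Counter()
    for true, pred in zip(true_actions, predicted_actions):
        if true == pred:
            continue  # correct prediction, nothing to count
        if true == listen:
            # action_listen is the true label, but another action was predicted.
            errors["listen_expected"] += 1
        elif pred == listen:
            # action_listen was predicted, but the true label is another action.
            errors["listen_predicted"] += 1
    return errors


# Hypothetical labels, invented for this example only.
true_actions = ["utter_greet", "action_listen", "utter_goodbye", "action_listen"]
predicted_actions = ["utter_greet", "utter_goodbye", "action_listen", "action_listen"]
errors = count_listen_errors(true_actions, predicted_actions)
print(dict(errors))
```

Mismatches that do not involve action_listen at all are deliberately ignored here, since the question is only about that one action.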

I was about to suggest that you open an issue about this on GitHub, but I saw that you already did, which is great. We will get back to you on it ASAP :slight_smile:

Hi Juste, I’m glad to hear back from someone at Rasa about this. I look forward to having some feedback about my question. Thanks, Benjamin