The 'testing stories' output

I’m pretty new to all of this and have been building a bot that is working well. I currently test it from the shell, but frankly I’m tired of interacting with it all the time.

I found the "Evaluating Models" page in the docs.

Is there anything that will help explain the output I see when I run:

rasa test --stories test-conversations\stories-set-01.md --e2e

I have no idea what to look at to see if it is working. I was hoping for something like PHPUnit that gives me a reassuring green check or something LOL.

I see “Your model made no errors” but I also see some scores for END-TO-END and some for ACTION level, matrices, tables and averages.

Is there one section in the output that relates to each block in a story? or is this output a general summary of the entire story?

Thanks for any pointers.

The output relates to all the stories you provided for testing and shows average accuracies across them, not one section per story block. Failures can originate from two sources: NLU (intent classification correctness) and Core (action prediction correctness). A useful thing to look at is the results/failed_stories.md file, which explicitly shows the mistakes your bot made.
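For a PHPUnit-style pass/fail signal, one option is a small wrapper that inspects that file after `rasa test` has run. This is only a sketch under assumptions: it assumes the report sits at results/failed_stories.md and that each failed story is written in the Markdown story format with a `## story name` header, so it simply counts those headers. The function names `count_failed_stories` and `check_report` are made up for illustration and are not part of Rasa.

```python
from pathlib import Path


def count_failed_stories(report_text: str) -> int:
    """Count Markdown story headers ("## name") in a failed-stories report.

    Assumption: every failed story starts with its own "## " header line.
    """
    return sum(1 for line in report_text.splitlines() if line.startswith("## "))


def check_report(path: str = "results/failed_stories.md") -> bool:
    """Print a PASS/FAIL line and return True only if no failed stories are listed."""
    report = Path(path)
    if not report.exists():
        # No report usually means the test command has not been run yet.
        print(f"NO REPORT: {path} not found")
        return False
    failed = count_failed_stories(report.read_text(encoding="utf-8"))
    if failed:
        print(f"FAIL: {failed} failed story block(s) listed in {path}")
        return False
    print("PASS: no failed stories listed")
    return True
```

Calling `sys.exit(0 if check_report() else 1)` from a script after the test run would turn this into an exit code that a CI job or shell script can act on.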

Thanks, I’ll take a look at the results directory (I didn’t see it there until you mentioned it). As for a “this is this, that is that” breakdown of the screen dump, I’m guessing it’ll require a bit more understanding of that data.

For the record, I found Botium (https://www.botium.at/) which is pretty cool and does more of what I guess I was hoping for - automated tests. There’s some duplication of content depending on how in depth you want your tests, but I’m working on some tooling to generate those test scripts from the nlu.md file :slight_smile: I’m really hoping to get something running to test and make sure I didn’t break the bot as I add more and more functionality. More of a sanity check than a learning/dialog check.

In a nutshell, you create story scripts much like e2e config above, and it’ll interact with the bot very quickly giving a green or red pass/fail for each one, which is what I’m accustomed to.
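The nlu.md tooling mentioned above could start from something like the sketch below. It assumes the Rasa Markdown training-data layout, where each intent has a `## intent:name` header followed by `- example` lines and entities are annotated as `[text](entity)`. The function name `parse_nlu_markdown` is hypothetical, and turning the result into test scripts for a particular tool is left out because that depends on the tool's own format.

```python
import re
from typing import Dict, List

# Matches entity annotations such as "[Paris](city)" and captures the plain text.
ENTITY_ANNOTATION = re.compile(r"\[([^\]]+)\]\([^)]*\)")


def parse_nlu_markdown(text: str) -> Dict[str, List[str]]:
    """Collect example utterances per intent from Markdown NLU training data."""
    examples: Dict[str, List[str]] = {}
    current = None  # name of the intent section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            header = line[3:].strip()
            # Only "intent:" sections are kept; synonyms, regexes, etc. are skipped.
            if header.startswith("intent:"):
                current = header[len("intent:"):]
                examples.setdefault(current, [])
            else:
                current = None
        elif current is not None and line.startswith("- "):
            # Strip entity markup so the utterance reads as a user would type it.
            examples[current].append(ENTITY_ANNOTATION.sub(r"\1", line[2:].strip()))
    return examples
```

Each (intent, utterance) pair can then be written out as one test case, with the expected bot reply supplied by hand or looked up from the stories.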

I DO like the results directory from the Rasa e2e run. That’s super helpful. I can see using both of these as I move forward. :+1: