The 'testing stories' output

jonathanpwheat · September 16, 2019, 7:55pm

I’m pretty new to all of this and have been building a bot that is working well. I currently test it from the shell, but frankly I’m tired of interacting with it all the time.

I found this in the docs: Evaluating Models.

Is there anything that will help explain the output I see when I run

rasa test --stories test-conversations\stories-set-01.md --e2e

I have no idea what to look at to see if it is working. I was hoping for something like PHPUnit that gives me a reassuring green check or something LOL.

I see “Your model made no errors” but I also see some scores for END-TO-END and some for ACTION level, matrices, tables and averages.

Is there one section in the output that relates to each block in a story? Or is this output a general summary of the entire story?

Thanks for any pointers.

Ghostvv · September 17, 2019, 11:55am

The output relates to all the stories you provided for testing and shows average accuracies. Failures can originate from two sources: NLU (intent classification correctness) and Core (action prediction correctness). A useful thing to look at is the results/failed_stories.md file, which explicitly shows the mistakes your bot made.
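For a PHPUnit-style pass/fail signal, one option is to inspect results/failed_stories.md after the test run. The sketch below is not part of Rasa. It assumes the file uses the Markdown story format, where each story block starts with a `## ` header line, and it treats a file without any such header as a pass. The function names are hypothetical.

```python
from pathlib import Path


def count_failed_stories(markdown_text: str) -> int:
    """Count story blocks, assuming each one starts with a '## ' header line."""
    return sum(1 for line in markdown_text.splitlines() if line.startswith("## "))


def verdict(markdown_text: str) -> str:
    """Reduce the failed-stories report to a single pass/fail string."""
    failed = count_failed_stories(markdown_text)
    return "PASS" if failed == 0 else f"FAIL ({failed} failed)"


def check_results(path: str = "results/failed_stories.md") -> str:
    """Read the report written by the test run and return the verdict."""
    return verdict(Path(path).read_text(encoding="utf-8"))
```

Calling `print(check_results())` after `rasa test` finishes would then print either PASS or FAIL with a count, which can also drive the exit code of a CI step.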

jonathanpwheat · September 17, 2019, 1:55pm

Thanks, I’ll take a look at the results directory (didn’t see it there until you mentioned it). As far as a “this is this, that is that” for the screen dump, I’m guessing it’ll require a bit more understanding of that data.

For the record, I found Botium (https://www.botium.at/) which is pretty cool and does more of what I guess I was hoping for - automated tests. There’s some duplication of content depending on how in depth you want your tests, but I’m working on some tooling to generate those test scripts from the nlu.md file. I’m really hoping to get something running to test and make sure I didn’t break the bot as I add more and more functionality. More of a sanity check than a learning/dialog check.

In a nutshell, you create story scripts much like e2e config above, and it’ll interact with the bot very quickly giving a green or red pass/fail for each one, which is what I’m accustomed to.
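As a starting point for tooling that generates test scripts from nlu.md, the examples first have to be grouped by intent. This is a minimal sketch, assuming the Rasa Markdown training-data layout (`## intent:<name>` headers followed by `- example` lines, with entities annotated as `[text](entity)`). The function name is hypothetical, and writing the actual Botium script files is left out.

```python
import re
from collections import defaultdict

# "## intent:greet" style section headers
INTENT_HEADER = re.compile(r"^##\s*intent:\s*(\S+)")
# "[Berlin](city)" style entity annotations
ENTITY_ANNOTATION = re.compile(r"\[([^\]]+)\]\([^)]+\)")


def parse_nlu_markdown(text: str) -> dict:
    """Map each intent name to its list of plain-text example utterances."""
    examples = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = INTENT_HEADER.match(line)
        if header:
            current = header.group(1)
        elif line.startswith("##"):
            # Any other section (synonym, regex, lookup) is skipped.
            current = None
        elif current and line.startswith("- "):
            # Keep the entity text, drop the annotation markup.
            examples[current].append(ENTITY_ANNOTATION.sub(r"\1", line[2:].strip()))
    return dict(examples)
```

Each intent and its utterances can then be turned into one test case per example, in whatever script format the test runner expects.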

I DO like the results directory from the Rasa e2e run. That’s super helpful. I can see using both of these as I move forward.
