Snapshot-based testing with Rasa?

Hello!

The team I am on is trying to come up with a test strategy for the Rasa bots we’re building, focused more on the flow of stories than on the NLU. What we think we’d like to do is use a CLI bot to run through a story and generate snapshots of the flow (across both rasa-stack and the action-server), and then validate the content of the snapshots against our expectations of the system’s behaviour.

We wouldn’t want to mock the action-server because we’re using form actions that dynamically modify the flow of conversation and want to test as many of these flows as we can.

Given that, we have the following questions:

  1. Is this something Rasa is equipped to do? 1b) If so, how?
  2. Even if it is, is this a good idea for testing flows in the first place?
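
To make the snapshot idea concrete, here is a minimal sketch of what such a test could look like. It is not a Rasa feature: `record_flow`, `check_snapshot` and `fake_bot` are hypothetical names, and the bot is passed in as a plain callable so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical type: any callable that sends one user message and returns the
# bot's replies. Against a live bot this could wrap an HTTP call.
SendFn = Callable[[str], List[Dict]]


def record_flow(send: SendFn, user_messages: List[str]) -> List[Dict]:
    """Run the messages through the bot and collect a transcript."""
    return [{"user": text, "bot": send(text)} for text in user_messages]


def check_snapshot(path: Path, transcript: List[Dict]) -> bool:
    """Write the snapshot on the first run; compare against it afterwards."""
    if not path.exists():
        path.write_text(json.dumps(transcript, indent=2, sort_keys=True))
        return True
    return json.loads(path.read_text()) == transcript


def fake_bot(text: str) -> List[Dict]:
    """Stand-in for the real bot (assumption, for illustration only)."""
    replies = {"hello": "Hey! How are you?", "great": "Great, carry on!"}
    return [{"text": replies.get(text, "Sorry, I didn't get that.")}]


snapshot_file = Path(tempfile.mkdtemp()) / "mood_happy_path.json"
transcript = record_flow(fake_bot, ["hello", "great"])
first_run_ok = check_snapshot(snapshot_file, transcript)   # creates the snapshot
second_run_ok = check_snapshot(snapshot_file, transcript)  # compares against it
```

With a real bot, `send` would talk to the running rasa-stack and action-server instead of `fake_bot`, and any change in the replies would make `check_snapshot` return `False`.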

hi @mikes! welcome to the forum 🙂

does this do what you want? See the “Evaluating Models” page of the docs.

  • if you want to skip NLU evaluation, you can provide NLU input as /intent{entities}
  • actions aren’t executed, so slot-setting and other events have to be written into the stories explicitly
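
As a rough illustration of the first point, the sketch below builds a message in the /intent{entities} form, assuming the entities part is a JSON object. `intent_message` is a hypothetical helper, and the `greet` / `inform` intents and `cuisine` entity are placeholders, not names from this thread.

```python
import json
from typing import Dict, Optional


def intent_message(intent: str, entities: Optional[Dict[str, str]] = None) -> str:
    """Build a message that names the intent directly instead of free text."""
    # Assumption: entities are written as a JSON object right after the intent.
    payload = json.dumps(entities) if entities else ""
    return f"/{intent}{payload}"


greet = intent_message("greet")
inform = intent_message("inform", {"cuisine": "chinese"})
print(greet)   # /greet
print(inform)  # /inform{"cuisine": "chinese"}
```

Feeding messages like these to the bot means the test exercises the dialogue flow without depending on what the NLU model predicts.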

Hi @amn41!

Sort of. My concern is that if actions aren’t executed, then we run into the same constraint as mocking the action-server.

What if we wanted to test the flow with actions executed?

Hi @mikes - we decided against executing actual actions because they can have side effects, which means the success/failure of your tests becomes highly dependent on the outside world.

You could set up integration tests to do this, but you’d have to carefully ensure the environment is declarative and you don’t have e.g. old data in databases.
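
One way to picture the "no old data" requirement is a fixture that gives every test its own database in a known state. This is only a sketch under assumptions: it uses an in-memory SQLite database, and the `bookings` table and `fresh_db` helper are hypothetical stand-ins for whatever your actions really talk to.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def fresh_db():
    """Yield a brand-new in-memory database with a known starting schema."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for state that a custom action writes.
    conn.execute("CREATE TABLE bookings (user_id TEXT, cuisine TEXT)")
    try:
        yield conn
    finally:
        conn.close()


def count_bookings(conn) -> int:
    return conn.execute("SELECT COUNT(*) FROM bookings").fetchone()[0]


# First "test" writes a row, as an executed action might.
with fresh_db() as conn:
    conn.execute("INSERT INTO bookings VALUES ('u1', 'chinese')")
    rows_in_first_test = count_bookings(conn)

# Second "test" starts from a clean state and sees nothing from the first.
with fresh_db() as conn:
    rows_in_second_test = count_bookings(conn)
```

The same idea applies to any external service an action depends on: each test run has to start from a state it sets up itself.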

Hi again, @amn41!

I was doing some proof of concept on getting e2e stories running on the tutorial Rasa moodbot, and it is looking mostly promising for regression testing purposes, though I have a couple of general questions based on things that aren’t clear from the documentation you linked:

  1. If I want to test a more complex flow that uses forms/form actions, how would I write an end-to-end story to run through that, or indicate requisite slot values in said e2e story such that it will run (given that, like you mentioned, actions aren’t executed)?

  2. Similarly, how would I write an end-to-end story to ensure that a flow featuring a custom action will run as expected?

There’s also a separate question our team has come up with around testing policies in isolation and whether rasa test is equipped to do that.

hi @mikes

  1. the least error-prone way is to generate the story using interactive learning - that way you know it’ll match up exactly with what rasa core will see.
  2. you cannot guarantee the action will run, of course (because that will often depend on external services), but you can guarantee that rasa core will handle the output correctly provided it did run. The slot and form events in the stories relay to core what happened.
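
To show roughly what "slot and form events in the stories" look like, the sketch below renders a story in the Markdown story format, as I understand it. The helper functions are hypothetical, and `restaurant_form`, `request_restaurant`, `cuisine` and `utter_slots_values` are placeholder names. The event order is illustrative only; a story generated through interactive learning is the reliable source for the exact sequence.

```python
import json
from typing import List, Optional


def slot_event(name: str, value) -> str:
    """Story line telling core that a slot was set by an action."""
    return "- slot" + json.dumps({name: value})


def form_event(form_name: Optional[str]) -> str:
    """Story line for a form being activated (name) or deactivated (None)."""
    return "- form" + json.dumps({"name": form_name})


def render_story(title: str, steps: List[str]) -> str:
    lines = [f"## {title}"]
    for step in steps:
        # User turns start with "*"; bot actions and events are indented.
        indent = "" if step.startswith("*") else "  "
        lines.append(indent + step)
    return "\n".join(lines)


story = render_story(
    "restaurant form happy path",
    [
        "* request_restaurant",
        "- restaurant_form",
        form_event("restaurant_form"),
        slot_event("cuisine", "chinese"),
        form_event(None),
        "- utter_slots_values",
    ],
)
print(story)
```

The printed story contains lines such as `- slot{"cuisine": "chinese"}` and `- form{"name": null}`, which stand in for what the un-executed actions would have reported back to core.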