Response selector Testing

Hello friends, I am testing my Rasa chatbot and I would like to test not just whether a specific message leads to my response selector, but also whether it leads to the correct internal response selector intent. I cannot find anything about that in the documentation. I know that Rasa does this automatically via rasa test for the predefined examples, but I would like to define my own tests.

If my FAQ data looks like this:


  • text1
  • text2
  • text3


  • text4
  • text5
  • text6


  • text7
  • text8
  • text9

And if I type in “test7b”, I would like to test that it really leads to example3.

My test file looks similar to this:

A basic end-to-end test

  • greetings.greet: hello
    • utter_greet
  • faq: test7b
    • respond_faq

where respond_faq is the overall action for all FAQs. How can I check that it is example3 and not just any FAQ?

Many greetings and thanks, Lukas

(Sorry for my stupid example, I obscured the real texts (obviously))

Not sure if I fully understand the question, but will try my best to answer :slight_smile:

In the latest Rasa Open Source version, rasa test just produces a report after evaluating responses, containing metrics such as accuracy or F-score. To verify whether example3 was predicted correctly or incorrectly, you would need a file that contains the successful predictions or a file with the incorrect predictions. Then you could search for the example in the corresponding file. We currently do not generate those files.

I recently worked on the GitHub issue “Standardize testing output” (#5748 in RasaHQ/rasa) to add more information to the test output. We added the option to write out the incorrect and correct predictions for response selection, and we also generate some plots, such as the confusion matrix. I guess that is exactly what you need. The PR was merged into master. Unfortunately, we don’t have a date yet for when the next version from master will be released.
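
Once such prediction files exist, searching them for a single example could look roughly like the sketch below. This is only an illustration: it assumes the file is a JSON list of records, and the keys `text`, `target` and `prediction` as well as the helper names are hypothetical, so the real output of rasa test may be laid out differently.

```python
import json
from typing import Any, Dict, List


def load_records(path: str) -> List[Dict[str, Any]]:
    """Load a JSON file that is assumed to contain a list of prediction records."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_predictions(records: List[Dict[str, Any]], text: str) -> List[Dict[str, Any]]:
    """Return every record whose (hypothetical) "text" field equals the given message."""
    return [record for record in records if record.get("text") == text]


# In-memory stand-in for the contents of a predictions file (made-up data).
sample_records = [
    {"text": "text7", "target": "faq/example3", "prediction": "faq/example3"},
    {"text": "text4", "target": "faq/example2", "prediction": "faq/example1"},
]

matches = find_predictions(sample_records, "text4")
```

Here `matches` holds the one record for “text4”, where the hypothetical target and prediction differ, which is the kind of entry you would look for in a file of incorrect predictions.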

Does that help?

Hey Tanja, thank you for your answer! Well, you are right, the question was quite hard to understand, I am sorry. I think I misunderstood the rasa test command. My understanding was that I could implement a dictionary like “text a user types in” - “what Rasa should output” to test my response selector manually. This would also include text that was never put into the intent list.

But the test command seems to test only those intents that are already in the official list one has programmed, which is also fine. I just thought that I could implement my own test cases for rasa test.

Great that you started to implement some more functions, I am looking forward to the next release. Many greetings, Lukas

If you want to test your response selector manually, you should be able to do so via rasa shell. Just chat with the assistant and see how it responds.

The response selector only knows about the intents that are in the training data. So if you introduce a completely new intent during testing, it will most likely fail. The test data you provide when executing rasa test can and should be different from the training data. You can add your own test cases there. Ideally, the test data contains examples that are not in the training data.
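
The dictionary idea from earlier in the thread (“text a user types in” mapped to “what Rasa should output”) can be sketched as a small check outside of rasa test. This is a minimal sketch, not a Rasa feature: `check_response_keys` and `fake_predict` are hypothetical names, the keys in the style of `faq/example3` are only an assumed naming for the individual FAQ responses, and `fake_predict` is a stand-in that you would replace with a call into your own trained model.

```python
from typing import Callable, Dict, List, Tuple


def check_response_keys(
    cases: Dict[str, str],
    predict: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Compare predicted response keys with expected ones.

    Returns a list of (text, expected, predicted) for every mismatch,
    so an empty list means all cases passed.
    """
    failures = []
    for text, expected in cases.items():
        predicted = predict(text)
        if predicted != expected:
            failures.append((text, expected, predicted))
    return failures


def fake_predict(text: str) -> str:
    """Stand-in predictor (an assumption): maps message prefixes to made-up keys."""
    lookup = {
        "text1": "faq/example1",
        "text4": "faq/example2",
        "text7": "faq/example3",
    }
    for prefix, key in lookup.items():
        if text.startswith(prefix):
            return key
    return "faq/example1"


# Test messages that are deliberately not identical to the training examples.
cases = {
    "text7b": "faq/example3",
    "text4 please": "faq/example2",
}

failures = check_response_keys(cases, fake_predict)
```

With a real predictor in place of `fake_predict`, a non-empty `failures` list would show which messages were matched to the wrong FAQ, even though the overall faq intent was correct.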

Ah right, I really forgot about that. Thank you very much for reminding me. I will have a look at that. Have a wonderful day!

I am wondering about the same thing. My tests only tell me that the message was classified as the faq intent and that the respond_faq action followed, but not whether it was correctly classified as the specific FAQ question.

I am facing a problem where FAQs are incorrectly classified and my tests still pass.

Is there really no solution for that?
