Hello Friends,
I am testing my Rasa chatbot and I would like to check not only whether a specific message leads to my response selector, but also whether it leads specifically to the correct internal response selector intent. I cannot find anything about that in the documentation. I know that Rasa does this automatically via rasa test for the predefined examples, but I would like to define my own tests.
If my FAQ data looks like this:
```
## intent: faq/example1
- text1
- text2
- text3

## intent: faq/example2
- text4
- text5
- text6

## intent: faq/example3
- text7
- text8
- text9
```
And if I type in "test7b", I would like to test that it really leads to example3.
My test file looks similar to this:
```
## A basic end-to-end test
* greetings.greet: hello
  - utter_greet
* faq: test7b
  - respond_faq
```
where respond_faq is the overall response action for all FAQ intents.
How can I check that it really is example3 and not just any FAQ intent?
Many greetings and thanks,
Lukas
(Sorry for the contrived example; I obscured the real texts, obviously.)
Not sure if I fully understand the question, but I will try my best to answer.
In the latest Rasa Open Source version, rasa test just produces a report after evaluating responses, containing metrics like accuracy or F1 score. To verify whether example3 was predicted correctly or incorrectly, you would need a file that contains the successful predictions or a file with the incorrect predictions; then you could search for the example in the corresponding file. We currently do not generate those files.
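For completeness, this is how you would run that evaluation; a minimal sketch, assuming your labelled examples live in a file called test_data.md (the exact flags may depend on your Rasa version):

```
# Evaluate the NLU pipeline, including the response selector,
# against a file of labelled examples:
rasa test nlu -u test_data.md

# The report with precision, recall and F1 per intent is written
# to the default output folder:
ls results/
```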
I recently worked on Standardize testing output · Issue #5748 · RasaHQ/rasa · GitHub to add more information to the test output. We added the possibility to write out incorrect and correct predictions for response selection, and we also generate some plots, such as the confusion matrix. I guess that is exactly what you need. The PR was merged into master. Unfortunately, we don't have a date yet for when the next version will be released from master.
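Once that is released, the entries in the new files should look roughly like the existing intent_errors.json format; this is just an illustrative sketch, and the exact file and field names may differ slightly in the final version:

```json
[
  {
    "text": "test7b",
    "intent": "faq/example3",
    "intent_prediction": {
      "name": "faq/example1",
      "confidence": 0.54
    }
  }
]
```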
Hey Tanja, thank you for your answer! Well, you are right, the question was quite complicated to understand; I am sorry.
I think I personally misunderstood the rasa test command. My understanding was that I could provide a mapping like
"text a user types in" → "what Rasa should output" to test my response selector manually. This would also include texts which were never put into the training examples.
But the test command seems to only test those examples that are already inside the data one has written. Which is also fine; I just thought that I could implement my own test cases, something like the sketch below.
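In other words, what I had in mind would be a test file of its own; a sketch, where test_nlu.md is just a made-up name and the intents are from my obscured example above:

```
## intent: faq/example3
- test7b

## intent: faq/example1
- some new phrasing that should end up at example1
```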
Great that you started to implement some more functions; I am looking forward to the next release.
Many greetings,
Lukas
If you want to test your response selector manually, you should be able to do so via rasa shell. Just chat with the assistant and see how it responds.
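If you want to see which faq/* sub-intent was actually picked, rather than just the final response, you can also use rasa shell nlu; a small sketch (the exact shape of the printed JSON depends on your version):

```
# Start an interactive session that prints the raw NLU parse
# instead of running the dialogue:
rasa shell nlu

# Type "test7b" and inspect the "response_selector" section of
# the printed JSON to see which faq sub-intent was selected.
```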
The response selector only knows about the intents that are inside the training data. So if you introduce a completely new intent during testing, it will most likely fail.
The test data you provide when executing rasa test can and should be different from the training data. You can add your own test cases in there. Ideally, the test data contains examples that are not inside the training data.
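If you don't want to craft that split by hand, there is also a helper command for it; a sketch (the default output folder and file names may differ slightly between versions):

```
# Split the existing NLU data into a training and a test portion:
rasa data split nlu

# Then evaluate against the generated test portion:
rasa test nlu -u train_test_split/test_data.md
```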
I am wondering about the same thing. My tests only tell me that the faq intent was classified as such and that the respond_faq action followed, but not whether the message was correctly classified as the specific FAQ question.
I am facing a problem where FAQs are incorrectly classified and my tests still pass.