Rasa Core End-To-End Evaluation

alexf388 · April 12, 2019, 6:15pm

Hi guys,

I am trying to perform end-to-end evaluation using rasa_core.evaluate. The command I am running is:

python -m rasa_core.evaluate default --core models/relocation/dialogue --nlu models/relocation/nlu --stories test/e2e/e2e_relocation_stories.md --endpoints endpoints.yml --e2e

I’ve made sure e2e_relocation_stories.md contains stories in an end-to-end format. For example:

## end-to-end story 3
* hello: hi
  - utter_greet
* non_resident_relocation: moving to Malaysia.
  - action_determine_if_restricted
  - slot{"is_h3" : false}
  - utter_confirm_GPE
  - utter_is_account_personal_or_non_personal
* non_personal_account: non-personal
  - action_set_email_type
  - slot{"email_type" : "non_personal"}
  - utter_escalate_to_compliance
  - action_send_email_to_compliance
  - utter_goodbye

I’ve uploaded the exception I got in a log file instead.

erohmensing · April 23, 2019, 9:01am

Hey @alexf388, I appreciate your not wanting to muck up the screen with the stack trace, but unfortunately I can’t see your log file. Could you try to upload it again or just post the trace? Then I can help you out.

abhi_bh_nlp · February 26, 2020, 5:43pm

Hi @erohmensing, I am using the end-to-end testing framework to test my bot. The problem is that although it evaluates the stories on the test data provided, it then falls back to the NLU training data and evaluates that as well. In the results, the intent and entity evaluation is exported only for the training data. I am not sure whether this is the intended behaviour, or whether it should be.

erohmensing · March 2, 2020, 5:36pm

@abhi_bh_nlp Sounds like you’re running rasa test, which runs separate core and NLU evaluations. If you only want to run the e2e ones, just do

rasa test core --e2e

abhi_bh_nlp · March 3, 2020, 8:27pm

@erohmensing No, I am running the e2e testing. I have 11 samples in my test data, so the core evaluation is correct and only the test data is considered. However, the intent evaluation shows a support of 1954, which looks more like my training data. Below is my command:

rasa test --stories tests/validation_data/e2e_stories.md --e2e --out results/e2e

abhi_bh_nlp · March 10, 2020, 4:30am

@erohmensing Also, I realized that when testing the NLU model with this framework, the NLU threshold is currently never taken into account. I got a bunch of predictions with a confidence in the 10-20% range, which is well below my current threshold of 40%.

So although these predictions are correct, I believe we should be able to see these examples in the intent report as well. What are your thoughts on this?
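
To make the threshold point concrete, here is a minimal sketch of how correct-but-low-confidence predictions could be listed separately. The record layout (an `intent` label plus an `intent_prediction` with `name` and `confidence`), the sample records, and the helper name `split_by_threshold` are assumptions for illustration, not Rasa's API. The 0.4 default is the 40% threshold mentioned above.

```python
from typing import List, Tuple


def split_by_threshold(
    predictions: List[dict], threshold: float = 0.4
) -> Tuple[List[dict], List[dict]]:
    """Split predictions into (kept, below_threshold) by confidence."""
    kept, below = [], []
    for p in predictions:
        confidence = p["intent_prediction"]["confidence"]
        # A prediction under the threshold would be treated as a fallback
        # at runtime, even if the predicted intent name is correct.
        (kept if confidence >= threshold else below).append(p)
    return kept, below


# Hypothetical records; real evaluation output may be shaped differently.
sample = [
    {
        "text": "hi",
        "intent": "hello",
        "intent_prediction": {"name": "hello", "confidence": 0.93},
    },
    {
        "text": "moving to Malaysia.",
        "intent": "non_resident_relocation",
        "intent_prediction": {"name": "non_resident_relocation", "confidence": 0.15},
    },
]

kept, below = split_by_threshold(sample)
for p in below:
    print(p["text"], p["intent_prediction"]["confidence"])
```

Listing the `below` group next to the intent report would show which examples are predicted correctly but would still fall under the threshold.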

erohmensing · March 10, 2020, 9:10am

Re: your first comment, as I mentioned, rasa test runs 2 separate tests, an NLU test and a core test. The e2e test is a type of core test. If you only want to run that one, you should run

rasa test core --stories tests/validation_data/e2e_stories.md --e2e --out results/e2e

The intent report does not take the e2e stories into account: that part of rasa test is the same as running rasa test nlu after running the core command above.
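
One way to check which data an intent report was computed on is to count the user messages per intent in the e2e stories file and compare the counts with the support column of the report. The sketch below assumes the `* intent: text` story format shown earlier in this thread; `count_e2e_messages` is a hypothetical helper, not part of Rasa, and the inline story is a shortened copy of the example above.

```python
from collections import Counter


def count_e2e_messages(story_text: str) -> Counter:
    """Count user messages per intent in end-to-end story markdown."""
    counts = Counter()
    for line in story_text.splitlines():
        line = line.strip()
        # User turns look like "* intent: message text";
        # bot actions and slot events start with "-" and are skipped.
        if line.startswith("*") and ":" in line:
            intent = line[1:].split(":", 1)[0].strip()
            counts[intent] += 1
    return counts


stories = """\
## end-to-end story 3
* hello: hi
  - utter_greet
* non_resident_relocation: moving to Malaysia.
  - action_determine_if_restricted
  - slot{"is_h3" : false}
* non_personal_account: non-personal
  - utter_goodbye
"""

counts = count_e2e_messages(stories)
print(sum(counts.values()), dict(counts))
```

If the total from the test stories is small (11 in the case above) while the report shows a support of 1954, the report was most likely computed on a different data set.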
