Invalid Rasa End-To-End Test Evaluation

Hello all, I am trying to test my Rasa model with test stories written in conversation_test.md. The model itself works perfectly fine, but the tests are not passing. My Rasa version is 1.10.10. I have tried to run the e2e tests in two different ways, and neither gave a usable result.

The first is “rasa test”. It picks up the number of tests that I wrote, but the evaluation result shows 0 out of 20 tests correct. Please see the log and the test format below.

  1. evaluation result after running “rasa test” (including log):
Processed Story Blocks: 100%|██████████████████████████████████████████| 20/20 [00:00<00:00, 1538.91it/s, # trackers=1]
2020-11-02 09:53:40 INFO     rasa.core.test  - Evaluating 20 stories
Progress:
100%|█████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 108.11it/s]
2020-11-02 09:53:41 INFO     rasa.core.test  - Finished collecting predictions.
2020-11-02 09:53:41 INFO     rasa.core.test  - Evaluation Results on END-TO-END level:
2020-11-02 09:53:41 INFO     rasa.core.test  -  Correct:          0 / 20
2020-11-02 09:53:41 INFO     rasa.core.test  -  F1-Score:         0.000
2020-11-02 09:53:41 INFO     rasa.core.test  -  Precision:        0.000
2020-11-02 09:53:41 INFO     rasa.core.test  -  Accuracy:         0.000
2020-11-02 09:53:41 INFO     rasa.core.test  -  In-data fraction: 0.975
Traceback (most recent call last):
  File "c:\users\appdata\local\programs\python\python36\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\appdata\local\programs\python\python36\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\AppData\Local\Programs\Python\Python36\Scripts\rasa.exe\__main__.py", line 7, in <module>
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\__main__.py", line 92, in main
    cmdline_arguments.func(cmdline_arguments)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\cli\test.py", line 159, in test
    run_core_test(args)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\cli\test.py", line 91, in run_core_test
    additional_arguments=vars(args),
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\test.py", line 140, in test_core
    rasa.core.test(stories, _agent, out_directory=output, **kwargs)
  File "c:\users\appdata\local\programs\python\python36\lib\asyncio\base_events.py", line 484, in run_until_complete
    return future.result()
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\test.py", line 556, in test
    targets, predictions = evaluation_store.serialise()
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\test.py", line 124, in serialise
    for predicted in self.entity_predictions
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\test.py", line 124, in <listcomp>
    for predicted in self.entity_predictions
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\nlu\training_data\formats\markdown.py", line 465, in generate_entity_md
    entity[ENTITY_ATTRIBUTE_START] : entity[ENTITY_ATTRIBUTE_END]
KeyError: 'start'
  2. test sample in conversation_test.md:
## happy path greetingOnly
* greeting: hello
  - utter_greeting
* askHowDoing: how are you doing 
  - utter_askHowDoing
* askWhatsPossible: Can you help me?
  - utter_askWhatsPossible

## happy path chitchatOnly
* greeting: hey
  - utter_greeting
* niceToMeetYou: good to meet you
  - utter_niceToMeetYou
* askHowDoing: how are you
  - utter_askHowDoing
* askWhatsPossible: can you help me 
  - utter_askWhatsPossible
* thanks: thank you 
  - utter_thanks
* goodbye: bye
  - utter_goodbye 

The second command that I ran is “rasa test core --e2e”.

  1. result after command:
2020-11-04 09:24:53.206525: E tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)
2020-11-04 09:24:59 ERROR    rasa.core.training.dsl  - Error in line 2: Encountered invalid end-to-end format for message `greeting`. Please visit the documentation page on end-to-end testing at https://rasa.com/docs/rasa/user-guide/testing-your-assitant/#end-to-end-testing/
Traceback (most recent call last):
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 358, in process_lines
    await self.add_e2e_messages(user_messages, line_num)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 456, in add_e2e_messages
    message = e2e_reader._parse_item(m)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 59, in _parse_item
    "#end-to-end-testing/".format(line, DOCS_BASE_URL)
ValueError: Encountered invalid end-to-end format for message `greeting`. Please visit the documentation page on end-to-end testing at https://rasa.com/docs/rasa/user-guide/testing-your-assitant/#end-to-end-testing/
2020-11-04 09:24:59 ERROR    rasa.core.training.dsl  - Invalid story file format. Failed to parse 'C:\Users\AppData\Local\Temp\tmp4k7vxnr6\ca4f9944de624d0591971ae431a632b0_stories.md'
Traceback (most recent call last):
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 358, in process_lines
    await self.add_e2e_messages(user_messages, line_num)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 456, in add_e2e_messages
    message = e2e_reader._parse_item(m)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 59, in _parse_item
    "#end-to-end-testing/".format(line, DOCS_BASE_URL)
ValueError: Encountered invalid end-to-end format for message `greeting`. Please visit the documentation page on end-to-end testing at https://rasa.com/docs/rasa/user-guide/testing-your-assitant/#end-to-end-testing/

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 263, in read_from_file
    return await reader.process_lines(lines)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 371, in process_lines
    raise ValueError(msg)
ValueError: Error in line 2: Encountered invalid end-to-end format for message `greeting`. Please visit the documentation page on end-to-end testing at https://rasa.com/docs/rasa/user-guide/testing-your-assitant/#end-to-end-testing/
Traceback (most recent call last):
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 358, in process_lines
    await self.add_e2e_messages(user_messages, line_num)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 456, in add_e2e_messages
    message = e2e_reader._parse_item(m)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 59, in _parse_item
    "#end-to-end-testing/".format(line, DOCS_BASE_URL)
ValueError: Encountered invalid end-to-end format for message `greeting`. Please visit the documentation page on end-to-end testing at https://rasa.com/docs/rasa/user-guide/testing-your-assitant/#end-to-end-testing/

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\appdata\local\programs\python\python36\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\appdata\local\programs\python\python36\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\AppData\Local\Programs\Python\Python36\Scripts\rasa.exe\__main__.py", line 7, in <module>
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\__main__.py", line 92, in main
    cmdline_arguments.func(cmdline_arguments)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\cli\test.py", line 91, in run_core_test
    additional_arguments=vars(args),
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\test.py", line 140, in test_core
    rasa.core.test(stories, _agent, out_directory=output, **kwargs)
  File "c:\users\appdata\local\programs\python\python36\lib\asyncio\base_events.py", line 484, in run_until_complete
    return future.result()
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\test.py", line 543, in test
    completed_trackers = await _generate_trackers(stories, agent, max_stories, e2e)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\test.py", line 210, in _generate_trackers
    resource_name, agent.domain, agent.interpreter, use_e2e
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\__init__.py", line 30, in extract_story_graph
    exclusion_percentage=exclusion_percentage,
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 217, in read_from_folder
    exclusion_percentage,
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 233, in read_from_files
    f, domain, interpreter, template_variables, use_e2e
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 263, in read_from_file
    return await reader.process_lines(lines)
  File "c:\users\appdata\local\programs\python\python36\lib\site-packages\rasa\core\training\dsl.py", line 371, in process_lines
    raise ValueError(msg)
ValueError: ('Error in line 2: Encountered invalid end-to-end format for message `greeting`. Please visit the documentation page on end-to-end testing at https://rasa.com/docs/rasa/user-guide/testing-your-assitant/#end-to-end-testing/', "Invalid story file format. Failed to parse 'C:\\Users\\AppData\\Local\\Temp\\tmp4k7vxnr6\\ca4f9944de624d0591971ae431a632b0_stories.md'")
  2. sample from stories.md:
## happy path greetingOnly
* greeting
  - utter_greeting
* askHowDoing
  - utter_askHowDoing
* askWhatsPossible
  - utter_askWhatsPossible

## happy path chitchatOnly
* greeting
  - utter_greeting
* niceToMeetYou
  - utter_niceToMeetYou
* askHowDoing
  - utter_askHowDoing
* askWhatsPossible
  - utter_askWhatsPossible
* thanks
  - utter_thanks
* goodbye
  - utter_goodbye

Any help would be greatly appreciated. @erohmensing Hi Ella, I have read a lot of your responses on related test issues, and they are really helpful. I am hoping you could take a look at my problem and suggest some potential solutions.

Hi @Yuning004, the command `rasa test core` will by default test the stories in data/stories.md, which aren’t in E2E format. That’s why you get the second error, "Encountered invalid end-to-end format for message `greeting`". For the first error,

    entity[ENTITY_ATTRIBUTE_START] : entity[ENTITY_ATTRIBUTE_END]
KeyError: 'start' 

Do you have any entities in your training data? Any in your e2e stories? If you could share a minimum reproducible example (meaning, enough train & test stories + your config & domain to replicate the behaviour) I’ll take a look.
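To make the difference between the two story formats concrete, here is a minimal sketch (not part of Rasa) that flags user-message lines lacking the `intent: example text` form that end-to-end test stories use. The function name `lines_missing_text` is hypothetical, and the parsing is deliberately simplified compared with what Rasa's own reader does.

```python
from typing import List, Tuple


def lines_missing_text(story_md: str) -> List[Tuple[int, str]]:
    """Return (line number, intent) for each user line without example text.

    Simplified assumption: a user line starts with "*", and an end-to-end
    user line looks like "* intent: some example text".
    """
    missing = []
    for number, line in enumerate(story_md.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("*"):
            continue  # headers ("## ...") and bot actions ("- ...") are skipped
        body = stripped[1:].strip()
        # Ignore a JSON-style entity annotation such as {"feedback": "positive"},
        # whose colons would otherwise look like the intent/text separator.
        intent_part = body.split("{", 1)[0]
        intent, _, text = intent_part.partition(":")
        if not text.strip():
            missing.append((number, intent.strip()))
    return missing


if __name__ == "__main__":
    plain_story = "## happy path greetingOnly\n* greeting\n  - utter_greeting\n"
    e2e_story = "## happy path greetingOnly\n* greeting: hello\n  - utter_greeting\n"
    print(lines_missing_text(plain_story))  # the plain story has one offending line
    print(lines_missing_text(e2e_story))    # the e2e story has none
```

Run against the stories.md sample from the question, this reports line 2 (`greeting`), the same line the error message points at. If the default stories path is indeed the cause, pointing the test command at the test file instead should avoid it; I believe the option for that is `--stories`, but treat that as an assumption and confirm it with `rasa test core --help` on your version.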

Hi @mloubser, thanks for getting back to me.

Yes, I do have entities in my training data.

  1. entities from domain.yml
entities:
  - name
  - feedback
  - sentiment

slots:

-name:

  -type: unfeaturized

-feedback:

  -type: categorical

   -values:

    - positive

    - negative
  2. stories that include entities in stories.md:
## positive feedback
* feedback{"feedback": "positive"}
  - slot{"feedback": "positive"}
  - utter_positiveFeedback

But I don’t have any e2e stories testing entities right now.

  3. config.yml:
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: modules.sentiment_analysis.SentimentAnalyzer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
policies:
- name: TEDPolicy
  max_history: 10
  epochs: 20
- max_history: 6
  name: AugmentedMemoizationPolicy
- name: MappingPolicy

Please let me know if you need further information. Thanks!

Just FYI I edited your message with code formatting because it makes it more readable - you can do that either by indenting a block or putting triple backticks (```) above and below the code block

Great, thanks for fixing it! I was struggling with the formatting. For the first error, it seems the system is able to evaluate the test stories, but the number correct is 0/20. I followed the e2e format for Rasa 1.10.10 and could not find any formatting issues, and there are no failed tests. What could cause 0 correct tests besides formatting?

I’m not sure about the zero correct, but your slots are not formatted correctly. Do you see an error about the domain file at training time?

Slots should look like this:

slots:
  name:
    type: unfeaturized

  feedback:
    type: categorical
    values:
      - positive
      - negative

My slots are formatted exactly like yours, and training works fine. I was not familiar with the Rasa forum formatting, so it was just a typo here. Sorry for the confusion.
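On the `KeyError: 'start'` from the first traceback: it is raised where `generate_entity_md` slices the message text with the entity's `start` and `end` values, so at least one predicted entity apparently had no `start` key. One possibility, not confirmed in this thread, is that a custom pipeline component (for example the `modules.sentiment_analysis.SentimentAnalyzer` in the config above) attaches entities that describe the whole message and carry no character offsets. The sketch below only illustrates that failure pattern with plain dictionaries; `entity_text` and the sample entities are hypothetical and are not Rasa code.

```python
from typing import Any, Dict, Optional


def entity_text(message_text: str, entity: Dict[str, Any]) -> Optional[str]:
    """Return the text span an entity covers, or None if it has no offsets."""
    start = entity.get("start")
    end = entity.get("end")
    if start is None or end is None:
        return None  # message-level entity: nothing to slice
    return message_text[start:end]


if __name__ == "__main__":
    text = "hello Ella"
    # Entity with character offsets, as a span extractor would produce.
    span_entity = {"entity": "name", "value": "Ella", "start": 6, "end": 10}
    # Hypothetical message-level entity without "start"/"end".
    message_level_entity = {"entity": "sentiment", "value": "pos"}

    print(entity_text(text, span_entity))
    print(entity_text(text, message_level_entity))

    # Indexing the keys directly, as in the traceback, fails for the second one.
    try:
        text[message_level_entity["start"]:message_level_entity["end"]]
    except KeyError as error:
        print("KeyError:", error)
```

If this turns out to be the cause, a quick check would be to print the entities your custom component adds to a message and see whether each has `start` and `end`.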