Rasa test is predicting differently than shell?

I’m having a problem writing tests for Rasa 3.4.1. This is a new test; it didn’t exist before I upgraded.

When I train and run rasa shell I can run through a scenario and it works as I expect.

When I run rasa test, my test story for that scenario fails and the fallback is predicted instead. Am I doing something wrong? All my code is below, with a comment afterwards.

Has anyone else had an issue with testing?

Here’s the original story with the checkpoint stories:

  - story: Get New ticket counts
    steps:
      - intent: crm_get_all_by_status
      - action: get_corporate_id_form
      - active_loop: get_corporate_id_form
      - action: action_get_all_tickets_by_status
      - action: utter_ask_to_view_tickets
      - checkpoint: show_or_not_show_tickets

Here’s the checkpoint story it calls:

  - story: No, do not show Tickets
    steps:
      - checkpoint: show_or_not_show_tickets
      - intent: deny
      - action: utter_ask_anything_else
      - checkpoint: ask_anything_else

Here’s the checkpoint story that one calls:

  - story: Ask if there is anything else - No
    steps:
      - checkpoint: ask_anything_else
      - intent: deny
      - action: action_save_event_chat_initiated
      - action: utter_thanks_goodbye

My Test Story:


- story: TEST Get and display number of tickets User says No
  steps:
  - user: |
      How many tickets do I have?
  - intent: crm_get_all_by_status
  - action: get_corporate_id_form
  - active_loop: get_corporate_id_form
  - active_loop: null
  - action: action_get_all_tickets_by_status
  - action: utter_tell_number_of_tickets
  - action: utter_ask_to_view_tickets
  - user: |
      No
  - checkpoint: show_or_not_show_tickets
  - intent: deny
  - action: utter_ask_anything_else
  - checkpoint: ask_anything_else
  - user: |
      No thanks
  - intent: deny
  - action: utter_thanks_goodbye

And finally, here is failed_test_stories.yml:

version: "3.1"
stories:
- story: TEST Get and display number of tickets User says No (/home/jwheat/Code/NearlyHuman/rasa/rasa-demo/tests/test_stories.yml)
  steps:
  - user: |-
      How many tickets do I have?
  - action: action_listen  # predicted: action_default_fallback
  - intent: crm_get_all_by_status
  - action: get_corporate_id_form
  - active_loop: get_corporate_id_form
  - active_loop: null
  - action: action_get_all_tickets_by_status  # predicted: utter_tell_inform_thank_you
  - action: utter_tell_number_of_tickets  # predicted: utter_ask_to_view_tickets
  - action: utter_ask_to_view_tickets  # predicted: action_save_event_live_agent_chat_failed
  - user: |-
      No thanks
  - action: action_listen  # predicted: action_default_fallback
  - intent: deny
  - action: utter_ask_anything_else  # predicted: action_save_event_chat_initiated
  - user: |-
      No thanks
  - action: action_listen  # predicted: action_default_fallback
  - intent: deny
  - action: utter_thanks_goodbye  # predicted: action_save_event_chat_initiated

You can see that right after each user line it has added an extra - action: action_listen step, yet it annotates that step with # predicted: action_default_fallback

Differences between rasa test and rasa shell predictions can come from several sources: different training data, a different trained model, or a different configuration. Start by confirming that both commands load the same model, and that the test story feeds in the same user input and conversation context you type in the shell. Checking the installed versions of Rasa and its dependencies can also help narrow down the discrepancy. If the predictions still differ after that, the stories or the model configuration probably need further debugging.
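
One concrete thing to check in the test story above: in the Rasa 3.x test story format, the user text and its intent belong to the same step (user and intent are two keys of one list item). In the posted test they are separate list items, so Rasa likely reads each one as its own user turn and expects an action_listen between them. That would explain the extra action_listen lines in failed_test_stories.yml. Below is a minimal sketch of how the test might look with those steps merged. It rests on several assumptions: the checkpoint steps are dropped on the assumption that a test story should describe one complete conversation, action_save_event_chat_initiated is added only because the "Ask if there is anything else - No" training story runs it before the goodbye, and utter_tell_number_of_tickets is kept from the original test even though the "Get New ticket counts" training story does not contain it.

```yaml
# Sketch only: the same test story, with each user message and its intent
# merged into a single step and the checkpoint steps removed.
version: "3.1"
stories:
- story: TEST Get and display number of tickets User says No
  steps:
  - user: |
      How many tickets do I have?
    intent: crm_get_all_by_status
  - action: get_corporate_id_form
  - active_loop: get_corporate_id_form
  - active_loop: null
  - action: action_get_all_tickets_by_status
  # Kept from the original test; not present in the training story shown above,
  # so remove it if the bot does not actually run it at this point.
  - action: utter_tell_number_of_tickets
  - action: utter_ask_to_view_tickets
  - user: |
      No
    intent: deny
  - action: utter_ask_anything_else
  - user: |
      No thanks
    intent: deny
  # Assumption: mirrors the training story, which runs this before the goodbye.
  - action: action_save_event_chat_initiated
  - action: utter_thanks_goodbye
```

After a change like this, rerunning rasa test and comparing the new failed_test_stories.yml with the one above should show whether the spurious action_listen / action_default_fallback lines are gone. Any mismatches that remain would then point at real differences between the test and the training stories.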