Failed stories

I really hope someone answers my question this time.

Why do I see some stories reported as failed in the failed_test_stories.yml file when they actually didn’t fail? Testing those same intents with rasa shell showed they did not fail. For example:

- story: show the covid testing centers (./tests/test_stories.yml)
  steps:
  - intent: testing_centers
    entities:
    - disease: covid
  - action: action_show_testing_centers

Usually, failed intents come with information on what went wrong, e.g.:

- intent: sendoff  # predicted: nlu_fallback: bye

Can someone please explain?

Hi @laboratory, in a nutshell, rasa predicted a different intent or action than how you expected the conversation to flow; a failed story means that the steps in the test conversation you designed didn’t match the steps predicted by rasa. Also, one thing to bear in mind is that the custom actions won’t execute during testing, so it’s best to check whether any of your test conversation flows rely on the execution of a custom action. This section in the docs could be helpful too, in case you haven’t checked it yet.
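For example, if rasa had predicted a different action for the story you posted, the entry in failed_test_stories.yml would be annotated roughly like this (the predicted action here is purely a hypothetical example):

- story: show the covid testing centers (./tests/test_stories.yml)
  steps:
  - intent: testing_centers
    entities:
    - disease: covid
  - action: action_show_testing_centers  # predicted: action_default_fallback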

Are you able to share your stories and rules files with me as well as the test_stories.yml file to help you further?

Thanks, @anca, for your response.

  1. Your opening statement mentioned:

“rasa predicted a different intent or action than how you expected the conversation to flow”

I understand this, and that’s why I think that particular intent or story is additionally tagged with Rasa’s prediction, like so: # predicted: nlu_fallback: bye

a failed story means that the steps in the test conversation you designed didn’t match the steps predicted by rasa.

Are you referring to the step as a whole or each intent and action in a step (like point 1 above)?

  2. Please find both the stories and rules files:

test_stories.yml (5.0 KB) rules.yml (862 Bytes) stories.yml (2.2 KB)

Are you referring to the step as a whole or each intent and action in a step (like point 1 above)?

@laboratory By step I meant an individual intent or action step.

I had a quick look at the files you provided. Most of the stories in stories.yml actually seem to be rules; you should design a story that reads like a conversation (with beginning, middle, and end queries), similar to how you expect an end user to interact with the assistant. So one step would be to restructure stories.yml and move most of these small existing stories to rules.yml. Another piece of advice is to break down some of your test stories, because some of them seem to combine 2 or more stories in a single test story.

For example:

- story: check and display lockdown information
  steps:
  - user: |
      are we under lockdown?
    intent: lockdown_areas
  - action: action_check_lockdown
  - user: |
      is [musanze]{"entity":"location"} currently experiencing lockdown?
    intent: lockdown_areas
  - action: action_check_lockdown

This should actually be 2 test story examples.
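A hedged sketch of how that split could look (same intents, entities, and action as your snippet; the story names are just placeholders):

- story: check lockdown information (general)
  steps:
  - user: |
      are we under lockdown?
    intent: lockdown_areas
  - action: action_check_lockdown

- story: check lockdown information for a specific location
  steps:
  - user: |
      is [musanze]{"entity":"location"} currently experiencing lockdown?
    intent: lockdown_areas
  - action: action_check_lockdown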

Hope this helps - I can check again on Mon how you’re progressing :rocket:

@anca Yes, this works well. I will look into story restructuring (probably with rasa interactive).

one thing to bear in mind is that the custom actions won’t execute during testing

Looking at my files, could this have affected testing in any way, or be the reason some of the (passing) test cases are still showing up in failed_test_stories.yml (the reason for this post)?

I can check again on Mon how you’re progressing :rocket:

Please do. I would appreciate it.

could this have affected testing in any way, or be the reason some of the (passing) test cases are still showing up in failed_test_stories.yml

It is possible if the outcome of your custom action influences the direction of the conversation, for example setting a SlotSet event etc.

@laboratory have you managed to progress otherwise with your test stories?

Additionally, the point raised by @nik202 about making sure your NLU examples are double-checked and improved is also valid - rasa might not be able to generalise the user message in your test stories and predict the intent you expected.

have you managed to progress otherwise with your test stories?

@anca I am afraid I haven’t made progress with my test stories. The only thing I did was split the ones that contained 2 or more stories into individual stories (as you recommended).

you should design a story that reads like a conversation (with beginning, middle, and end queries), similar to how you expect an end user to interact with the assistant.

As regards my stories.yml, I tried designing it like an actual conversation (with the help of rasa interactive); however, it seriously messed up my bot’s performance, so I reverted to the rule-like format you noticed initially.

It is possible if the outcome of your custom action influences the direction of the conversation, for example setting a SlotSet event etc.

For now, I haven’t used slots in my development. The only way I think custom actions might have caused this is through the if…else clauses (if I am right): if this entity is present, do this and dispatch this message; else do that and dispatch that message. My worry is that it also happens for intents that don’t require custom actions at all.

rasa might not be able to generalise the user message in your test stories and predict the intent you expected.

I believe those are the cases where the model predicted nlu_fallback. But the point of this post is around those intents that weren’t predicted as fallbacks but still show up as failed tests.

Please see my failed_test_stories.yml file: failed_test_stories.yml (2.1 KB). You will observe that only the vaccination_requirement intent predicted a fallback, but others showed up as well.

Thank you

@anca wondering if you saw the above message.

@laboratory are you able to send me a link to your assistant repository? That would enable me to check the latest versions of your files, because it’s a bit difficult to interpret failed_test_stories.yml without looking at the latest version of the rest of the files.

it seriously messed up my bot’s performance, so I reverted to the rule-like format you noticed initially

Please provide more details about what exactly happened, the debug log, any errors, etc.

I still strongly advise moving into rules.yml any sets of dialogue steps that should always behave the same way (a certain action that should always follow a certain intent, etc.).
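For example, a minimal sketch using names from your files (adjust to your actual intents and actions):

rules:
- rule: always show testing centers when asked
  steps:
  - intent: testing_centers
  - action: action_show_testing_centers

- rule: always check lockdown status when asked
  steps:
  - intent: lockdown_areas
  - action: action_check_lockdown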

Your test stories should ideally test happy conversation paths as well as unhappy paths, as in the documentation link I originally shared in my first reply.

are you able to send me a link to your assistant repository?

@anca That would be great, but I’m afraid I can’t share a link here since it would be accessible to the public - it’s a company project. Any suggestion on how to get a link across to you is very welcome.

Please provide more details about what exactly happened, the debug log, any errors, etc.

What I meant was that different and unrelated actions/responses were shown for intents when I typed in a question. I assume I don’t yet have a good grasp of good story writing.

I’m afraid I can’t share a link

No worries, I completely understand.

Since it’s a bit tricky then, I suggest running rasa x to generate stories - that could be helpful. You could even go further by sharing your bot with guest testers who could help you generate more stories.

different and unrelated actions/responses were shown for intents when I typed in a question.

Did this happen even though you’ve transferred those original short story snippets to rules.yml?

@anca Currently I have left them in stories.yml… I know they may be rule-like and may be acting like rules. I was wondering whether there’s a difference between having them in stories.yml or in rules.yml; I didn’t think there was any. So I thought that when I eventually generate (proper) stories, I’ll move them.

Is this assumption correct? What could be the downsides (if any)?

@laboratory not moving those snippets into rules.yml is very likely the reason your bot became unstable.

I’ll reference the documentation on the difference between stories and rules, because they should be used together for best results:

Stories are used to train a machine learning model to identify patterns in conversations and generalize to unseen conversation paths. 
Rules describe small pieces of conversations that should always follow the same path and are used to train the RulePolicy.

You could also follow this advice in the docs on how to write conversation training data.
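To give a hedged sketch of what a more conversation-like story could look like (assuming you have a greet intent; utter_greet and utter_goodbye are placeholder response names, while sendoff comes from your earlier snippet):

stories:
- story: greet, ask about testing centers, say goodbye
  steps:
  - intent: greet
  - action: utter_greet
  - intent: testing_centers
  - action: action_show_testing_centers
  - intent: sendoff
  - action: utter_goodbye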

Thanks @anca

I studied the rasa docs all night; I now have a better understanding of your reply above, and it makes a lot of sense.

I think the only thing I am battling with is story conflicts and their resolution (or a conflict between a particular story and a rule). I am wondering if there’s a resource on that.

I wrote stories for lockdown information based on the different conversation paths, I think, as follows:

stories:
- story: show if a location is under lockdown or not
  steps:
  - intent: lockdown_areas
  - action: action_check_lockdown

- story: location + ask location + specify location
  steps:
  - intent: lockdown_areas
  - action: action_check_lockdown
  - action: utter_ask_location_lockdown
  - intent: specify_location_lockdown
  - action: action_check_lockdown

- story: location + location not found + specify location
  steps:
  - intent: lockdown_areas
  - action: action_check_lockdown
  - action: utter_location_not_found
  - intent: specify_location_lockdown
  - action: action_check_lockdown

However, when I run rasa data validate, I get the warning below:

Story structure conflict after action 'action_check_lockdown':
  utter_ask_location_lockdown predicted in 'location + ask location + specify location'
  utter_location_not_found predicted in 'location + location not found + specify location'
  action_listen predicted in 'show if a location is under lockdown or not'

When I removed story 3 completely and removed utter_ask_location_lockdown from story 2, the warning went away.

Can you help me understand this?

@anca Also, while the conversation in this thread has been useful, I still face my initial challenge: stories still show up in failed_test_stories.yml with no comment on what the model predicted instead, meaning the stories were correct (as confirmed with rasa shell) and shouldn’t be there.

I am wondering what it takes to solve this.

@laboratory what does action_check_lockdown do? Look up a location entity in a database and set a slot (let’s say location)? The conflict appears because all three stories start with the same two steps (lockdown_areas followed by action_check_lockdown), so after that point rasa has nothing in the tracker to tell the paths apart. Based on a rough understanding of your stories, you could try a few options:

• you could use a checkpoint plus slot_was_set steps to differentiate the paths, for example:

stories:
- story: show if a location is under lockdown or not
  steps:
  - intent: lockdown_areas
  - action: action_check_lockdown
  - checkpoint: lockdown_checkpoint

- story: location + ask location + specify location
  steps:
  - checkpoint: lockdown_checkpoint
  - slot_was_set:
    - location: test_location
  - action: utter_ask_location_lockdown
  - intent: specify_location_lockdown
  - action: action_check_lockdown

- story: location + location not found + specify location
  steps:
  - checkpoint: lockdown_checkpoint
  - slot_was_set:
    - location: null
  - action: utter_location_not_found
  - intent: specify_location_lockdown
  - action: action_check_lockdown
• you could also have your custom action dispatch different messages; that way you can remove the 2nd utter_ action, and you’d probably end up with a single story in this case:
- story: handle lockdown check path
  steps:
  - intent: lockdown_areas
  - action: action_check_lockdown
  - intent: specify_location_lockdown
  - action: action_check_lockdown

It’s a matter of experimenting; I’m a bit uncertain myself whether the first option would work, only because I don’t know the code in your custom action.

what it takes to solve this.

This entire thread on designing stories and rules should help; your test stories are failing because rasa predicts different actions for your intents, and fixing your conversation training data would definitely help.

Hi @anca, this is a different question, but can you please answer this one too: Failed to load the component rasa

you could also have your custom action dispatch different messages; that way you can remove the 2nd utter_ action, and you’d probably end up with a single story in this case:

@anca Yes, this works. My custom action dispatches different messages based on whether a location entity was found in the user message or not. For now, I haven’t implemented slots in this project.

I don’t know how else to explain that the tests are not failing. Maybe I don’t understand how some of these things work under the hood, but some of the tests in failed_test_stories.yml aren’t failing when I test my assistant (with rasa shell). Moreover, my understanding is that failed stories come with an extra comment saying what was predicted instead, for example # predicted: nlu_fallback: bye or # predicted: action_default_fallback. But yeah, I will keep studying and working on this.

Speaking of action_default_fallback: do you know of a suggested way to improve the model to solve this error? When testing the same paths I wrote in my stories, (for most of them) the last action returns action_default_fallback. I have tried playing with the number of epochs, but still no resolution.

Looking at the logs for one of the stories, I see:

2021-07-29 02:10:53 DEBUG    rasa.core.policies.ted_policy  - TED predicted 'utter_not_sure' based on user intent.

(which is the action I have specified).

Looking further, I also see:

2021-07-29 02:10:53 DEBUG    rasa.core.processor  - Predicted next action 'action_default_fallback' with confidence 0.30.

Wondering why the latter was chosen.
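Reading the RulePolicy docs, I suspect this is the rule policy’s core fallback kicking in: when no rule applies and the best action confidence falls below core_fallback_threshold (0.3 by default, which matches the 0.30 in the log), it predicts action_default_fallback instead. I assume the relevant part of my config looks roughly like this (defaults shown, not my exact file):

policies:
- name: RulePolicy
  core_fallback_threshold: 0.3
  core_fallback_action_name: action_default_fallback
  enable_fallback_prediction: true
- name: TEDPolicy
  epochs: 100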

EDIT: I have specified the entity for my story (as below) and have it working now:

- intent: specify_location_lockdown
  entities:
  - location: "test_location"

Additionally, I must add that you have been very helpful - just so I don’t forget to mention it.

Hi @anca, wondering if you got my reply above.