How do I let the Memoization Policy "do the job"?

Dear RASA team,

I’ve searched the docs and every thread in this forum for answers to my questions but wasn’t successful, so I am opening this new question, as it still seems to be a widely discussed topic. The goal is to let the Memoization Policy decide on the next action whenever the actual conversation flow matches the trained stories. This should be straightforward, as the Memoization Policy is designed to do exactly that. However, I have observed some scenarios where this is not the case, and I would like to better understand why TED steps in instead of the Memoization Policy:

Case 1: There are several stories that share the same flow and some intents (mostly affirm and deny) and differ only in whether there is, e.g., some chitchat in between. Here is an example:

stories:
- story: happy path intro 
  steps:
  - intent: greet
  - action: utter_greet
  - intent: affirm
  - action: action_start_conversation
  - intent: affirm
  - action: start_form

- story: happy path intro with chitchat
  steps:
  - intent: greet
  - action: utter_greet
  - intent: affirm
  - action: action_start_conversation
  - intent: chitchat
  - action: utter_chitchat
  - intent: affirm
  - action: start_form

Notice that the part:

  - intent: greet
  - action: utter_greet
  - intent: affirm
  - action: action_start_conversation

is the same in both stories. However, what I noticed is that “action_start_conversation” is always predicted by TED and not by the Memoization Policy. So I was wondering whether the Memoization Policy only predicts if there is exactly ONE story that matches. As soon as there are TWO or more matching stories, TED seems to take over. My max_history for the Memoization Policy is 3, while it is not set for TED. Setting a max_history of e.g. 8 for TED actually led to many fallback actions (but that’s a different question).
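A minimal sketch of the policy setup described above, assuming a standard config.yml. Only the two policies and the one max_history value mentioned in this post are shown; any other policies or parameters in the actual project are left out.

```yaml
# config.yml (sketch; only what is described in this post)
policies:
  - name: MemoizationPolicy
    max_history: 3   # value stated above
  - name: TEDPolicy
    # max_history is not set here, so TED falls back to its default
```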

Case 2: If a slot is set by an action within the story, the Memoization Policy again does not predict the next action afterward, even though that next action is a conditional response and is always the same no matter which slot value is set. Here is an example:

- story: happy path intro vegetarian food
  steps:
  - intent: interested_in_veg_food
  - action: utter_ask_ready_for_a_question
  - intent: affirm
  - action: action_reset_veg_food_slot
  - action: action_ask_veg_food # asks how often they eat veg food, with buttons: never, sometimes, or often
  - intent: rate_veg_food
  - action: action_set_veg_food # sets a slot via a custom mapping in action.py
  - action: utter_conditional_response
  - action: action_set_heard_explanation

Here “utter_conditional_response” is predicted by TED even though it follows the trained story. Do I need to include a slot-set event in the story to make the Memoization Policy work? And if so, do I need one story per value, i.e. one for never, one for sometimes, and one for often? That would be a lot of additional stories. Or could I train it with an OR condition instead?
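To make the OR idea concrete, a story along these lines is what I have in mind. This is only a sketch: the slot name veg_food is hypothetical (the real slot name is not shown above), and it assumes a Rasa version whose story format accepts slot_was_set inside an or step.

```yaml
# Sketch only: "veg_food" is a hypothetical slot name
- story: happy path intro vegetarian food (with explicit slot events)
  steps:
  - intent: rate_veg_food
  - action: action_set_veg_food
  - or:                      # one branch per possible slot value
    - slot_was_set:
      - veg_food: never
    - slot_was_set:
      - veg_food: sometimes
    - slot_was_set:
      - veg_food: often
  - action: utter_conditional_response
  - action: action_set_heard_explanation
```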

EDIT: I trained the stories with each slot value and an OR statement, but again the conditional response was predicted by TED. So I was wondering whether this is because I have several bot actions in a row, like here:

  - action: action_reset_veg_food_slot
  - action: action_ask_veg_food

Note that “action_reset_veg_food_slot” also sets a slot silently. So does TED always take over when there are several actions in a row, or because slots are set silently? I had assumed that setting slots silently in a custom action would not affect the policies at all.
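For reference, whether a slot is visible to the policies is configured per slot in the domain via influence_conversation. Below is a sketch of a slot that the policies should ignore, assuming a Rasa 3.x domain format and the hypothetical slot name veg_food (my actual slot definitions are not shown in this post). Whether this explains the behavior above is exactly what I am unsure about.

```yaml
# domain.yml (sketch; slot name and values are assumptions based on the buttons above)
slots:
  veg_food:
    type: categorical
    values:
      - never
      - sometimes
      - often
    influence_conversation: false   # slot is not featurized for the policies
    mappings:
      - type: custom                # slot is set by a custom action
```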

I hope someone can help me with this! It is not necessarily bad that TED takes over, as it often (but not always) does a good job. But I would like to better understand the Memoization Policy, as I am putting a lot of effort into writing good stories, and it would be a pity if they are then not recognized by the policy.

Maybe @Tanja could also provide some insights if you have time :relieved: