How to modularize a multi-branching sequence

Hi, I’m new to Rasa. I’ve seen several examples in the docs (Writing Conversation Data) about how to manage conversation flow, but I’m not sure how those generalize to more complex branching, or what underlying design principle to follow.

Let’s take an example:

  • A technical-support bot that checks whether the user has already tried a series of steps, only branching to guide the user through a step if they haven’t tried it yet. E.g. ‘Did you try to reboot your PC?’ -> ‘Yes’, ‘Did you scan for viruses?’ -> ‘Yes’, ‘Did you run diagnostics?’ -> ‘No’, ‘Then do this, and this… is it resolved now?’, etc.
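For concreteness, one branch of that flow might look like this in Rasa story YAML (all intent and action names here are hypothetical):

```yaml
stories:
- story: rebooted and scanned, but no diagnostics yet
  steps:
  - intent: report_problem
  - action: utter_ask_rebooted
  - intent: affirm
  - action: utter_ask_virus_scan
  - intent: affirm
  - action: utter_ask_diagnostics
  - intent: deny
  - action: utter_guide_diagnostics
```

And a sibling story would be needed for every other combination of answers, which is exactly the duplication problem described next.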

In this case, I imagine having to write positive and negative cases for every question. If I write each possible path from the beginning to the moment the problem is solved (or there are no more questions), then I’d have to repeat big chunks of stories multiple times.

I have some ideas, but don’t know if they’re correct.

  • Using lots of checkpoints. The docs say not to overuse them, or to reserve them for pieces used multiple times. Here each checkpoint would be used 2 times, I think, but I don’t know what the specific problem with that is. If, during training, checkpoints are combined into all possible branches, that would be fine — that’s precisely what I want to avoid doing manually. And I’m not sure what the problem with readability is either (any modularization creates some readability concerns).
  • Using separate intents for each ‘yes’ or ‘no’ answer, for example with buttons that give a specific intent as payload. It doesn’t seem very elegant, especially considering that typing the same answer as the button text would break it. But it would allow chaining rules.
  • Rules with slot conditions. But that would require generating lots of different slots for the same two answers, which I think would require a lot of custom actions.
  • Using slots for every user answer and a single custom action that tracks the state of the conversation and answers accordingly. The abuse of slots doesn’t seem very elegant, and while regular code has more flexibility, it also requires writing additional path-handling code. It kind of moves the problem from YAML to regular code.
  • Restructuring the conversation. For example, as a form where all questions are asked up front and then handled using the information in slots and custom actions. This seems unintuitive and tedious for the user, especially if there are lots of possible questions.
  • Creating logical breaks in stories. I actually like this one the most. I could chain one-question forms at the end of each other (if I wanted slots), or just chain using the last utterance. But does it work when breaking up a long series of questions? Will it lose context? If this works, why do checkpoints exist at all?
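For what it’s worth, the button idea from the second bullet would look roughly like this in the domain (names are hypothetical):

```yaml
responses:
  utter_ask_rebooted:
  - text: "Did you try to reboot your PC?"
    buttons:
    - title: "Yes"
      payload: "/rebooted_yes"   # sends this intent directly, bypassing NLU
    - title: "No"
      payload: "/rebooted_no"
```

A user who types “yes” instead of clicking the button would still be classified as a generic affirm intent, which is the fragility mentioned in that bullet.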

This is a really great question that may be a little too complicated for this forum. Having said that, I’d love to see some replies here with the approaches people are taking to handle this.

Hi all,

We face the same questions! In our case we would like to modularize parts of our stories for readability. First we tried checkpoints, but that leads to too many trackers to process during training. Now we try to break up blocks with the last point you mentioned, the logical breaks. But the memoization policy does not work at the “connection points” of the story blocks, and the TED policy is not very accurate at these points.

Does increasing max_history improve your results? It might provide more context to predict the next action.

Guess you could also introduce slots to provide more information at the “connection points”, though it doesn’t seem very elegant.

Hi @FearsomeTuna

The max_history does not matter in this case. If the stories are built from start to end, there is no problem. I’ll illustrate this with an example:

Original story:

  • intent A
  • action B
  • intent C
  • action D
  • intent E
  • action F
  • intent G
  • action H

Approach to modularize with story blocks:

story1

  • intent A
  • action B
  • intent C
  • action D

story2

  • action D
  • intent E
  • action F
  • intent G
  • action H

Approach to modularize with checkpoints:

story1

  • intent A
  • action B
  • intent C
  • action D
  • checkpoint X

story2

  • checkpoint X
  • intent E
  • action F
  • intent G
  • action H
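In current story YAML, the two variants above would read something like this (placeholder names):

```yaml
stories:
# approach 1: story blocks sharing action_D as the connection point
- story: story1
  steps:
  - intent: intent_A
  - action: action_B
  - intent: intent_C
  - action: action_D
- story: story2
  steps:
  - action: action_D
  - intent: intent_E
  - action: action_F
  - intent: intent_G
  - action: action_H
# approach 2: the same split expressed with a checkpoint
- story: story1_cp
  steps:
  - intent: intent_A
  - action: action_B
  - intent: intent_C
  - action: action_D
  - checkpoint: checkpoint_X
- story: story2_cp
  steps:
  - checkpoint: checkpoint_X
  - intent: intent_E
  - action: action_F
  - intent: intent_G
  - action: action_H
```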

The original story works fine, but if we generate a start-to-end story for every possible path, we end up copy-pasting thousands of lines, which can’t be the point. If we split into story blocks like in the second approach, the memoization policy doesn’t work at action D, even with the following addition:

story3

  • intent C
  • action D
  • intent E
  • action F

Memoization works until action D, and then the TED policy takes over (with big uncertainty).

The last approach (with checkpoints) works fine for the memoization policy, but it leads to lots of trackers being processed during training. So this would be the “overuse” of checkpoints that the documentation warns about. The documentation tells us to use story blocks (approach 1), but that works unsatisfactorily.

Anyway it would be nice if someone could explain how to approach this problem in general.

@thinkinnet I don’t understand why checkpoints need multiple trackers. There’s not much in the docs about them besides the idea that they connect stories. It seems checkpoints handle the issue at a much too high level.

Do we need the pipeline to know about these checkpoints at all? Maybe I’m oversimplifying, but after the YAML is parsed, it seems we just need to go over the stories that have the checkpoint marker at the end and beginning and create new objects that recombine them in all possible ways, before handing them to the pipeline.

Maybe something between the YAML parser and the pipeline, or perhaps a separate preprocessor that operates over the raw YAML files (I’m just guessing; I don’t know the inner workings of Rasa).
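Just to make the idea concrete, here’s a toy sketch of that recombination step. This is not Rasa’s actual internals; the function name and the dict shape are made up for illustration, and it only does a single level of chaining:

```python
# Toy sketch (not Rasa's internals): recombine story fragments that are
# linked by a checkpoint into full start-to-end stories before training.
# The dict shape used here is invented for illustration.

def expand_checkpoints(stories):
    """Join every fragment ending in a checkpoint with every fragment
    starting from that same checkpoint (single level of chaining)."""
    # Index fragments by the checkpoint they start from.
    tails = {}
    for story in stories:
        first = story["steps"][0]
        if "checkpoint" in first:
            tails.setdefault(first["checkpoint"], []).append(story)

    expanded = []
    for story in stories:
        first, last = story["steps"][0], story["steps"][-1]
        if "checkpoint" not in last:
            # Keep self-contained stories; skip pure tail fragments,
            # which only exist to be glued onto a head fragment.
            if "checkpoint" not in first:
                expanded.append(story)
            continue
        for tail in tails.get(last["checkpoint"], []):
            expanded.append({
                "story": story["story"] + " + " + tail["story"],
                # Drop both checkpoint markers when gluing the steps.
                "steps": story["steps"][:-1] + tail["steps"][1:],
            })
    return expanded
```

The point is just that the combinatorial expansion could in principle happen once, at data-load time, rather than being something the policies have to reason about.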

Does that make sense for a new feature request? Or am I way off?

Hi

We can play guessing games here, but I would like to add @Ghostvv to this discussion. @Ghostvv, hopefully you can give us some insight into the topic of modularizing stories.

Thanks in advance

@FearsomeTuna It looks like I misunderstood how the memoization policy works. If I use the AugmentedMemoizationPolicy, then the first approach works. Maybe this will help you too!
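For reference, a minimal policies section using it might look like this (Rasa 2.x syntax; the max_history value is just an example to tune):

```yaml
# config.yml
policies:
- name: AugmentedMemoizationPolicy
  max_history: 5
- name: TEDPolicy
```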

Hi @thinkinnet! I am trying the same first approach (separating long stories into smaller parts using common actions), but the stories start breaking. I am currently using the AugmentedMemoizationPolicy with a max_history of 4. Does increasing the max_history help?

I am currently using checkpoints and it works perfectly, but it’s taking too long to train, and various sources also discourage it. By any chance, do you know of other alternatives to checkpoints?

[Edit] When I wrote “story breaking”, I mean that it runs “action_listen” after action D in the first approach. Is it because at the end of story 1 there is no action after action D (which implies “action_listen”) and Rasa trained on that? Is there a way to avoid this problem?

I don’t remember if memoization requires all story steps from the start to match. I’ve seen some people recommend setting max_history to the number of steps in your longest story, since that would be the longest possible sequence that distinctly identifies a story.

I also recommend debugging with the -vv option. That way you can see the tracker state history at each prediction and get a better idea of why the prediction is not what you expect. Memoization requires an exact match, so if you want to rely on it in your design, I suggest getting familiar with how the tracker state works. You can also use the interactive learning tool to assist you with this.
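Assuming a standard Rasa CLI install, that would be:

```shell
rasa shell -vv     # chat with the bot with debug logging, including tracker states per prediction
rasa interactive   # step through a conversation, confirming or correcting each prediction
```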

Hi @FearsomeTuna, I tried to debug and realized that “action_listen” is predicted with confidence 1.0. I’ll try to explore memoization more and use checkpoints in the meantime. Did you manage to modularize parts of your stories?

If you break stories into several parts, you need to consider that stories after the first don’t infer the context implicitly. In other words, the model doesn’t know you intend to “glue” stories together just because the ending and starting actions match. To the model, they’re stories that just happen to line up and could be unrelated. For example, if you set a slot with influence_conversation set to true during story 1, I think story 2 would need to declare those slots as set with slot_was_set to correctly match the tracker state.
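For example (slot and step names are hypothetical), the second fragment would restate the slot state that the first one established:

```yaml
stories:
- story: story2 with restated context
  steps:
  - action: action_D
  - slot_was_set:
    - rebooted: true   # set during story 1; restated here so the tracker state matches
  - intent: intent_E
  - action: action_F
```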

I’m a bit fuzzy on this (I already finished the little project I was working on some time ago), but if I remember correctly, unless you want another action immediately after the last one, action_listen would be the expected prediction if the next step is an intent derived from user input. Unless you actually meant that you got action_listen after intent E in the first approach mentioned by thinkinnet.

Yes, I managed to modularize some stories, but found that unless you’re going to reuse them several times, it may not be worth the trouble. If, say, you’re trying to modularize a story to avoid duplicating a small segment of it two or three times, I’d say it’s not worth it.

Thanks for the advice. However, I believe slot_was_set is an event by itself, and I am not too sure whether stories can branch out based on a slot set a few turns earlier (rather than by the latest turn, which is what’s stated in the documentation). Let me try it out on my side.

If I remember correctly, slot_was_set does not generate an event by itself, in the sense that it doesn’t actually trigger the slot setting when it’s reached. Slot setting is done before that, elsewhere (in custom actions, forms, or automatically during intent classification). slot_was_set is used to unambiguously describe context at any point within a story declaration. If a slot has influence_conversation set to true, then a story declaration should be interpreted in the following way (take this with a big chunk of salt, as I’m trying to remember my experiments):

  • If, up to a particular point within a single YAML story declaration, that slot has not been declared to be set (either via slot_was_set or via the optional entities key that accompanies the intent key, assuming automatic filling is on), then the slot is implicitly assumed to be NOT set, and memoization policies will require that condition when checking whether the current state matches that specific story.
  • If the slot WAS declared to be set within a specific YAML story declaration, then memoization policies will, naturally, require that condition when checking whether the current state matches that specific story.

This means that for augmented memoization:

  • Declarations of a slot being set (either via slot_was_set or via the optional entities key that accompanies the intent key, assuming automatic filling is on) in a specific YAML story do not carry over to other YAML story declarations that happen to follow it during an actual session.
  • The model in general cannot infer the intended state for a particular story from information outside of the explicit YAML story declaration. Not from other stories and not from custom actions.

[edit] Looking at the docs, it seems there’s a new example in the section about creating logical breaks. That may indicate that I’m wrong about what I just said, but I don’t know if that example works because of memoization or because of the TED policy (which would be different cases, since memoization is more predictable). Anyway, it’s enough to make me doubt myself, so I edited the previous explanation to say it should be taken with a big chunk of salt instead of a grain.