What's the future goal of embedding policy?

I have read the embedding policy paper. The model is powerful because of transfer learning. Furthermore, it can roll back to a specific context using the attention mechanism.

However, I am not sure whether this model is necessary to build a goal-oriented dialogue system. A rule-based approach, such as the form policy and the memoization policy, is almost enough to build any goal-oriented dialogue system.

I have checked the training and test data in your open-source repository (GitHub - RasaHQ/conversational-ai-workshop-18: Example showing generalisation). I think all dialogue patterns in this data can be handled by the following rules.

  1. When the AI requests a slot and the user provides it -> the AI requests another slot or performs the final action.
  2. When the AI requests a slot and the user replies with a different message (unhappy path) -> the AI must respond to that message and then request the slot again.
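
As a rough sketch of what I mean (plain Python, not the Rasa API; the slot names, the intent names, and the `next_actions` helper are all made up for illustration), the two rules amount to something like this:

```python
from typing import Dict, List, Optional

# Hypothetical slot list, chosen only for illustration.
REQUIRED_SLOTS = ["cuisine", "location", "people"]


def next_actions(slots: Dict[str, Optional[str]],
                 requested_slot: str,
                 user_intent: str,
                 user_value: Optional[str] = None) -> List[str]:
    """Return the bot's next actions under the two rules above."""
    if user_intent == "inform" and user_value is not None:
        # Rule 1: the user answered the requested slot.
        slots[requested_slot] = user_value
        missing = [s for s in REQUIRED_SLOTS if slots.get(s) is None]
        if missing:
            return [f"request_{missing[0]}"]
        return ["final_action"]
    # Rule 2 (unhappy path): respond to the message, then request the slot again.
    return [f"respond_{user_intent}", f"request_{requested_slot}"]


slots = {"cuisine": None, "location": None, "people": None}
print(next_actions(slots, "cuisine", "inform", "thai"))  # ['request_location']
print(next_actions(slots, "location", "chitchat"))       # ['respond_chitchat', 'request_location']
```

This assumes the user's message has already been classified into an intent and a value; it says nothing about how the real form policy is implemented.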

It is very simple… I do not see why dialogue data is needed to train the embedding policy.

I would like to see an example story that the embedding policy can handle and a rule-based approach cannot.

Thanks in advance :slight_smile:

Hi, I also have a similar question regarding the embedding policy. The form action can also handle slot filling and take care of unhappy paths, given enough unhappy-path stories.

(TL;DR) So my question is: what’s the difference between the embedding policy and the form action? And what’s the goal of the embedding policy?

The results of REDP compared to the LSTM look fantastic, and I am excited to try it out.

Let’s look at the figure V2 of REDP (Attention, Dialogue, and Learning Reusable Patterns). The “with restaurant” plot means that both unhappy and happy restaurant data were present during training. If transfer learning were really taking effect, there should be an improvement even when no “uncooperative hotel dialogues” data is present during training. Am I correct? It seems more like data augmentation than transfer learning.

Thanks in advance.

Many tasks can be solved with carefully written rules; however, this approach does not scale well. We are therefore actively researching new policies that could learn similar rules without explicit programming.

I agree with the goal of building a neural network that can learn from data without explicit programming. However, we need to define an example problem that is difficult to program. The example problems described in the paper (embedding policy) are easy to program, as mentioned above.