I have read the embedding policy paper.
The model looks powerful thanks to transfer learning.
Furthermore, it can roll back to a specific point in the dialogue context using the attention mechanism.
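To check my understanding of that mechanism, here is a toy sketch in plain NumPy (not the paper's actual code; the sizes and values are made up): the encoded current turn scores every past turn, and a sharp softmax effectively copies the most relevant past state forward.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
history = rng.normal(size=(6, 4))              # encoded states of 6 past turns (toy)
query = history[2] + 0.1 * rng.normal(size=4)  # current turn resembles turn 2

scores = history @ query                # similarity of each past turn to the present
weights = softmax(scores / np.sqrt(4))  # scaled dot-product attention weights
context = weights @ history             # weighted recall over the history

print(weights.round(3))  # most mass lands on turn 2: attention "rolls back" there
```

If that picture is right, the attention weights are what let the policy resume a task after an interruption.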
However, I am not sure whether this model is necessary for building a goal-oriented dialogue system.
A rule-based approach, such as the form policy and the memoization policy, seems almost enough to build most goal-oriented dialogue systems.
Hi, I also have a similar question regarding the embedding policy.
A form action can also handle slot filling and take care of unhappy paths, given enough unhappy-path stories.
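For context, this is roughly the kind of form action I have in mind (a minimal sketch against rasa_sdk's FormAction API; the form name, the slots, and the utter_submit template are hypothetical, and exact method signatures vary between SDK versions):

```python
from rasa_sdk.forms import FormAction

class RestaurantForm(FormAction):
    """Deterministic slot filling: prompt for each missing slot in turn."""

    def name(self):
        return "restaurant_form"

    @staticmethod
    def required_slots(tracker):
        # The form keeps asking until every slot listed here is filled,
        # which covers the happy path without any learned policy.
        return ["cuisine", "num_people"]

    def submit(self, dispatcher, tracker, domain):
        # Runs once all required slots are filled.
        dispatcher.utter_message(template="utter_submit")
        return []
```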
(TL;DR) So my questions are:
What’s the difference between the embedding policy and the form action?
And what’s the goal of the embedding policy?
The results of REDP compared to the LSTM baseline look fantastic, and I am excited to try it out.
Let's look at figure V2 of REDP (Attention, Dialogue, and Learning Reusable Patterns).
The "with restaurant" plot means that both unhappy and happy restaurant data were present during training. For transfer learning to take effect, there should have been an improvement even when no "uncooperative hotel dialogues" were present during training. Am I correct? This looks more like data augmentation than transfer learning.
Many tasks can be solved with carefully written rules; however, this approach does not scale well. Therefore, we are actively researching new policies that can learn similar rules without explicit programming.
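As an illustration, a learned policy can sit next to the rule-like policies in the same ensemble. This is only a sketch: the import paths follow Rasa 1.x and have moved between releases, and the hyperparameter values are placeholders.

```python
from rasa.core.agent import Agent
from rasa.core.policies.form_policy import FormPolicy
from rasa.core.policies.memoization import MemoizationPolicy
from rasa.core.policies.embedding_policy import EmbeddingPolicy

# Rule-like policies handle forms and exact story matches; the
# EmbeddingPolicy (the REDP-based policy) generalizes to turns
# that were never memorized.
agent = Agent(
    "domain.yml",
    policies=[
        FormPolicy(),                      # deterministic slot-filling loop
        MemoizationPolicy(max_history=5),  # exact recall of training stories
        EmbeddingPolicy(epochs=100),       # learned, generalizing policy
    ],
)
```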
I also agree with building a neural network that can learn from data without explicit programming. However, we need to define an example problem that is genuinely difficult to program. The example problems described in the paper (embedding policy) are easy to program, as mentioned above.