Hi @cmills
We don’t have good material on this yet, unfortunately, as the feature is still experimental. At a high level, this is how TED handles plain text:
At training time, each example contains either the user text or the user intent, never both, so TED learns to make predictions from either one. To ensure the inference-time distribution matches the training distribution, we run TED with a batch of 2 whenever a new user message comes in:

- one batch example where the last user message is featurized by the intent label from the NLU pipeline, and
- one where it is featurized by its plain text (as in the picture).

The text-based prediction is chosen if and only if its confidence is above some threshold *and* its maximum similarity score is higher than that of the intent-based prediction. See here. Comparing the two directly is fine because both similarities come from exactly the same model. We then store which choice was made, so at the next dialogue step the dialogue history is featurized according to those earlier decisions (intent label or text for each turn), even though both would be available at inference time.
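To make the selection rule concrete, here is a minimal sketch in Python. The function and variable names are purely illustrative, not actual Rasa internals; it only shows the two conditions that must both hold for the text-based prediction to win.

```python
def choose_featurization(text_confidence: float,
                         text_similarity: float,
                         intent_similarity: float,
                         confidence_threshold: float = 0.9) -> str:
    """Hypothetical sketch of TED's inference-time choice between the
    text-based and intent-based batch examples.

    The text-based prediction is used only if BOTH:
      1. its confidence exceeds the threshold, and
      2. its max similarity beats the intent-based max similarity.
    Otherwise we fall back to the intent-based prediction.
    """
    if text_confidence > confidence_threshold and text_similarity > intent_similarity:
        return "text"
    return "intent"


# Text wins: high confidence and higher similarity.
print(choose_featurization(0.95, 0.80, 0.60))  # -> text
# Intent wins: confidence below the threshold.
print(choose_featurization(0.50, 0.80, 0.60))  # -> intent
```

Whichever branch is taken would then be recorded per turn, so that the dialogue history is featurized consistently at later steps.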