End-to-end Training [Experimental]

Hey Rasa @community,

One year ago we wrote that it’s about time we get rid of intents, and how we see a future beyond their limitations. To build level 5 assistants, we believe we should not remain stuck in the mindset that every user message has to fit neatly into one of our predefined intents.

With Rasa Open Source 2.2, we released a new experimental feature called end-to-end training, which allows you to train the dialogue policy directly on user text without separate NLU data. Instead of a two-step process (an NLU prediction followed by a dialogue policy choosing the next action), Rasa can now directly predict the next action the bot should take by looking at the message the user sent.

As you work on an assistant over time to make it more sophisticated, end-to-end learning allows you to keep evolving and improving without being limited by a rigid set of intents. The benefit of this approach is that it makes intents optional.
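To make the difference concrete, here’s a minimal sketch in the Rasa 2.x story format (the greet intent and utter_greet action are illustrative names, not from the post): a classic story refers to an intent label, while an end-to-end story refers to the user’s literal message.

version: "2.0"

stories:
# classic story: the NLU model first classifies the message into an intent
- story: greet via intent
  steps:
  - intent: greet
  - action: utter_greet

# end-to-end story: the policy predicts the next action from the raw user text
- story: greet end to end
  steps:
  - user: "hello there"
  - action: utter_greet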

To get a more in-depth understanding of how end-to-end training works in Rasa Open Source, check out our latest blog post:

It’s important to note that, since this is an experimental feature, we don’t yet have full support across all Rasa features, like interactive learning or Rasa X.

For now, think of end-to-end learning as a feature for advanced teams who want to push the limits of what Rasa Open Source can do. This has been a massive joint effort from our research and engineering teams, and we believe it’s a major piece of the puzzle towards better conversational AI. As we get feedback and learn about how to best use end-to-end in production systems, we’ll build more tooling, provide more examples and docs, and turn this into a feature our whole community can benefit from.

27 Likes

This looks like a really exciting feature! Can’t wait to experiment with this next year!

Regarding the training example towards the end of the “We’re a step closer to getting rid of intents” post:

version: "2.0"

stories:
- story: end to end happy path
  steps:
  - user: "hi"
  - bot: "hi!"
  - user: "I'm looking for a restaurant"
  - bot: "how about Chinese food?"
  - user: "sure"
  - bot: "here's what I found ..."

I was wondering: how do you extract entities like cuisine (Chinese in this case)?

3 Likes

Great progress, this sounds very interesting, especially for handling ambiguous, context-sensitive utterances. One question for now: let’s say I have more than one concrete user utterance that I’d like to use at a specific point in a story as an alternative to an intent. In this case, do I create two (or more) completely separate stories, or can I somehow combine the concrete utterance variants in one single story?

6 Likes

Sounds super exciting. I’m just about to start the design of a new bot next week. It’s not super critical, so would you say that I could best start with this feature right away, rather than first using the ‘traditional way’ and rebuilding later?

exciting

Please take a look at the docs for how to create stories for e2e training: Training Data Format

You can mark entities in user text in the same way you mark entities in the NLU data:

version: "2.0"

stories:
- story: end to end happy path
  steps:
  - user: "hi"
  - bot: "hi!"
  - user: "I'm looking for a restaurant"
  - bot: "how about [Chinese](cuisine) food?"
  - user: "sure"
  - bot: "here's what I found ..."
2 Likes

You need to create as many stories as you have utterance variants. This is why we keep intents: quite often, if you can imagine a lot of such utterances, it is better to create an intent for them.
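For example, to cover two phrasings of the same confirmation, a sketch could look like this (the "sounds good" variant is invented for illustration):

version: "2.0"

stories:
- story: confirm cuisine, variant 1
  steps:
  - user: "I'm looking for a restaurant"
  - bot: "how about Chinese food?"
  - user: "sure"
  - bot: "here's what I found ..."

- story: confirm cuisine, variant 2
  steps:
  - user: "I'm looking for a restaurant"
  - bot: "how about Chinese food?"
  - user: "sounds good"
  - bot: "here's what I found ..."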

3 Likes

I don’t suggest using it in production. I’d recommend doing it the ‘traditional way’. Why would you want to rebuild it later?

Thanks @Ghostvv. At first I didn’t pick up on the fact that intents will still exist alongside it; that’s clear now, and no rebuild to ‘a next version’ is required.

1 Like

Glad to clarify. Intents are very useful, but we expect e2e to be useful as well. We’re trying to find a “mixed” approach (you can have intents and texts in the same story, but in different dialogue turns), in the same sense that we implemented rules so that they work together with ML.

3 Likes

I am wondering if you all have thought of a hybrid approach instead of deprecating intents in later releases. That is, you train the model end-to-end, but use scaffolding, as in the MOSS framework, which trains on all available data (using intent labels as intermediate outputs):

In this way, the user can provide data in the end-to-end format, traditional format, or a mixture.

I see Vladimir is already on this post, but thought I’d loop in @amn41 as well.

3 Likes

As I said above, we’re not deprecating intents; you can have hybrid stories:

stories:
- story: full end-to-end story
  steps:
  - intent: greet
    entities:
    - name: Ivan
  - bot: Hello, a person with a name!
  - intent: search_restaurant
  - action: utter_suggest_cuisine
  - user: I can always go for [sushi](cuisine)
  - bot: Personally, I prefer pizza, but sure let's search sushi restaurants
  - action: utter_suggest_cuisine
  - user: Have a beautiful day!
  - action: utter_goodbye
1 Like

In a sense, our current approach with intents is a modular supervision approach.

2 Likes

With our hybrid e2e approach, we try to solve the problem where intermediate labels are unknown, so there is no supervision signal from intent labels at all. And we implemented it so that you can have all the different mixtures.

1 Like

What is the meaning of “user turns”?

When are you planning to make this usable in practice?

A “user turn” is a dialogue turn that contains an input from the user.
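In story terms, every user: or intent: step is a user turn, while bot: and action: steps belong to bot turns. A minimal sketch, reusing names from the hybrid story above:

steps:
- user: "hi"                     # user turn, given as raw text
- bot: "hi!"                     # bot turn
- intent: search_restaurant      # user turn, given as an intent label
- action: utter_suggest_cuisine  # bot turn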

What do you mean? End-to-end training was released in Rasa Open Source 2.2.

1 Like

Great!