End-to-end Training [Experimental]

Hi there. If one builds a bot entirely from pre-existing live chat messages, would it make sense to:

  1. “Clean” these conversations (remove redundancies, unhelpful paths, etc.)
  2. Use e2e training only
  3. Introduce intents later, and only if that saves significant training time?

Thank you!
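For what it's worth, steps 1 and 2 above could be sketched roughly like this. It is only a minimal sketch: the speaker labels ("customer", "agent") and both helper names are made up for illustration, and "cleaning" here just means dropping empty messages and immediate repeats. It assumes the e2e story format with `user:` and `bot:` steps, so please check that against the docs for your Rasa version.

```python
import json


def clean_conversation(turns):
    """Drop empty messages and immediate repeats of the same (speaker, text) pair."""
    cleaned = []
    for speaker, text in turns:
        text = " ".join(text.split())  # collapse runs of whitespace
        if not text:
            continue  # skip empty messages
        if cleaned and cleaned[-1] == (speaker, text):
            continue  # skip an exact repeat of the previous turn
        cleaned.append((speaker, text))
    return cleaned


def to_e2e_story(name, turns):
    """Render one conversation as an end-to-end story (assumed `user:`/`bot:` keys)."""
    lines = [f"- story: {json.dumps(name)}", "  steps:"]
    for speaker, text in turns:
        # "customer" is a hypothetical label for the human user in the chat log
        key = "user" if speaker == "customer" else "bot"
        # a JSON string is also a valid double-quoted YAML scalar
        lines.append(f"  - {key}: {json.dumps(text, ensure_ascii=False)}")
    return "\n".join(lines)


chat = [
    ("customer", "my order is late"),
    ("customer", "my order is late"),  # duplicate, will be dropped
    ("agent", "   "),                  # empty, will be dropped
    ("agent", "Sorry to hear that!"),
]
print(to_e2e_story("order late", clean_conversation(chat)))
```

Real chat logs will need more than this (removing small talk, splitting long sessions, anonymising), but the idea is that the cleaned turns map one-to-one onto story steps without any intent labels.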

I have 400 pure e2e stories from historical conversation data. I am getting an OOM error while training Core (rasa train --augmentation 0). I am using one GPU with 12 GB of RAM and this config.yml:

pipeline:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 100
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 100
policies:
- name: TEDPolicy
  epochs: 10
  max_history: 5
- name: RulePolicy

I have tried changing batch_size, but nothing helped.

PS: It works fine for a smaller data set. It worked fine with 50 e2e stories.

@vikrant67 Were you able to resolve this challenge with getting your OOM error? I’m interested in trying a similar experiment, and would love to learn from your experience so far.

Tom

I was trying out this feature, but got an error during the training stage of Rasa Core:

  File "/media/dingusagar/rasa_2_8/lib/python3.7/site-packages/rasa/core/featurizers/tracker_featurizers.py", line 960, in <listcomp>
[domain.intents.index(intent) for intent in tracker_intents] 
ValueError: 'my order is late' is not in list

My stories look like this:

- story: Order late
  steps:
  - user: "my order is late"
  - action: utter_sorry_to_hear

I am using Rasa 2.8, and the config is the default one for Rasa 2.8. From the error, it looks like the user utterance is being treated as an intent, and Rasa is complaining that no such intent exists. Could someone help me understand what I am doing wrong?
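To illustrate where that message comes from: the traceback line looks up each tracker intent in the list of domain intents, and Python's list.index raises a ValueError when the value is missing. Below is a simplified stand-in, not Rasa's actual code; the function name and the example intents are made up for illustration.

```python
def intent_indices(domain_intents, tracker_intents):
    """Mimic the list comprehension from the traceback (simplified stand-in)."""
    return [domain_intents.index(intent) for intent in tracker_intents]


# Hypothetical domain with two declared intents
domain_intents = ["greet", "order_status"]

# A declared intent is found at its position in the list
print(intent_indices(domain_intents, ["greet"]))  # [0]

# Raw user text is not a declared intent, so the lookup fails
try:
    intent_indices(domain_intents, ["my order is late"])
except ValueError as err:
    print(f"ValueError: {err}")
```

So the error is consistent with the raw text of an e2e `user:` step reaching a component that expects an intent name.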

What does your config.yml look like?

The config is the default one in Rasa 2.8; I made no changes to it.

language: en

pipeline:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 100
  constrain_similarities: true
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 100
  constrain_similarities: true
- name: FallbackClassifier
  threshold: 0.3
  ambiguity_threshold: 0.1

policies:
- name: MemoizationPolicy
- name: RulePolicy
- name: UnexpecTEDIntentPolicy
  max_history: 5
  epochs: 100
- name: TEDPolicy
  max_history: 5
  epochs: 100
  constrain_similarities: true

@dingusagar Let’s deal with your question in this separate thread.

@tatianaf @inthematrix @sebastian @Nasnl @aymen @kearnsw @DeqianBai @Rajendra9 @Aspirinkb @mikeymms @vikrant67 I’d be interested in hearing about your experiences so far. In particular, what kinds of “conversation phenomena” do you struggle to solve with intents where e2e is useful? (e.g. “multi-intents”, “sarcasm”, “long user inputs”, etc.)

For me, the huge potential of e2e is the ability to use large historical logs of human-to-human chat conversations as a basis for training.

This is really interesting and has huge potential, especially combined with “power” annotation tools like Prodigy from Explosion.ai (https://explosion.ai/software#prodigy).

Has anyone experimented with this feature successfully? My main concern about e2e training is controllability: the model may fail to learn how to handle various flows, especially complicated dialogues with multiple branches and subflows that depend on slot values and the like.

Let’s say I have more than one concrete user utterance I would like to use in story at a specific point as an alternative to an intent.

I think end-to-end training is a great concept, as classifying user utterances as intents and dialogues as stories is a hard job, especially when you’ve got a huge mass of real dialogues with a lot of variation. In my opinion, Rasa’s strength is the ability to train dialogue models on real dialogues, not only to train the NLU!

I also believe end-to-end training could be a solution to the multi-intent problem and to user utterances whose meaning depends on the context in the dialogue history.

To apply CDD and stay in control, an SME needs a tool like (the former) Rasa X to manage training and test data.

So I’m wondering when Rasa Enterprise is going to support end-to-end training. This should include: talking to the bot with mixed stories (all combinations of {intents, responses/actions, end-to-end}), analysing and annotating them, and saving them as either training stories or test stories.

I’ve been experimenting with Rasa Open Source 3.1.0 and Rasa X Community 1.1.0, but could not get the end-to-end stories into Rasa X. As far as I can see (and have read in posts and docs):

  1. It’s still not possible to put end-to-end training data into Rasa X, neither by hand nor via Talk-To-Your-Bot.
  2. A model that is trained on end-to-end data in Rasa Open Source and uploaded to Rasa X throws errors when activated in Rasa X.
  3. Of course, even if a trained model could be run by Rasa X, it would not be of much use, as we still couldn’t do CDD on it in Rasa X!

Note: I’m aware of Alan’s news about discontinuing the Rasa X community edition.
Apart from this news, my organisation is still in an exploratory phase regarding Conversational Agents and choosing a platform.
As I’m very excited about both Rasa Open Source and Rasa X, Rasa Enterprise could be a candidate.
So the answer to this question is of great importance to our decision making.

I have a problem when using end-to-end models like this.

What’s wrong?

Is this feature still being invested in?

Still wondering when this feature will be available in #Rasa #Enterprise.

And if it is going to beat other CAI platforms!

@amn41 Do you have the answer?