Training was completely overhauled after a few minor changes in actions.py and a version upgrade

Not sure if I’m reading too much into it, but I’d like to be sure.

I have the following setup for a bot that I’m working on -

  1. On my local version, I start with a user utterance and try to carry out an entire conversation.
  2. On the production version, the first user utterance is passed to the Rasa bot after triggering a default intent which is followed by an action that initializes the required slots with the metadata. I use the triggerConversationIntent API call to trigger this default intent.
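
For readers unfamiliar with that call: triggerConversationIntent appears to correspond to the Rasa HTTP API endpoint POST /conversations/{conversation_id}/trigger_intent. Below is a minimal sketch of how such a request might be built, not the actual production code. The server URL, conversation ID, intent name (init_conversation) and entity values are all hypothetical; the entity names userId and saleId are borrowed from slot names that appear in the debug log later in this thread, and passing the metadata as entities is itself an assumption. The request is only constructed here, not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_trigger_intent_request(base_url, conversation_id, intent_name, entities):
    """Build (but do not send) a POST request for the trigger_intent endpoint."""
    url = "{}/conversations/{}/trigger_intent".format(
        base_url.rstrip("/"), quote(conversation_id, safe="")
    )
    # The endpoint expects the intent name plus an optional entity mapping.
    body = json.dumps({"name": intent_name, "entities": entities}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_trigger_intent_request(
    base_url="http://localhost:5005",
    conversation_id="demo-conversation",
    intent_name="init_conversation",
    entities={"userId": "u-123", "saleId": "s-456"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the result to urllib.request.urlopen against a running Rasa server would actually send it. The triggered intent then goes through the same policy ensemble as any other intent, so it is subject to the same fallback behaviour.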

Both versions worked well and responded well to various kinds of input. There are more than 450 NLU examples classified under 17 intents, and about 50 stories, mostly tiny variations on a common script.

That is, until a recent change seemed to completely overhaul everything the bot had learned.

My recent changes include -

  1. Upgrade the bot from rasa and rasa-sdk 1.7.0 to 1.10.0
  2. Update the config file from the old default Rasa pipeline using pretrained word embeddings to the new one using ConveRT and DIETClassifier.
  3. Update the policy to use TED instead of Keras.
  4. Add a dispatcher.utter_message line to the fallback function.

Nothing else changed. My core threshold for the fallback was 0.3.

I then retrained the bot locally and verified that it still worked fine. However, when I pushed this version to production, it started to trigger the fallback action about 90-95% of the time, always with the same confidence of 0.3 (which I thought was interesting). Even if I trigger the default intent through the API call with metadata, it triggers the fallback action. I also tried to explicitly map the default intent to the action that loads the slots with the metadata, using the MappingPolicy, but to no avail.

I tried to recreate the production version locally, where I now explicitly pass the metadata through an utterance. And sure enough, 90-95% of all intents were followed by the same fallback action with the same confidence of 0.3 and every other action was set to 0.0.
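
A note on the constant 0.3: it matches the core_threshold in the config below. The following is a simplified sketch of how the Rasa 1.x FallbackPolicy arrives at its confidence and how the ensemble then picks a winner. It is not the library's code. The function names are made up for illustration, the TEDPolicy score is a hypothetical value, and the sketch omits tie-breaking by policy priority and the ambiguity threshold.

```python
def fallback_confidence(nlu_confidence, nlu_threshold=0.6, core_threshold=0.3):
    """Confidence the fallback policy assigns to the fallback action (simplified)."""
    if nlu_confidence < nlu_threshold:
        # NLU is unsure about the intent: fall back with full confidence.
        return 1.0
    # NLU is fine: offer the fallback action at exactly core_threshold,
    # so it only wins if every other policy scores lower than that.
    return core_threshold


def winning_policy(confidences):
    """Pick the policy with the highest confidence (priority tie-breaks omitted)."""
    return max(confidences, key=confidences.get)


# Hypothetical scores for one turn with a confidently classified intent.
scores = {
    "MemoizationPolicy": 0.0,  # no memorised next action
    "TEDPolicy": 0.12,         # hypothetical low-confidence prediction
    "FallbackPolicy": fallback_confidence(nlu_confidence=0.95),
}
print(winning_policy(scores), scores["FallbackPolicy"])
```

Under these assumptions, a fallback at exactly 0.3 would mean the intent passed the NLU threshold but no other policy predicted any action with more than 0.3 confidence. That would point at the Core policies rather than at NLU.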

Can someone please help me with what I’m doing wrong? I’m honestly worried that a simple change like this has completely undone 5 months of training.

PS - It would also be nice if I can manually trigger the default intent with specific slots inside Rasa X. (cc @tyd)

For reference, here’s my config.yml, which has not changed since the version upgrade:

# Current pipeline configuration for the NLU and rasa core
language: "en"

pipeline:
  - name: ConveRTTokenizer
    intent_tokenization_flag: True
    # Set to true to spot multiple intents
    case_sensitive: False
    intent_split_symbol: "_&_"
  - name: ConveRTFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
  - name: MappingPolicy
  - name: FormPolicy
  - name: FallbackPolicy
    nlu_threshold: 0.6
    core_threshold: 0.3
    fallback_action_name: "r_fallback_action"

More updates on this one.

Even for basic intents like “greet” which come out of the box, the bot is now falling back.

Hi @ganeshv. This is interesting. One thing I’ve noticed is that you have quite a high NLU threshold. Does the performance of your assistant improve if you set it to a lower number?

It would also be useful to see the output of your model in debug mode, even for simple inputs like greetings. Could you share the debug logs of a simple conversation?

Hello @Juste! Thank you for your reply. My assumption was that the NLU isn’t a problem per se; it’s recognizing the right intents. It’s just stubbornly going to fallback every time.

I’ve temporarily reset it back to the 1.7.0 version so that it works with my production environment. Let me get back to you with the logs of the model in debug mode as it is running in production.

Hello @Juste; thank you for your patience. I was away from work for more than a month due to COVID-related reasons.

I have an update for you from running the bot in debug mode. I tried both the 1.10.0 and the 2.0.0a1 full versions, and both gave me the following info in the log. The behavior was no different from my initial description, and the fallback was very odd, considering that I’ve trained the bot on these exact steps in the stories.md file more than 10 times.

To recap, these stories worked well in 1.7.0, but the upgraded bot seems to ignore the paths laid out in the stories and falls back at every opportunity.

Your NLU model classified 'No thank you' with intent 'deny' and there are no entities, is this correct?  Yes

2020-07-30 11:07:32 DEBUG    rasa.core.policies.memoization  - Current tracker state [{'slot_currentUrl_0': 1.0, 'prev_action_listen': 1.0, 'slot_userId_0': 1.0, 'slot_proposedAppointmentDateTimeHuman_0': 1.0, 'slot_numSeats_0': 1.0, 'slot_saleId_0': 1.0, 'slot_preferredAppointmentDateTime_0': 1.0, 'intent_affirm': 1.0}, {'slot_currentUrl_0': 1.0, 'slot_userId_0': 1.0, 'slot_proposedAppointmentDateTimeHuman_0': 1.0, 'prev_confirm_appointment': 1.0, 'slot_numSeats_0': 1.0, 'slot_saleId_0': 1.0, 'slot_preferredAppointmentDateTime_0': 1.0, 'intent_affirm': 1.0}, {'slot_currentUrl_0': 1.0, 'slot_userId_0': 1.0, 'slot_proposedAppointmentDateTimeHuman_0': 1.0, 'slot_numSeats_0': 1.0, 'slot_saleId_0': 1.0, 'prev_confirm_appointment_receipt': 1.0, 'slot_preferredAppointmentDateTime_0': 1.0, 'intent_affirm': 1.0}, {'slot_currentUrl_0': 1.0, 'slot_userId_0': 1.0, 'slot_proposedAppointmentDateTimeHuman_0': 1.0, 'slot_numSeats_0': 1.0, 'slot_saleId_0': 1.0, 'slot_preferredAppointmentDateTime_0': 1.0, 'intent_affirm': 1.0, 'prev_r_anything_else': 1.0}, {'slot_currentUrl_0': 1.0, 'prev_action_listen': 1.0, 'intent_deny': 1.0, 'slot_userId_0': 1.0, 'slot_proposedAppointmentDateTimeHuman_0': 1.0, 'slot_numSeats_0': 1.0, 'slot_saleId_0': 1.0, 'slot_preferredAppointmentDateTime_0': 1.0}]
2020-07-30 11:07:32 DEBUG    rasa.core.policies.memoization  - There is no memorised next action
2020-07-30 11:07:32 DEBUG    rasa.core.policies.form_policy  - There is no active form
2020-07-30 11:07:32 DEBUG    rasa.core.policies.fallback  - NLU confidence threshold met, confidence of fallback action set to core threshold (0.3).
2020-07-30 11:07:32 DEBUG    rasa.core.policies.ensemble  - Predicted next action using policy_4_FallbackPolicy
2020-07-30 11:07:32 DEBUG    rasa.core.tracker_store  - Recreating tracker for id 'ff09fd7752c241ccbc5ce9b92c9e10d4'
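
For context on the "There is no memorised next action" line: a memoization policy looks up the most recent featurized tracker states in a table built from the training stories, and it only predicts when the states match exactly. The sketch below is a toy illustration of that lookup, not Rasa's implementation. The helper names and the utter_goodbye action are hypothetical, and the slot feature is copied from the log above only as an example. Whether a mismatch like this is what is happening here cannot be determined from the log alone.

```python
import json


def make_key(states, max_history=5):
    """Canonical lookup key for the last `max_history` tracker states."""
    window = states[-max_history:]
    return json.dumps([sorted(state.items()) for state in window])


def memorise(lookup, states, next_action):
    """Record the action that followed these states in a training story."""
    lookup[make_key(states)] = next_action


def recall(lookup, states):
    """Return the memorised action, or None if the states were never seen."""
    return lookup.get(make_key(states))


lookup = {}
# States as a (hypothetical) training story would produce them.
memorise(lookup, [{"prev_action_listen": 1.0, "intent_deny": 1.0}], "utter_goodbye")

# Same turn at runtime: exact match, so the action is recalled.
same = [{"prev_action_listen": 1.0, "intent_deny": 1.0}]
# Same turn, but a slot is set that the story never set: no match.
extra_slot = [{"prev_action_listen": 1.0, "intent_deny": 1.0, "slot_userId_0": 1.0}]

print(recall(lookup, same))
print(recall(lookup, extra_slot))
```

In this toy version, a single extra or missing feature in any state of the window is enough to miss. When that happens, the prediction is left to the other policies in the ensemble.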

Let me know if you need anything else.

@ganeshv Would it be possible to share NLU data here so that I could try reproducing the project?

Hello @Juste, please give me until early next week to respond. Our training files contain some confidential information, and I’m discussing internally with my team how to share them with you. Thanks in advance!

Hello @Juste - would it be possible for you to join me on a 15-minute call later this week or early next week to help debug this? I can be flexible about any timezone difference. Please let me know. Many thanks in advance!