Rasa stops working properly when we extend existing flows

We are running into many issues while extending existing flows, whether we modify them or add new features. @stephens, please help. This is going to be a long but critical post.

The most frequent error we face is the "circuit breaker tripped" error. The other is the case where the next predicted action is action_default_fallback. Both occur while we add functionality to existing flows. Because our project is large, we have to test all the flows every time a small change is made to any one of them, which adds to the development time of a feature on top of the time taken to train the Rasa models.

Following is an example where incremental changes affected our existing flows. We spent close to 3 weeks debugging the issues and testing on different virtual environments and servers.

- story: Global outage check | INW
  steps:
  - or:
    - intent: connection_problem
    - intent: router_options
    - intent: change_wifi_name_or_password
    - intent: internet_auto_login
    - intent: change_router_password
    - intent: new_router_config
    - intent: forgot_wifi_password
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
    - is_global_outage_action_check: true
    - is_global_outage_continue: null
  - action: utter_global_outage_confirmation
  - intent: affirm
  - action: action_global_outage_continue

- story: Global outage check | Deny
  steps:
  - or:
    - intent: connection_problem
    - intent: router_options
    - intent: change_wifi_name_or_password
    - intent: internet_auto_login
    - intent: change_router_password
    - intent: new_router_config
    - intent: forgot_wifi_password
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
    - is_global_outage_action_check: true
    - is_global_outage_continue: null
  - action: utter_global_outage_confirmation
  - intent: deny
  - action: utter_global_outage_msg
  - action: action_utter_flow_feedback_form

These stories tell the customer whether there is a global outage. If there is none, the user is taken directly to one of the flows that depend on the internet connection, such as 'Internet Not Working'. If there is one, we ask whether the customer still wants to continue with the query or register a complaint despite the outage.

The global outage message is now fetched from a MongoDB instance, which is a recent change. We are not sure whether this is causing the issue.

from typing import Any, Text, Dict, List
from rasa_sdk import Action, Tracker
from rasa_sdk.events import SessionStarted, ActionExecuted, SlotSet, UserUttered, FollowupAction
import requests
import json, xmltodict
from db import ReferMessageModel

class GlobalOutageCheck(Action):
        
    def name(self) -> Text:
        return "action_global_outage_check"

    async def run(
        self, dispatcher, tracker: Tracker, domain: Dict[Text, Any]
    ) -> List[Dict[Text, Any]]:

        user_id = tracker.get_slot("user_id")
        account_number = tracker.get_slot("accountno")
        city = tracker.get_slot("city")
        mobile = tracker.get_slot("cellular_phone_num")
        global_outage_continue_slot = tracker.get_slot("is_global_outage_continue")
        global_outage_check_slot = tracker.get_slot("is_global_outage_action_check")

        # if global_outage_continue_slot is None and global_outage_check_slot is None:

        slots = []
        slots.append(SlotSet('inw_roucon_trigger_intent_name', tracker.latest_message["intent"]["name"]))

        inw_list = ['connection_problem','slow_site','slow_speed']
        router_config_list = ['router_config','change_wifi_name_or_password','wifi_name_and_password','internet_auto_login']

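        # Guard against a missing city slot or an empty DB result before indexing.
        global_outage_city_list = list(ReferMessageModel.list(city=city.upper())) if city else []
        warning_msg = global_outage_city_list[0]['Warning_message'] if global_outage_city_list else None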

        if warning_msg != '' and warning_msg is not None:
            
            slots.extend([SlotSet('is_global_outage_action_check', True),
                            SlotSet('is_global_outage_continue', None),
                            SlotSet('global_outage_msg', warning_msg)])
            
            # return slots
            return slots + [ActionExecuted("action_listen")] + [UserUttered("/" + tracker.latest_message["intent"]["name"], {
                "intent": {"name": tracker.latest_message["intent"]["name"], "confidence": 1.0},
                "entities": []
            })]
        
        else:
            
            slots.append(SlotSet('is_global_outage_action_check', True))
            slots.append(SlotSet('is_global_outage_continue', True))
            return slots
            # if tracker.latest_message["intent"]["name"] in inw_list:
            #     return slots+self.trigger_next_action("action_inw_ticket_check")
            # elif tracker.latest_message["intent"]["name"] in router_config_list:
            #     return slots+self.trigger_next_action("action_config_ticket_check")
            # else:
            #     return slots
        # else:
        #     return []
    
    def trigger_next_action(self, followup_action):
        
        return [ActionExecuted("action_listen")] + [FollowupAction(followup_action)]

class GlobalOutageContinue(Action):
        
    def name(self) -> Text:
        return "action_global_outage_continue"

    async def run(
        self, dispatcher, tracker: Tracker, domain: Dict[Text, Any]
    ) -> List[Dict[Text, Any]]:

        last_intent = tracker.get_slot("inw_roucon_trigger_intent_name")
        global_outage_continue_slot = tracker.get_slot("is_global_outage_continue")
        trigger_intent_names = tracker.get_slot("trigger_intent_names")

        slots = []

        inw_list = ['connection_problem']
        router_config_list = ["change_router_password","new_router_config","forgot_wifi_password","change_wifi_name_or_password","internet_auto_login"]
        
        if global_outage_continue_slot is None:
            slots.append(SlotSet('is_global_outage_continue', True))
            # return slots
            if trigger_intent_names is None:
                trigger_intent_names = {
                    "router_form_trigger_intent": last_intent
                }
            else:
                trigger_intent_names["router_form_trigger_intent"] = last_intent
            
            slots.append(SlotSet("trigger_intent_names", trigger_intent_names))
            if last_intent in inw_list:
                
                return slots+self.trigger_next_action("action_inw_ticket_check")
                # return slots+[ActionExecuted("action_listen")] + [UserUttered("/connection_problem", {
                #     "intent": {"name": "connection_problem", "confidence": 1.0},
                #     "entities": []
                # })]
            elif last_intent in router_config_list:
                return slots+self.trigger_next_action("action_config_ticket_check")
                # return slots+[ActionExecuted("action_listen")] + [UserUttered("/router_config", {
                #     "intent": {"name": "router_config", "confidence": 1.0},
                #     "entities": []
                # })]
            else:
                return slots
        else:
            slots.append(SlotSet('is_global_outage_continue', True))
            return slots
    
    def trigger_next_action(self, followup_action):
        
        return [ActionExecuted("action_listen")] + [FollowupAction(followup_action)]

In most cases, the chatbot shows the global outage message, configured from the external DB. This is the utterance for it:

  utter_global_outage_confirmation:
  - buttons:
    - payload: /affirm
      title: Yes
    - payload: /deny
      title: No
    text: "{global_outage_msg}"

The problem is that after clicking 'Yes', the sob story begins.

In the action server, we see that action_global_outage_check is predicted multiple times in a row. We also see the message "Circuit breaker tripped. Stopped predicting more actions for…" in yellow on the Rasa model server.
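One way to confirm which action is looping is to count the executed actions in the conversation tracker. The following is a minimal sketch, assuming the tracker events were obtained as a list of dicts (for example from the Rasa HTTP API endpoint `GET /conversations/{conversation_id}/tracker`). The helper name `action_counts` and the sample events are made up for illustration and are not real output from this bot.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def action_counts(events: List[Dict[str, Any]]) -> List[Tuple[str, int]]:
    """Count executed actions in a tracker event list, most frequent first."""
    names = [e.get("name", "") for e in events if e.get("event") == "action"]
    return Counter(names).most_common()


# Hand-written sample events (an assumption, not a real tracker dump).
sample_events = [
    {"event": "user", "text": "/connection_problem"},
    {"event": "action", "name": "action_global_outage_check"},
    {"event": "slot", "name": "is_global_outage_action_check", "value": True},
    {"event": "action", "name": "action_listen"},
    {"event": "user", "text": "/connection_problem"},
    {"event": "action", "name": "action_global_outage_check"},
    {"event": "action", "name": "action_listen"},
    {"event": "user", "text": "/connection_problem"},
    {"event": "action", "name": "action_global_outage_check"},
]

print(action_counts(sample_events))
# [('action_global_outage_check', 3), ('action_listen', 2)]
```

If one custom action dominates the counts within a single user turn, that action and the events it returns are a reasonable place to start looking.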

In other cases, clicking on "Router options" or some other button results in a custom_fallback response.

We made these story changes in another yml file, as part of the business logic change:

- story: Internet not working | modified true | affirm | new | cupu_ontol
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: modified
  # - action: action_inw_ticket_check
  - action: utter_reopen_ticket_prompt
  - intent: affirm
  - action: action_inw_reopen_ticket
  - slot_was_set:
    - inw_reopen_tkt_srno: SRnumber
  - action: action_session_nhv
  - slot_was_set:
    - nhv_status: cupu_ontol
  - action: utter_inw_ticket_reopened
  - action: action_perform_live_agent_handover_thd

- story: Internet not working | modified true | affirm | new | nhv not_cupu_ontol
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: modified
  # - action: action_inw_ticket_check
  - action: utter_reopen_ticket_prompt
  - intent: affirm
  - action: action_inw_reopen_ticket
  - slot_was_set:
    - inw_reopen_tkt_srno: SRnumber
  - action: action_session_nhv
  - slot_was_set:
    - nhv_status: not_cupu_ontol
  - action: utter_inw_ticket_reopened
  - action: action_utter_flow_feedback_form

- story: Internet not working | modified true | affirm | new | nhv cupd_ontof
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: modified
  # - action: action_inw_ticket_check
  - action: utter_reopen_ticket_prompt
  - intent: affirm
  - action: action_inw_reopen_ticket
  - slot_was_set:
    - inw_reopen_tkt_srno: SRnumber
  - action: action_session_nhv
  - slot_was_set:
    - nhv_status: cupd_ontof
  - action: utter_inw_ticket_reopened
  - action: action_utter_flow_feedback_form  

- story: Internet not working | modified true | affirm | new | nhv api_error
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: modified
  # - action: action_inw_ticket_check
  - action: utter_reopen_ticket_prompt
  - intent: affirm
  - action: action_inw_reopen_ticket
  - slot_was_set:
    - inw_reopen_tkt_srno: SRnumber
  - action: action_session_nhv
  - slot_was_set:
    - nhv_status: api_error
  - action: utter_inw_ticket_reopened
  - action: action_utter_flow_feedback_form

- story: Internet not working | modified true | deny
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: modified
  # - action: action_inw_ticket_check
  - action: utter_reopen_ticket_prompt
  - intent: deny
  # - action: utter_ask_restart
  - action: action_utter_flow_feedback_form

Here are some additional pre-existing stories which depend on the global outage feature, to provide context:

version: "2.0"
stories:
- story: Internet not working | Happy path CUPU ONTOL
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: no_ticket
  - action: action_session_nhv
  - slot_was_set:
    - nhv_status: cupu_ontol
    - flag: auto_login_flag
  - action: action_utter_flag_one_confirmation
  - intent: affirm
  - action: action_flag_one
  - intent: affirm
  - action: utter_issue_resolved
  - action: action_inw_auto_login_flag_ftr
  - action: action_utter_flow_feedback_form

- story: Internet not working | Happy path CUPU ONTOL issue resolved deny part
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: no_ticket
  - action: action_session_nhv
  - slot_was_set:
    - nhv_status: cupu_ontol
    - flag: auto_login_flag
  - action: action_utter_flag_one_confirmation
  - intent: affirm
  - action: action_flag_one
  - intent: deny
  - slot_was_set:
    - ticket_user_input: null
  - action: utter_enter_comments
  - action: ticket_user_input_form
  - active_loop: ticket_user_input_form
  - active_loop: null
  - action: action_home_redirect
  - slot_was_set:
    - ticket_user_input: test issue
  - action: action_create_element_down_sr
  - action: action_perform_live_agent_handover_thd

- story: Internet not working | Happy path CUPU ONTOL | Router Form affirm
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: no_ticket
  - action: action_session_nhv
  - slot_was_set:
    - nhv_status: cupu_ontol
    - flag: auto_login_flag
  - action: action_utter_flag_one_confirmation
  - intent: deny
  # - action: utter_make_and_model
  - action: action_utter_router_form
  - action: router_form
  - active_loop: router_form
  - active_loop: null
  - action: action_home_redirect
  - action: action_utter_router_configuration_guide
  - action: utter_query_resolve_question
  # - checkpoint: trigger_router_form
  - intent: affirm
  - action: utter_issue_resolved
  - action: action_inw_auto_login_flag_ftr
  - action: action_utter_flow_feedback_form

- story: Internet not working | Happy path CUPU ONTOL | Router Form deny
  steps:
  - intent: connection_problem
  - slot_was_set:
    - is_authenticated: true
    - is_activated: Y
  - action: action_global_outage_check
  - slot_was_set:
    - is_global_outage_action_check: true
    - is_global_outage_continue: true
  - action: action_global_outage_continue
  - action: action_inw_ticket_check
  - slot_was_set:
    - ticket_details: no_ticket
  - action: action_session_nhv
  - slot_was_set:
    - nhv_status: cupu_ontol
    - flag: auto_login_flag
  - action: action_utter_flag_one_confirmation
  - intent: deny
  # - action: utter_make_and_model
  - action: action_utter_router_form
  - action: router_form
  - active_loop: router_form
  - active_loop: null
  - action: action_home_redirect
  - action: action_utter_router_configuration_guide
  - action: utter_query_resolve_question
  - intent: deny
  - slot_was_set:
    - ticket_user_input: null
  - action: utter_enter_comments
  - action: ticket_user_input_form
  - active_loop: ticket_user_input_form
  - active_loop: null
  - action: action_home_redirect
  - slot_was_set:
    - ticket_user_input: test issue
  - action: action_create_router_config_sr
  - action: action_perform_live_agent_handover_thd

We have not been able to pinpoint the cause of the chatbot's malfunction, even after checking on multiple virtual environments and servers. We even tried commenting out the stories most likely to be the problem, but the issues remained.

More information:

> Rasa Version      :         2.8.17
> Minimum Compatible Version: 2.8.9
> Rasa SDK Version  :         2.8.4
> Rasa X Version    :         None
> Python Version    :         3.7.9
> Operating System  :         Linux-3.10.0-862.el7.x86_64-x86_64-with-centos-7.5.1804-Core

Please let us know what else we need to try, or if you want more information to help you come up with a solution.

Regards

I'm not going to try to analyze everything here since that would take too long, but I'll highlight some possible issues.

Long stories and many featurized slots are not a good idea. I would refactor the bot: use rules, forms and unfeaturized slots first.

I would also upgrade Rasa; 2.8.17 is getting old.

What is the right way to handle large stories? We need to write the business logic that way and there are branching out stories according to it. If we should not rely on many slots to control the logic flow, what would be the best way to write rules and stories?

> What is the right way to handle large stories?

Forms

> We need to write the business logic that way and there are branching out stories according to it

Use a form for each key flow and if there is a major change in the flow, you can switch from one form to another. I posted an example similar to this here.
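As a rough illustration of switching into a form per flow, here is a minimal sketch of a lookup that maps the triggering intent to a form and returns a followup event for it. The intent names come from the action code earlier in the thread and `router_form` appears in the stories, but `inw_form`, `FORM_FOR_INTENT` and `followup_events` are hypothetical names. The returned dict is meant to mirror what `rasa_sdk.events.FollowupAction(name)` produces, without the timestamp; treat that as an assumption to verify against your SDK version.

```python
from typing import Any, Dict, List, Optional

# "inw_form" is a hypothetical form name; "router_form" appears in the stories.
FORM_FOR_INTENT: Dict[str, str] = {
    "connection_problem": "inw_form",
    "change_router_password": "router_form",
    "new_router_config": "router_form",
    "forgot_wifi_password": "router_form",
    "change_wifi_name_or_password": "router_form",
    "internet_auto_login": "router_form",
}


def followup_events(trigger_intent: Optional[str]) -> List[Dict[str, Any]]:
    """Return a followup event that starts the form for the given intent.

    An empty list means there is no form for this intent, so the next
    step is left to the policies.
    """
    form_name = FORM_FOR_INTENT.get(trigger_intent or "")
    if form_name is None:
        return []
    # Assumed to match the shape of FollowupAction(form_name), minus the timestamp.
    return [{"event": "followup", "name": form_name}]


print(followup_events("connection_problem"))
# [{'event': 'followup', 'name': 'inw_form'}]
print(followup_events("affirm"))
# []
```

In a custom action, the same lookup could decide which form to hand over to, keeping the branching in one table instead of spreading it across many stories.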

> If we should not rely on many slots to control the logic flow, what would be the best way to write rules and stories?

I'm not recommending that you don't use slots. I'm saying to avoid "featurized slots", which means don't use influence_conversation: true unless you have a very good reason to do so.

For new bots, I set all slots to influence_conversation: false and only change it to true if I really need the slot to control the flow in a rule or story. When using forms, you can control the flow in the form without a featurized slot.
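To see how many slots in an existing bot are currently featurized, a small audit helper can be useful. This is a minimal sketch, assuming the domain has already been loaded into a Python dict in the Rasa 2.x layout, where a slot influences the conversation unless `influence_conversation` is explicitly false and slots of type `any` are never featurized. The slot names are taken from the thread, but their types and settings below are invented for the example, and `featurized_slots` is a hypothetical helper.

```python
from typing import Any, Dict, List


def featurized_slots(domain: Dict[str, Any]) -> List[str]:
    """List the slots that influence conversation in a Rasa 2.x style domain dict."""
    slots = domain.get("slots") or {}
    result = []
    for name, config in slots.items():
        config = config or {}
        if config.get("type") == "any":
            continue  # slots of type "any" are ignored by the policies
        # A missing key counts as featurized, following the 2.x default.
        if config.get("influence_conversation", True) is not False:
            result.append(name)
    return sorted(result)


# Slot names come from the thread; types and settings are assumptions.
sample_domain = {
    "slots": {
        "is_global_outage_continue": {"type": "bool"},
        "global_outage_msg": {"type": "text", "influence_conversation": False},
        "trigger_intent_names": {"type": "any"},
        "ticket_details": {
            "type": "categorical",
            "values": ["no_ticket", "modified"],
            "influence_conversation": True,
        },
    }
}

print(featurized_slots(sample_domain))
# ['is_global_outage_continue', 'ticket_details']
```

Each slot the helper reports is a candidate for `influence_conversation: false`, unless a rule or story really needs it to branch.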
