AI-driven approach to my chatbot

Hey there! I’ve been using Rasa to develop my chatbot for a couple of months now. I am very happy with how it has turned out so far (Special thanks to all those extraordinary community heroes!)

I would like to make the chatbot more intelligent by integrating some AI functionality into it.

My first requirement is to improve how it understands user input, especially input that contains spelling mistakes, slang, or other variations beyond the ones it has already been trained on. If anyone has implemented this or has an idea of how it can be done, please share your thoughts.

Thank you so much :slight_smile:

Hi @muzzammil - great to hear you’re having success with Rasa!

To answer your questions about making the language understanding more robust, I’d have to understand:

  • Are you using an NLU model or working with the CommandGenerator in CALM?
  • What does your config.yml look like?
  • What does your training data look like? (if NLU)
  • What are some of the things your bot currently doesn’t understand?

Hi @amn41 - thank you so much for taking the time to respond to my query!

I’ve tried to answer all of your questions as best I can. Please let me know if I should provide any additional details.

  1. I am using an NLU model. The versions currently used for my chatbot are:

Rasa Version: 3.6.15
SDK Version: 3.6.2
Python Version: 3.9.0

  2. Here is my config.yml at the moment:
language: en

pipeline:

  - name: SpacyNLP
    model: en_core_web_md
    case_sensitive: False
  - name: SpacyTokenizer
    intent_tokenization_flag: True
    intent_split_symbol: " "
  # - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: SpacyFeaturizer
    pooling: mean
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 200
    constrain_similarities: true
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 200
    constrain_similarities: true
  
#   - name: FallbackClassifier
#     threshold: 0.3
#     ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies: 

  - name: MemoizationPolicy
  - name: RulePolicy
#   - name: UnexpecTEDIntentPolicy
#     max_history: 5
#     epochs: 100
  - name: TEDPolicy
    max_history: 5
    epochs: 200
    constrain_similarities: true
  3. Here is sample data with the same structure as my real training data. Unfortunately I cannot share the exact training data because it contains sensitive company information, so I have changed the values but kept the structure unchanged. I hope this helps:
- intent: select_country
  examples: |
    - [INDIA](ecountry)
    - [UNITED KINGDOM](ecountry)
    - [SOUTH AFRICA](ecountry)
    - [Select all locations](ecountry)
    - [UNITED KINGDOM](ecountry), [INDIA](ecountry)
    - [UNITED KINGDOM](ecountry), [SOUTH AFRICA](ecountry)
    - [INDIA](ecountry), [SOUTH AFRICA](ecountry)
    - [UNITED KINGDOM](ecountry), [INDIA](ecountry), [SOUTH AFRICA](ecountry)
    - [INDIA](ecountry), [SOUTH AFRICA](ecountry)

- synonym: Select all locations
  examples: |
    - all the locations
    - All locations
    - all countries
    - all locations
    - all places
    - select all locations
    - all of the locations
    - all location
    - all country
    - every branch
    - every location

- intent: select_data_hierarchy_level1
  examples: |
    - [Level One](edata_hierarchy_level1)
    - [Level One](edata_hierarchy_level1) hierarchy level
    - Open [Level One](edata_hierarchy_level1)

- intent: select_sublevel_of_data_hierarchy_level1
  examples: |
    - [Sub Level 1](esublevel_of_data_hierarchy_level1)
    - [Sub Level 2](esublevel_of_data_hierarchy_level1)
    - [Sub Level 3](esublevel_of_data_hierarchy_level1)

- intent: select_data_in_level1_sublevel_1
  examples: |
    - Display details for [data](edata_in_level1_sublevel_1)
    - I am looking for a [data](edata_in_level1_sublevel_1)
    - Can you give me details for [data](edata_in_level1_sublevel_1)

- intent: select_data_in_level1_sublevel_2
  examples: |
    - Display details for [data](edata_in_level1_sublevel_2)
    - I am looking for a [data](edata_in_level1_sublevel_2)
    - Can you give me details for [data](edata_in_level1_sublevel_2)

The training data continues in the same way for the other levels and their respective sub-levels.

  4. My main issue occurs when the user makes a spelling error in the entity value itself. Although I add as many synonyms as I can, there is a limit to how many misspellings I can anticipate.

So, when the spelling mistake occurs → the trained entity is not recognized and is mapped incorrectly → Data retrieval from the database doesn’t happen (as there is no match) → Final output is not displayed correctly.

Conclusion: This is the point where I hope to make the bot's understanding more robust.
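
To make the failure chain above concrete, here is a minimal sketch of one possible mitigation: fuzzy-matching the extracted entity value against the known values before querying the database. It uses only Python's standard `difflib` module. The names `KNOWN_COUNTRIES` and `normalize_entity` and the `0.8` cutoff are hypothetical and chosen for illustration; I am assuming this would run inside a custom action, and that the list of valid values would come from the database rather than being hard-coded.

```python
from difflib import get_close_matches
from typing import List, Optional

# Hypothetical canonical values; in the real bot these would be loaded
# from the database that the entity is later looked up in.
KNOWN_COUNTRIES = ["INDIA", "UNITED KINGDOM", "SOUTH AFRICA"]


def normalize_entity(
    value: str, known_values: List[str], cutoff: float = 0.8
) -> Optional[str]:
    """Return the closest known value for a possibly misspelled entity.

    Returns None when nothing is similar enough, so the caller can ask
    the user to rephrase instead of querying with a bad value.
    """
    # Compare case-insensitively, but return the canonical spelling.
    lookup = {known.lower(): known for known in known_values}
    cleaned = value.strip().lower()

    # Exact match first: cheap, and never ambiguous.
    if cleaned in lookup:
        return lookup[cleaned]

    # Otherwise take the single best candidate above the similarity cutoff.
    matches = get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    for raw in ["South Afirca", "Untied Kingdom", "France"]:
        print(raw, "->", normalize_entity(raw, KNOWN_COUNTRIES))
```

The cutoff is a trade-off: a lower value tolerates more typos but risks mapping short or unrelated inputs to the wrong value, so it would need tuning against real user messages. This only helps once an entity has been extracted; it does not address the case where the misspelling prevents the entity from being extracted at all.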