Understanding synonyms, pipeline components, entities and NLU extraction

Hello,

First, sorry for my English (I’m French :wink:).

I am using the latest version (Rasa 2.8.4), and I don’t understand how synonym and entity extraction work with my pipeline configuration. I have read all of the documentation and tested different configurations, but I don’t get the expected results.

Here are my files (I am posting only the config inline; the other files are attached because they are too long):

config.yml

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: fr

pipeline:
  - name: SpacyNLP
    model: fr_core_news_lg
    case_sensitive: False
  - name: SpacyTokenizer
    intent_tokenization_flag: False
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
    case_sensitive: False
  - name: LexicalSyntacticFeaturizer
    features: [
      [ "low", "title", "upper" ],
      [ "BOS", "EOS", "low", "upper", "title", "digit", "pos"],
      [ "low", "title", "upper" ],
    ]
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
    use_shared_vocab: False
  - name: SpacyEntityExtractor
    dimensions: ["PER"] #https://miro.medium.com/max/700/1*lJ1hSNVyCNGMG-cPiCeU1g.png //"LOC", "ORG", "MISC"
  - name: EntitySynonymMapper
  - name: CRFEntityExtractor
    BILOU_flag: True
    features: [
      ["low", "title", "upper"],
      ["bias", "low", "prefix5", "prefix2", "suffix5",
       "suffix3", "suffix2", "upper", "title", "digit",
       "pattern", "pos", "pos2"],
      ["low", "title", "upper"]
    ]
    featurizers: []
    split_entities_by_comma:
      special_type: False
      any_information: False
  - name: EntitySynonymMapper
  - name: RegexEntityExtractor
    case_sensitive: False
    use_lookup_tables: True
    use_regexes: False
  - name: EntitySynonymMapper
  - name: SklearnIntentClassifier
  - name: ResponseSelector
    epochs: 100
    retrieval_intent: chitchat
    constrain_similarities: true
    model_confidence: softmax
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.01
  - name: "DucklingEntityExtractor"
    url: "http://172.17.0.2:8000/"
    dimensions: ["distance", "duration", "numeral", "ordinal", "quantity", "temperature", "time", "volume"] #https://github.com/facebook/duckling/blob/master/Duckling/Dimensions/FR.hs
    locale: "fr_FR"
    timezone: "Europe/Paris"
    timeout: 0.2
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
#   - name: WhitespaceTokenizer
#   - name: RegexFeaturizer
#   - name: LexicalSyntacticFeaturizer
#   - name: CountVectorsFeaturizer
#   - name: CountVectorsFeaturizer
#     analyzer: char_wb
#     min_ngram: 1
#     max_ngram: 4
#   - name: DIETClassifier
#     epochs: 100
#   - name: EntitySynonymMapper
#   - name: ResponseSelector
#     epochs: 100
#   - name: FallbackClassifier
#     threshold: 0.3
#     ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  # No configuration for policies was provided. The following default policies were used to train your model.
  # If you'd like to customize them, uncomment and adjust the policies.
  # See https://rasa.com/docs/rasa/policies for more information.
  - name: MemoizationPolicy #AugmentedMemoizationPolicy
    max_history: 0
    priority: 6
  - name: TEDPolicy
    max_history: 0
    epochs: 200
    split_entities_by_comma:
      special_type: False
      any_information: False
    constrain_similarities: true
    model_confidence: softmax
    entity_recognition: False
  - name: RulePolicy
    priority: 3
    nlu_threshold_fallback: 0.3
    core_threshold_fallback: 0.3
    ambiguity_threshold: 0.01
    core_fallback_action_name: "action_default_fallback"
    enable_fallback_prediction: True

nlu.yml (447.1 KB) rules.yml (726 Bytes) stories.yml (5.9 KB) cities.yml (88.1 KB) domain.yml (27.0 KB) actions.py (1.7 KB)

I am sorry that my files are in French, but it shouldn’t matter if you don’t understand French.

My problem is the following. When I say to my chatbot:

Question : Quel est le planning

Answer : utter_activities

Question : Quel est le programme des activitiés

Answer : utter_activities

But if I say:

Question : Quel est le programme

Answer :

Why doesn’t my synonym “planning” match in the last case? And why do I get an empty answer rather than the utter_default answer?

  - synonym: planning
    examples: |
      - planning des activités
      - programme
      - programme des activités
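A note on what the synonym block can and cannot do: as far as the Rasa documentation describes it, EntitySynonymMapper only rewrites the value of an entity that an extractor has already found. It does not take part in intent classification. The sketch below is a simplified illustration of that behaviour, not Rasa's actual code; `map_synonyms` and the entity name `some_entity` are hypothetical.

```python
from typing import Dict, List

# Lookup table built from the `planning` synonym block above
# (surface form -> canonical value).
SYNONYMS: Dict[str, str] = {
    "planning des activités": "planning",
    "programme": "planning",
    "programme des activités": "planning",
}


def map_synonyms(entities: List[dict]) -> List[dict]:
    """Replace each extracted entity value with its canonical form, if one is known."""
    mapped = []
    for entity in entities:
        value = str(entity["value"])
        canonical = SYNONYMS.get(value.lower(), value)
        mapped.append({**entity, "value": canonical})
    return mapped


# Case 1: an extractor tagged "programme", so the value is rewritten.
# "some_entity" is a hypothetical entity name used only for illustration.
tagged = map_synonyms([{"entity": "some_entity", "value": "programme"}])

# Case 2: no extractor tagged anything, so the mapper has nothing to rewrite.
untagged = map_synonyms([])

print(tagged)
print(untagged)
```

Under this reading, "Quel est le programme" only benefits from the synonym if an extractor first tags "programme" as an entity; the intent classifier never sees the mapping.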

Another thing I don’t understand is the effect of the words before and after an entity. For example, if I say:

Question : Quelle est la météo à Chambéry

Answer : utter_weather with city = eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkNoYW1iw6lyeSJ9fQ==

Question : Quelle est la météo de Chambéry

Answer : utter_weather with city = eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkNoYW1iw6lyeSJ9fQ==

But if I say:

Question : Quelle est la météo à Annecy

Answer : utter_weather with city = eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkFubmVjeSJ9fQ==

Question : Quelle est la météo de Annecy

Answer : utter_weather with city = None

Question : Quelle est la météo de Annecy stp

Answer : utter_weather with city = eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkFubmVjeSJ9fQ==

That’s very strange, isn’t it?
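One possible explanation, offered as an assumption rather than a diagnosis: the `features` list of CRFEntityExtractor in the config has three sub-lists, which Rasa documents as features for the token before, the current token and the token after. The tag for "Annecy" therefore depends on its neighbours. The sketch below is a simplified, hypothetical feature function (`window_features` is not Rasa's code) showing how the inputs differ when "Annecy" is the last token and when "stp" follows it.

```python
from typing import Dict, List


def window_features(tokens: List[str], index: int) -> Dict[str, object]:
    """Build simplified features for tokens[index] from a (before, current, after) window."""
    features: Dict[str, object] = {}
    for offset, prefix in ((-1, "-1"), (0, "0"), (1, "+1")):
        position = index + offset
        if position < 0:
            # No token before: mark beginning of sentence.
            features["BOS"] = True
        elif position >= len(tokens):
            # No token after: mark end of sentence.
            features["EOS"] = True
        else:
            token = tokens[position]
            features[f"{prefix}:low"] = token.lower()
            features[f"{prefix}:title"] = token.istitle()
            features[f"{prefix}:upper"] = token.isupper()
    return features


short = "Quelle est la météo de Annecy".split()
longer = "Quelle est la météo de Annecy stp".split()

at_end = window_features(short, short.index("Annecy"))
in_middle = window_features(longer, longer.index("Annecy"))

print(at_end)
print(in_middle)
```

If the training examples rarely show a city as the last token after "de", the model has little evidence for that exact combination, which would be consistent with the entity being found again once "stp" is appended.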

Note: the keys are base64-encoded (for example, {"group":"city","params":{"city":"Annecy"}}).
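For readers who want to check the slot values shown in the examples, here is a minimal sketch that decodes one of them with the Python standard library. The helper name `decode_key` is hypothetical; the encoded string is the one from the "météo à Annecy" example.

```python
import base64
import json


def decode_key(encoded: str) -> dict:
    """Decode a base64 string into the JSON object it wraps."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


# Slot value copied from the "Quelle est la météo à Annecy" example.
annecy = decode_key("eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkFubmVjeSJ9fQ==")
print(annecy["group"], annecy["params"]["city"])
```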

  - synonym: stp
    examples: |
      - s'il te plaît

I have the same question for my synonym “stp”. When I say:

Question : Quelle est la météo à Chambéry stp

Answer : utter_weather with city = eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkNoYW1iw6lyeSJ9fQ==

But if I say:

Question : Quelle est la météo à Chambéry s’il te plaît

Answer : utter_default

Here is my console output for this example:

2021-09-10 17:59:09 DEBUG    rasa.core.lock_store  - Issuing ticket for conversation 'test_user'.
2021-09-10 17:59:09 DEBUG    rasa.core.lock_store  - Acquiring lock for conversation 'test_user'.
2021-09-10 17:59:09 DEBUG    rasa.core.lock_store  - Acquired lock for conversation 'test_user'.
2021-09-10 17:59:09 DEBUG    rasa.core.tracker_store  - Recreating tracker for id 'test_user'
2021-09-10 17:59:09 DEBUG    rasa.nlu.selectors.response_selector  - Adding following selector key to message property: chitchat
2021-09-10 17:59:09 DEBUG    urllib3.connectionpool  - Starting new HTTP connection (1): 172.17.0.2:8000
2021-09-10 17:59:09 DEBUG    urllib3.connectionpool  - http://172.17.0.2:8000 "POST /parse HTTP/1.1" 200 None
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Received user message 'quelle est la météo à Chambéry s'il te plaît' with intent '{'name': 'say_special_type', 'confidence': 0.9999999602720443}' and entities '[{'entity': 'city', 'start': 22, 'end': 30, 'confidence_entity': 0.8812001804963028, 'value': 'eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkNoYW1iw6lyeSJ9fQ==', 'extractor': 'CRFEntityExtractor', 'processors': ['EntitySynonymMapper']}, {'entity': 'special_type', 'start': 11, 'end': 19, 'confidence_entity': 0.9991491902140056, 'value': 'type_weather', 'extractor': 'CRFEntityExtractor', 'processors': ['EntitySynonymMapper']}, {'entity': 'special_type', 'start': 34, 'end': 42, 'confidence_entity': 0.5589575690818338, 'value': 'te plaît', 'extractor': 'CRFEntityExtractor'}]'
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Current slot values:
        PER: None
        city: eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkNoYW1iw6lyeSJ9fQ==
        time: None
        any_information: None
        special_type: te plaît
        session_started_metadata: None
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Logged UserUtterance - tracker now has 152 events.
2021-09-10 17:59:09 DEBUG    rasa.core.policies.memoization  - Current tracker state:
[state 0] slots: {'any_information': (1.0,), 'special_type': (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)}
[state 1] user intent: say_special_type | user entities: ('city', 'special_type') | previous action name: action_listen | slots: {'any_information': (1.0,), 'special_type': (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)}
2021-09-10 17:59:09 DEBUG    rasa.core.policies.memoization  - There is no memorised next action
2021-09-10 17:59:09 DEBUG    rasa.core.policies.ted_policy  - TED predicted 'utter_weather' based on user intent.
2021-09-10 17:59:09 DEBUG    rasa.core.policies.rule_policy  - Current tracker state:
[state 0] slots: {'any_information': (1.0,), 'special_type': (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)}
[state 1] user text: quelle est la météo à Chambéry s'il te plaît | previous action name: action_listen | slots: {'any_information': (1.0,), 'special_type': (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)}
2021-09-10 17:59:09 DEBUG    rasa.core.policies.rule_policy  - There is no applicable rule.
2021-09-10 17:59:09 DEBUG    rasa.core.policies.rule_policy  - Current tracker state:
[state 0] slots: {'any_information': (1.0,), 'special_type': (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)}
[state 1] user intent: say_special_type | user entities: ('city', 'special_type') | previous action name: action_listen | slots: {'any_information': (1.0,), 'special_type': (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)}
2021-09-10 17:59:09 DEBUG    rasa.core.policies.rule_policy  - There is no applicable rule.
2021-09-10 17:59:09 DEBUG    rasa.core.policies.ensemble  - Made prediction using user intent.
2021-09-10 17:59:09 DEBUG    rasa.core.policies.ensemble  - Added `DefinePrevUserUtteredFeaturization(False)` event.
2021-09-10 17:59:09 DEBUG    rasa.core.policies.ensemble  - Predicted next action using policy_2_RulePolicy.
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Predicted next action 'action_default_fallback' with confidence 0.30.
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Policy prediction ended with events '[<rasa.shared.core.events.DefinePrevUserUtteredFeaturization object at 0x7f7e0c31da20>]'.
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Action 'action_default_fallback' ended with events '[BotUttered('None', {"elements": null, "quick_replies": null, "buttons": null, "attachment": null, "image": null, "custom": {"blocks": [{"type": {"name": "not_understood", "parameters": {"pepperSpeech": [{"pepperText": {"say": "Je n'arrive pas \u00e0 comprendre ce que vous voulez faire. Pourriez-vous reformuler ?"}}, {"pepperText": {"say": "Je suis d\u00e9sol\u00e9 mais je ne comprends pas votre demande. Pourriez-vous reformuler ?"}}, {"pepperText": {"say": "D\u00e9sol\u00e9 je ne peux pas r\u00e9pondre \u00e0 cette demande."}}]}}}]}}, {"utter_action": "utter_default"}, 1631289549.4567022), <rasa.shared.core.events.UserUtteranceReverted object at 0x7f7dcc45acc0>]'.
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Current slot values:
        PER: None
        city: None
        time: None
        any_information: None
        special_type: __other__
        session_started_metadata: None
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Predicted next action 'action_listen' with confidence 1.00.
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Policy prediction ended with events '[]'.
2021-09-10 17:59:09 DEBUG    rasa.core.processor  - Action 'action_listen' ended with events '[]'.
2021-09-10 17:59:09 DEBUG    rasa.core.lock_store  - Deleted lock for conversation 'test_user'.
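One detail in this log is worth isolating: CRFEntityExtractor returns three entities, and the third one tags "te plaît" as `special_type` with a confidence of about 0.56 and no synonym processing. That value is what ends up in the `special_type` slot. A plausible reading, stated here as an assumption, is that this unexpected slot value moves the conversation away from what the policies saw in training, so the fallback is triggered. The sketch below simply filters the logged entities by confidence; the threshold `MIN_CONFIDENCE` and the helper `low_confidence` are hypothetical and not Rasa settings.

```python
from typing import List

# Entities copied from the DEBUG line above (start, end and extractor omitted for brevity).
entities = [
    {"entity": "city", "confidence_entity": 0.8812001804963028,
     "value": "eyJncm91cCI6ImNpdHkiLCJwYXJhbXMiOnsiY2l0eSI6IkNoYW1iw6lyeSJ9fQ=="},
    {"entity": "special_type", "confidence_entity": 0.9991491902140056,
     "value": "type_weather"},
    {"entity": "special_type", "confidence_entity": 0.5589575690818338,
     "value": "te plaît"},
]

MIN_CONFIDENCE = 0.6  # hypothetical threshold, chosen only for this illustration


def low_confidence(items: List[dict], threshold: float = MIN_CONFIDENCE) -> List[dict]:
    """Return the entities whose extractor confidence is below the threshold."""
    return [item for item in items if item["confidence_entity"] < threshold]


flagged = low_confidence(entities)
for item in flagged:
    print(item["entity"], repr(item["value"]), round(item["confidence_entity"], 2))
```

Note also that the synonym block maps "s'il te plaît" to "stp", while the extracted span is only "te plaît", so even an exact-match lookup on the entity value would not apply here.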

One more thing: I have also tested with DIETClassifier, but the results aren’t better. My command to train and run is:

rasa train --augmentation 0 --force --debug && rasa run --debug -p 5004

Thank you for your help.

Arnaud

@NQArnaud Heya! Your use case covers pretty much all the basics of NLU and NLP :wink: Let me try to help you, but for that you will need to watch some videos and look at a GitHub repo for the code. I hope that is fine with you?

For Entities and Synonyms:

Please see the latest Rasa video series for an explanation of entities and synonyms: Conversational AI with Rasa: Entities - YouTube

For Slots: Please see this video on slots: Conversational AI with Rasa: Slots - YouTube

For Pipeline: As you are developing a chatbot in French, yes, you need to customise the configuration for your use case, but make sure you only include the pipeline components that best fit it. Please see this video: Conversational AI with Rasa: Pipeline and Policy Configuration - YouTube

For Weather API: Please see these videos: Calling Weather API in Rasa | Part - 1 | - YouTube | Calling Weather API in Rasa | Part - 2 - YouTube

Complete series and tutorial link: I would encourage you to watch all of these videos, as they will help you a lot: Conversational AI with Rasa - YouTube

Some of the projects for your reference: https://github.com/RasaHQ/rasa/tree/main/examples

For a free Udemy course: https://www.udemy.com/course/rasa-for-beginners/

I hope this helps you reach your goal and find a solution. Please note that French is not my first language.

Hello Nik,

Thank you very much for your answer and all these links. I will go through all of them and come back later.

Thanks

@NQArnaud No worries. Once you have gone through all of them, please come back and mark this thread as solved for the benefit of others. Good luck!