Detect entities that are not included in training data

oliverblane · June 29, 2022, 12:15pm

I am building a chatbot that searches a content library using an entity “search_string” extracted from user input. However, the chatbot only seems to extract the entity when its value appears in the training data. I do not want to use a lookup table, since “search_string” can be any free-text value.

My hope is that the chatbot will be able to detect “search_string” regardless of whether it is included in the training data.
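
To make the problem concrete, here is a minimal sketch of how I check extraction for values the model has never seen. It assumes a Rasa server started with `rasa run --enable-api` on the default port 5005 and uses the `/model/parse` endpoint; the helper names `parse_message` and `entity_values` are my own and not part of Rasa.

```python
import json
from urllib import request


def parse_message(text, url="http://localhost:5005/model/parse"):
    """POST a message to a running Rasa server and return the parsed JSON.

    Assumes the server was started with `rasa run --enable-api`.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    req = request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def entity_values(parse_result, entity_name="search_string"):
    """Return the values of all extracted entities with the given name."""
    return [
        entity["value"]
        for entity in parse_result.get("entities", [])
        if entity.get("entity") == entity_name
    ]


# Example usage (needs a running server, so it is left commented out):
# result = parse_message("search content for astronomy")
# print(entity_values(result))
```

An empty list for a value that is not in nlu.yml, alongside a non-empty list for a value that is, is the behaviour I am describing.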

My nlu.yml looks like:

- intent: search_content
  examples: |
    - i want to search content for [maths](search_string)
    - search content for [english](search_string)
    - content for [lesson planning](search_string)
    - find me content for [classroom management](search_string) online
    - get content results for [equality](search_string)
    - search for [geography](search_string) content
    - i need content for [science](search_string)
    - i want [wellbeing](search_string) content
    - get [assessment](search_string) content
    - get content for [languages](search_string)
    - find [behaviour management](search_string) content
    - [digital literacy](search_string) content
    - content for [english and maths](search_string)
    - search content for [communication](search_string)
    - content for [planning](search_string)
    - find content for [inclusion](search_string)
    - i need [remote and online learning](search_string) content
    - i want [blended learning](search_string) content
    - get content for [maths](search_string)
    - i need [flipped learning](search_string) content
    - do you have any [english](search_string) content
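
Since every entity has to be annotated as `[value](entity_name)` (or with the `[value]{"entity": "entity_name"}` form), a quick check for malformed annotations helps rule out data problems. This is only a rough sketch: `find_malformed` is a hypothetical helper, not a Rasa function, and the sample lines are made up for illustration.

```python
import re

# A well-formed annotation is [value](entity) or [value]{"entity": "..."}.
WELL_FORMED = re.compile(r"\[[^\[\]]+\](\([^()]+\)|\{[^{}]+\})")


def find_malformed(examples):
    """Return the examples that still contain brackets after all
    well-formed annotations have been removed."""
    bad = []
    for line in examples:
        remainder = WELL_FORMED.sub("", line)
        if "[" in remainder or "]" in remainder:
            bad.append(line)
    return bad


# Hypothetical sample lines, not taken from a real project.
samples = [
    "search content for [history](search_string)",
    "find content for [art][search_string]",
    "content for [music]",
    'get [drama]{"entity": "search_string"} content',
]

for line in find_malformed(samples):
    print("malformed:", line)
```

`rasa data validate` is the built-in way to check training data; the script above only looks at bracket syntax.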

I also have a “search_string” slot defined in domain.yml:

search_string:
  type: text
  influence_conversation: true
  auto_fill: true

Finally, my config.yml file is unchanged from default and looks like:

pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
#   - name: WhitespaceTokenizer
#   - name: RegexFeaturizer
#   - name: LexicalSyntacticFeaturizer
#   - name: CountVectorsFeaturizer
#   - name: CountVectorsFeaturizer
#     analyzer: char_wb
#     min_ngram: 1
#     max_ngram: 4
#   - name: DIETClassifier
#     epochs: 100
#     constrain_similarities: true
#   - name: EntitySynonymMapper
#   - name: ResponseSelector
#     epochs: 100
#     constrain_similarities: true
#   - name: FallbackClassifier
#     threshold: 0.3
#     ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
# # No configuration for policies was provided. The following default policies were used to train your model.
# # If you'd like to customize them, uncomment and adjust the policies.
# # See https://rasa.com/docs/rasa/policies for more information.
#   - name: MemoizationPolicy
#   - name: RulePolicy
#   - name: UnexpecTEDIntentPolicy
#     max_history: 5
#     epochs: 100
#   - name: TEDPolicy
#     max_history: 5
#     epochs: 100
#     constrain_similarities: true

Thanks for any help on this!
