I am building a chatbot that searches a content library using a “search_string” entity extracted from user input. However, the chatbot only seems able to extract the entity when its value appears in the training data. I do not want to use a lookup table, since “search_string” should be able to take any value.
My hope is that the chatbot will be able to detect “search_string” regardless of whether it is included in the training data.
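To make the goal concrete, here is a minimal sketch of how the behaviour could be checked against a trained model. It assumes a Rasa server started with `rasa run --enable-api` is listening on `localhost:5005` and uses the `/model/parse` endpoint; the helper names and the test word "astronomy" (which is not in the training data) are hypothetical.

```python
import json
from urllib import request


def parse_message(text, url="http://localhost:5005/model/parse"):
    """POST a message to a running Rasa server and return the parsed JSON.

    The URL is an assumption: it is the default address when the server
    is started locally with `rasa run --enable-api`.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    req = request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


def extract_search_strings(parse_result):
    """Return the values of all 'search_string' entities in a parse result."""
    return [
        ent["value"]
        for ent in parse_result.get("entities", [])
        if ent.get("entity") == "search_string"
    ]


if __name__ == "__main__":
    # "astronomy" is a hypothetical value that does not appear in nlu.yml.
    result = parse_message("search content for astronomy")
    print(extract_search_strings(result))
```

An empty list printed for an unseen value would reproduce the problem described above.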
My nlu.yml looks like:
- intent: search_content
  examples: |
    - i want to search content for [maths](search_string)
    - search content for [english](search_string)
    - content for [lesson planning](search_string)
    - find me content for [classroom management](search_string) online
    - get content results for [equality](search_string)
    - search for [geography](search_string) content
    - i need content for [science](search_string)
    - i want [wellbeing](search_string) content
    - get [assessment](search_string) content
    - get content for [languages](search_string)
    - find [behaviour management](search_string) content
    - [digital literacy](search_string) content
    - content for [english and maths](search_string)
    - search content for [communication](search_string)
    - content for [planning](search_string)
    - find content for [inclusion](search_string)
    - i need [remote and online learning](search_string) content
    - i want [blended learning](search_string) content
    - get content for [maths](search_string)
    - i need [flipped learning](search_string) content
    - do you have any [english](search_string) content
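Because a malformed annotation may end up being treated as plain text rather than as an entity, a quick consistency check over the examples can be useful. The following is a minimal sketch, not part of Rasa: the function name is hypothetical, and it only understands the simple `[value](entity)` form used here (it would also flag the extended `[value]{"entity": ...}` syntax).

```python
import re

# A well-formed annotation in the simple form looks like [value](entity_name).
WELL_FORMED = re.compile(r"\[[^\]]+\]\([^)]+\)")


def find_suspect_examples(examples):
    """Return examples that still contain square brackets once all
    well-formed [value](entity) annotations have been removed."""
    suspects = []
    for example in examples:
        leftover = WELL_FORMED.sub("", example)
        if "[" in leftover or "]" in leftover:
            suspects.append(example)
    return suspects


if __name__ == "__main__":
    # Illustrative inputs only; in practice the examples would be read from nlu.yml.
    examples = [
        "get content for [maths](search_string)",
        "content for [english and maths]",
    ]
    for line in find_suspect_examples(examples):
        print("check annotation:", line)
```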
I also have a “search_string” slot defined in domain.yml:
search_string:
  type: text
  influence_conversation: true
  auto_fill: true
Finally, my config.yml file is unchanged from the default and looks like:
pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
#   - name: WhitespaceTokenizer
#   - name: RegexFeaturizer
#   - name: LexicalSyntacticFeaturizer
#   - name: CountVectorsFeaturizer
#   - name: CountVectorsFeaturizer
#     analyzer: char_wb
#     min_ngram: 1
#     max_ngram: 4
#   - name: DIETClassifier
#     epochs: 100
#     constrain_similarities: true
#   - name: EntitySynonymMapper
#   - name: ResponseSelector
#     epochs: 100
#     constrain_similarities: true
#   - name: FallbackClassifier
#     threshold: 0.3
#     ambiguity_threshold: 0.1
# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
# # No configuration for policies was provided. The following default policies were used to train your model.
# # If you'd like to customize them, uncomment and adjust the policies.
# # See https://rasa.com/docs/rasa/policies for more information.
#   - name: MemoizationPolicy
#   - name: RulePolicy
#   - name: UnexpecTEDIntentPolicy
#     max_history: 5
#     epochs: 100
#   - name: TEDPolicy
#     max_history: 5
#     epochs: 100
#     constrain_similarities: true
Thanks for any help on this!