Hello Rasa community members,
I am trying to implement a use case like this:
User: what’s the latest available docker image tag for rasa/rasa-sdk?
BOT: The latest available docker image for rasa/rasa-sdk is rasa/rasa-sdk:3.1.1
What are my options here for the pipeline config to make the entity extraction work generically?
Current Pipeline:
pipeline:
  - name: SpacyNLP
    model: "en_core_web_lg"
    case_sensitive: false
  - name: "SpacyTokenizer"
  - name: "SpacyFeaturizer"
  - name: "RegexFeaturizer"
  - name: "LexicalSyntacticFeaturizer"
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: "DIETClassifier"
    epochs: 100
  - name: SpacyEntityExtractor
    dimensions: ["PERSON"]
  - name: FallbackClassifier
    threshold: 0.4
    ambiguity_threshold: 0.1
  - name: "EntitySynonymMapper"
Sample from nlu.yaml (I have tried with more than 20 examples):
- intent: get_latest_container_image
  examples: |
    - get me the latest docker image for [rasa](meta_name)/[rasa-sdk](image_name)
    ...
    ....
    ...
domain.yml defines the entities meta_name and image_name, plus slots of the same names, which my custom action uses to call an API that does a Docker search.
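To illustrate how the two slots feed the reply, here is a minimal Python sketch of the logic inside the custom action, kept outside of rasa_sdk so it runs on its own. The names build_reply, lookup_latest_tag, and fake_lookup are hypothetical, and the real Docker search API call is replaced by a stand-in:

```python
from typing import Callable, Dict


def build_reply(slots: Dict[str, str], lookup_latest_tag: Callable[[str], str]) -> str:
    """Build the bot reply from the meta_name and image_name slot values.

    lookup_latest_tag is a hypothetical stand-in for the Docker search API call.
    """
    repository = f"{slots['meta_name']}/{slots['image_name']}"
    tag = lookup_latest_tag(repository)
    return f"The latest available docker image for {repository} is {repository}:{tag}"


def fake_lookup(repository: str) -> str:
    # Hard-coded tag used only for illustration; no network call is made.
    return "3.1.1"


print(build_reply({"meta_name": "rasa", "image_name": "rasa-sdk"}, fake_lookup))
```

If either slot holds a truncated value, the repository string passed to the lookup is wrong, which is why the entity extraction has to return the full image name.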
It works fine if the user asks for an image name that is already part of the nlu.yaml examples.
User: What’s the latest docker image for rasa/rasa-sdk?
So rasa/rasa-sdk works fine: the BOT identifies rasa as the entity meta_name and rasa-sdk as image_name.
But if the user asks the BOT for any image that is not part of the examples, the entity recognition fails:
User: What’s the latest docker image for rasa/financial-demo?
For rasa/financial-demo, the entity meta_name is correctly identified as rasa, but the entity image_name is incorrectly identified as financial-.
As you can see, the value for image_name is cut off right after the - character, whereas the correct value should be financial-demo. If I add this image to the NLU intent examples, it works, but that is not generic, and it is impossible to add every possible docker image name!
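To make the expected result concrete, here is a minimal Python sketch, independent of Rasa, of the values I expect for the two entities. The pattern is my own assumption about what a name segment looks like (alphanumeric runs joined by ".", "_" or "-"), and expected_entities is a hypothetical helper, not something from my pipeline:

```python
import re
from typing import Dict, Optional

# Assumed shape of one name segment: alphanumeric runs joined by ".", "_" or "-".
SEGMENT = r"[a-z0-9]+(?:[._-][a-z0-9]+)*"
IMAGE_PATTERN = re.compile(
    rf"(?P<meta_name>{SEGMENT})/(?P<image_name>{SEGMENT})", re.IGNORECASE
)


def expected_entities(text: str) -> Optional[Dict[str, str]]:
    """Return the meta_name and image_name found in text, or None if absent."""
    match = IMAGE_PATTERN.search(text)
    return match.groupdict() if match else None


print(expected_entities("What’s the latest docker image for rasa/financial-demo?"))
print(expected_entities("What’s the latest docker image for rasa/rasa-sdk?"))
```

For the failing message this returns financial-demo as image_name, which is the value the BOT should extract instead of financial-.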
What’s the best way to make this work generically? I’ve tried regex entity extraction, and it did not yield the desired result. I also tried a few other things, such as adding more examples and keeping only the minimum of 2 examples so that the regex entity extractor and the DIET classifier don’t clash, but still no luck.
Thank you!!!