Hi Everybody,
I am still new to Rasa (and to the forum as well), but I will try to explain my problem as clearly as possible. Sorry for the long post…
I would like to distinguish intents whose texts are similar, but where one of them contains entities. I put together a simple example (part of a bigger project):
nlu:
- intent: acquaintance
  examples: |
    - Who are you?
    - What are you?
- intent: boss
  examples: |
    - Who is your boss?
    - Who is your master?
    - Who is your owner?
    - Who is the boss?
- intent: famous
  examples: |
    - Who is (PERSON)?
    - Who is [](PERSON)?
    - Who is [Arnold Schwarzenegger](PERSON)?
    - Who is [Michael Jackson](PERSON)?
    - Who is [Albert Einstein](PERSON)?
The config is the following:
language: en
pipeline:
- name: SpacyNLP
  model: "en_core_web_lg"
  case_sensitive: False
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 100
  constrain_similarities: true
  model_confidence: linear_norm
- name: SpacyEntityExtractor
  dimensions: ["PERSON"]
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 100
  retrieval_intent: acquaintance
  constrain_similarities: true
  model_confidence: linear_norm
- name: ResponseSelector
  epochs: 100
  retrieval_intent: boss
  constrain_similarities: true
  model_confidence: linear_norm
- name: ResponseSelector
  epochs: 100
  retrieval_intent: famous
  constrain_similarities: true
  model_confidence: linear_norm
- name: FallbackClassifier
  threshold: 0.7
  ambiguity_threshold: 0.1
policies:
- name: MemoizationPolicy
- name: TEDPolicy
  max_history: 5
  epochs: 100
  constrain_similarities: true
  model_confidence: linear_norm
- name: RulePolicy
What I would like to achieve is the following: if a message contains a PERSON entity, it should be classified as “famous”; otherwise it can be either “acquaintance” or “boss”. I do not know whether my NLU intent examples are wrong, the pipeline has problems, or something else is the cause, but when I play with ‘rasa shell nlu’, I get the following results:
Message #1: “Who was George Washington?” NLU result OK - intent: famous, confidence: 1.0, entity extracted (both by DIETClassifier and SpacyEntityExtractor - I know this is not really recommended this way…)
Message #2: “Who is Elsa?” NLU result OK - intent: famous, confidence: 0.84, entity extracted (by Spacy)
Message #3: “Who is Mozart?” NLU result not really OK - intent: nlu_fallback, entity extracted (by Spacy). The confidence for intent famous was 0.69, which is not that bad, but I still do not really understand why this is the result.
Message #4: “Who is Freddie Mercury?” NLU result BAD - intent: boss, confidence: 1.0, entity not extracted. This is not good, but I can accept that the classification may go wrong when no entity is found. Still, is there something I can do here?
Message #5: “Who is Steve Buscemi?” NLU result VERY BAD - intent: boss, confidence: 0.74, entity extracted (by Spacy), intent famous confidence: 0.25. This I cannot understand at all. We have an entity extracted, so why does it not help classify the intent better?
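To make the rule I have in mind concrete, here is a rough sketch in plain Python. It is not Rasa code and I have not wired it into the pipeline: the function name `prefer_famous_if_person` is made up, the input is a hand-written dict that only imitates the shape of what ‘rasa shell nlu’ prints, and the confidence of 1.0 assigned on override is an arbitrary placeholder.

```python
from typing import Any, Dict


def prefer_famous_if_person(parse_result: Dict[str, Any]) -> Dict[str, Any]:
    """Force the intent to 'famous' when a PERSON entity was extracted.

    Hypothetical helper: it only works on a dict shaped like the NLU
    output, and returns a new dict instead of modifying the input.
    """
    has_person = any(
        entity.get("entity") == "PERSON"
        for entity in parse_result.get("entities", [])
    )
    if not has_person:
        # No PERSON entity: keep whatever the classifier predicted.
        return parse_result

    updated = dict(parse_result)
    # Placeholder confidence; the real value would be a design decision.
    updated["intent"] = {"name": "famous", "confidence": 1.0}
    return updated


# Hand-written imitation of the result for Message #5.
message_5 = {
    "text": "Who is Steve Buscemi?",
    "intent": {"name": "boss", "confidence": 0.74},
    "entities": [
        {
            "entity": "PERSON",
            "value": "Steve Buscemi",
            "extractor": "SpacyEntityExtractor",
        }
    ],
}

print(prefer_famous_if_person(message_5)["intent"]["name"])
```

With this rule, Message #5 would come out as “famous” because an entity was extracted, while Message #4 would stay “boss” because no entity was found there.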
So what am I doing wrong, and what should I do?
Thank you and best regards, Csaba