I trained a Rasa model with a BERT featurizer, but it can't handle singular and plural forms of words, as in the following two sentences:
A: How to search in notes
B: How to search in note
These two sentences differ only in the word "note", yet the predicted intents are completely different. I obtained the vectors of A and B by debugging the code, as follows:
I compared the similarity between A's batch_sentence_features and B's batch_sentence_features, and it is above 95%. So why are the recognized intents different? I need to solve this problem urgently, please help me, thank you!
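For reference, the similarity was measured as cosine similarity between the two sentence-feature vectors. A minimal sketch of that computation (the vectors below are made-up stand-ins, not the real batch_sentence_features values):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical stand-ins for the extracted feature vectors of A and B
vec_a = [0.12, -0.43, 0.88, 0.05]
vec_b = [0.10, -0.40, 0.90, 0.07]

print(cosine_similarity(vec_a, vec_b))  # close to 1.0 for near-identical vectors
```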
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en

pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "bert-base-uncased"
    cache_dir: null
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.8
    ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
# # No configuration for policies was provided. The following default policies were used to train your model.
# # If you'd like to customize them, uncomment and adjust the policies.
# # See https://rasa.com/docs/rasa/policies for more information.
  - name: AugmentedMemoizationPolicy
  - name: TEDPolicy
    max_history: 8
    epochs: 200
    hidden_layers_sizes:
      dialogue: [256, 128]
  - name: RulePolicy
    core_fallback_threshold: 0.3
    core_fallback_action_name: "action_default_fallback"
    enable_fallback_prediction: True
Hi @tyd We have 51 intents, and the data comes from our production environment, so I'm sorry that we can't make it public.
Sentence B is in the training data; sentence A is not. With the current model, B is classified correctly but A is not: A is recognized as other_intent, even though A and B should have the same intent. Moreover, the semantic similarity between the training examples of this other_intent and A is below 60%. So I don't understand why the DIETClassifier misclassifies A.
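One detail that may be relevant here: a high cosine similarity between two sentence vectors does not by itself guarantee the same predicted intent, because the classifier learns its own decision boundaries on top of those features, and two nearby vectors can still fall on opposite sides of a boundary. A toy illustration with made-up 2-D vectors and a hypothetical linear boundary (not Rasa's actual DIET classifier):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Two hypothetical feature vectors pointing in almost the same direction...
u = [1.00, 0.02]
v = [1.00, -0.02]
print(cosine_similarity(u, v))  # > 0.99

# ...yet a linear classifier whose boundary runs between them
# assigns them different labels.
w = [0.0, 1.0]  # hypothetical weights; decision boundary is x2 = 0
b = 0.0

def predict(x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "intent_A" if score > 0 else "other_intent"

print(predict(u))  # intent_A
print(predict(v))  # other_intent
```

This is only meant to show why "95% similar vectors" and "different intents" are not contradictory in principle; the real DIET model is a transformer-based classifier, not a single linear boundary.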