I trained a Rasa model with a BERT featurizer, but it can't handle singular and plural forms of words, as in the following two sentences:
A: How to search in notes
B: How to search in note
These two sentences differ only in the word "note", yet the predicted intents are completely different. I obtained the vectors of A and B by debugging the code, as follows:
I compared the similarity between A's batch_sentence_features and B's batch_sentence_features, and it is above 95%. So why are the recognized intents different? I need to solve this problem urgently, please help me, thank you!
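For reference, the similarity was measured as cosine similarity between the two sentence-feature vectors. A minimal sketch of that computation (the vectors below are made-up stand-ins, not the real batch_sentence_features values):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical stand-ins for the extracted feature vectors of A and B
vec_a = [0.12, -0.43, 0.88, 0.05]
vec_b = [0.10, -0.40, 0.90, 0.07]

print(cosine_similarity(vec_a, vec_b))  # close to 1.0 for near-identical vectors
```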
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en

pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "bert-base-uncased"
    cache_dir: null
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.8
    ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
# # No configuration for policies was provided. The following default policies were used to train your model.
# # If you'd like to customize them, uncomment and adjust the policies.
# # See https://rasa.com/docs/rasa/policies for more information.
  - name: AugmentedMemoizationPolicy
  - name: TEDPolicy
    max_history: 8
    epochs: 200
    hidden_layers_sizes:
      dialogue: [256, 128]
  - name: RulePolicy
    core_fallback_threshold: 0.3
    core_fallback_action_name: "action_default_fallback"
    enable_fallback_prediction: True
Hi @tyd We have 51 intents, and the data comes from our production environment, so I'm sorry that we can't make it public.
Sentence B is in the training data; sentence A is not. With the current model, B is classified correctly but A is not: A is recognized as other_intent, even though A and B should have the same intent. Moreover, the semantic similarity between the training examples of this other_intent and A is below 60%. So I don't understand why the DIETClassifier misclassifies A.
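One detail that may be relevant here: a high cosine similarity between two sentence vectors does not by itself guarantee the same predicted intent, because the classifier learns its own decision boundaries on top of those features, and two nearby vectors can still fall on opposite sides of a boundary. A toy illustration with made-up 2-D vectors and a hypothetical linear boundary (not Rasa's actual DIET classifier):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Two hypothetical feature vectors pointing in almost the same direction...
u = [1.00, 0.02]
v = [1.00, -0.02]
print(cosine_similarity(u, v))  # > 0.99

# ...yet a linear classifier whose boundary runs between them
# assigns them different labels.
w = [0.0, 1.0]  # hypothetical weights; decision boundary is x2 = 0
b = 0.0

def predict(x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "intent_A" if score > 0 else "other_intent"

print(predict(u))  # intent_A
print(predict(v))  # other_intent
```

This is only meant to show why "95% similar vectors" and "different intents" are not contradictory in principle; the real DIET model is a transformer-based classifier, not a single linear boundary.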