(Edited: added NLU config)
Rasa version
Rasa Version : 2.8.16
Minimum Compatible Version: 2.8.9
Rasa SDK Version : 2.8.3
Rasa X Version : None
Python Version : 3.8.12
Operating System : Linux-5.4.0-90-generic-x86_64-with-glibc2.27
Full error message
UserWarning: Misaligned entity annotation in message 'I am .NET Developer' with intent 'int_provide_info'. Make sure the start and end values of entities ([(5, 19, '.NET Developer')]) in the training data match the token boundaries ([(0, 1, 'I'), (2, 4, 'am'), (6, 9, 'NET'), (10, 19, 'Developer')]). Common causes:
1) entities include trailing whitespaces or punctuation
2) the tokenizer gives an unexpected result, due to languages such as Chinese that don't use whitespace for word separation
More info at https://rasa.com/docs/rasa/training-data-format#nlu-training-data
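To make the warning concrete, here is a minimal sketch in plain Python of the kind of boundary check it describes. This is not Rasa's actual implementation; the helper name `misaligned_boundaries` is hypothetical, and the token list is copied from the warning itself. It shows that the entity starts at offset 5 (the `.`), while the tokenizer's nearest token starts at offset 6 (`NET`), so the leading dot appears to have been dropped during tokenization.

```python
from typing import List, Tuple

# (start, end, text) triples, copied from the warning message above.
Token = Tuple[int, int, str]


def misaligned_boundaries(tokens: List[Token], start: int, end: int) -> List[str]:
    """Hypothetical helper: report which entity offsets fall off token boundaries."""
    token_starts = {t[0] for t in tokens}
    token_ends = {t[1] for t in tokens}
    problems = []
    if start not in token_starts:
        problems.append(f"start {start} is not a token start")
    if end not in token_ends:
        problems.append(f"end {end} is not a token end")
    return problems


text = "I am .NET Developer"
tokens = [(0, 1, "I"), (2, 4, "am"), (6, 9, "NET"), (10, 19, "Developer")]
entity = (5, 19, ".NET Developer")

# The annotated span really is ".NET Developer" in the raw text.
print(text[entity[0]:entity[1]])

# The end offset (19) matches the end of "Developer", but the start offset (5)
# points at ".", which is not the start of any token.
print(misaligned_boundaries(tokens, entity[0], entity[1]))
```

Under this reading, the annotation itself is consistent with the text; the mismatch comes from the tokenizer output not containing a token that begins at the `.` character.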
NLU config
language: en
pipeline:
  - name: LanguageModelTokenizer
  - name: LanguageModelFeaturizer
    model_weights: "distilbert-base-uncased"
    model_name: "distilbert"
  - name: RegexFeaturizer
    "case_sensitive": False
  - name: DIETClassifier
    batch_strategy: balanced
    epochs: 25
    constrain_similarities: true
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
After training, my model cannot detect ".Net Developer" as the entity JOB.
I have no idea how to fix the warning above. Any suggestions are welcome.