NLU training data issue

I am new to RASA and am exploring its NLU module. I am trying to train a model with the sample data below, but RASA emits the warnings shown further down. Can anybody suggest what is wrong here?

Pipeline:

language: "en"

pipeline:
  - name: "SpacyNLP"
  - name: "SpacyTokenizer"
  - name: "RegexFeaturizer"
  - name: "SpacyFeaturizer"
  - name: "CRFEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "SklearnIntentClassifier"

Training data:

{
  "rasa_nlu_data": {
    "lookup_tables": [],
    "common_examples": [
      {
        "text": "cancellation fee for waiver code TEST12",
        "intent": "extract_waiver_code",
        "entities": [
          { "start": 59, "end": 63, "entity": "waiver_code", "value": "TEST12" }
        ]
      },
      {
        "text": "Waiver code 12TEST to waive change fee only",
        "intent": "extract_waiver_code",
        "entities": [
          { "start": 38, "end": 42, "entity": "waiver_code", "value": "12TEST" }
        ]
      },
      {
        "text": "XYZ is waiving change fees due to tropical storm Sally",
        "intent": "waiver_reason",
        "entities": [
          { "start": 71, "end": 76, "entity": "reason_name", "value": "storm" }
        ]
      },
      {
        "text": "ABC is waiving change fees due to storm Sally",
        "intent": "waiver_reason",
        "entities": [
          { "start": 71, "end": 76, "entity": "reason_name", "value": "storm" }
        ]
      }
    ]
  }
}

RASA NLU training warnings/errors:

/opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Misaligned entity annotation in message 'cancellation fee for waiver code TEST12' with intent 'extract_waiver_code'. Make sure the start and end values of entities in the training data match the token boundaries (e.g. entities don't include trailing whitespaces or punctuation). More info at https://rasa.com/docs/rasa/nlu/training-data-format/

/opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Misaligned entity annotation in message 'Waiver code 12TEST to waive change fee only' with intent 'extract_waiver_code'. Make sure the start and end values of entities in the training data match the token boundaries (e.g. entities don't include trailing whitespaces or punctuation). More info at https://rasa.com/docs/rasa/nlu/training-data-format/

/opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Misaligned entity annotation in message 'XYZ is waiving change fees due to tropical storm Sally' with intent 'waiver_reason'. Make sure the start and end values of entities in the training data match the token boundaries (e.g. entities don't include trailing whitespaces or punctuation). More info at https://rasa.com/docs/rasa/nlu/training-data-format/

/opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Misaligned entity annotation in message 'ABC is waiving change fees due to storm Sally' with intent 'waiver_reason'. Make sure the start and end values of entities in the training data match the token boundaries (e.g. entities don't include trailing whitespaces or punctuation). More info at https://rasa.com/docs/rasa/nlu/training-data-format/

2020-10-16 00:08:38 INFO rasa.nlu.model - Finished training component.
2020-10-16 00:08:38 INFO rasa.nlu.model - Starting to train component EntitySynonymMapper

/opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Found conflicting synonym definitions for ''. Overwriting target 'TEST12' with 'storm'. Check your training data and remove conflicting synonym definitions to prevent this from happening. More info at https://rasa.com/docs/rasa/nlu/training-data-format/#entity-synonyms

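It means that the start and end indices of your entities don't align with word boundaries. They are character offsets into the text, with end exclusive. In your first example, "TEST12" occupies characters 33 to 39, but the annotation says 59 to 63, which is past the end of the 39-character message.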
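One way to get the offsets right is to compute them from the text instead of counting by hand. The following is a minimal sketch, not part of Rasa: `find_entity_span` is a hypothetical helper, and it assumes each entity value occurs exactly once in its message (it returns the first match).

```python
def find_entity_span(text, value):
    """Return (start, end) character offsets of value in text, end exclusive.

    Hypothetical helper for illustration; only the first occurrence is used.
    """
    start = text.find(value)
    if start == -1:
        raise ValueError(f"{value!r} not found in {text!r}")
    return start, start + len(value)


# The four training examples from the question, as (text, entity value) pairs.
examples = [
    ("cancellation fee for waiver code TEST12", "TEST12"),
    ("Waiver code 12TEST to waive change fee only", "12TEST"),
    ("XYZ is waiving change fees due to tropical storm Sally", "storm"),
    ("ABC is waiving change fees due to storm Sally", "storm"),
]

for text, value in examples:
    start, end = find_entity_span(text, value)
    # Slicing with the computed offsets must give back the entity value.
    assert text[start:end] == value
    print(f'"start": {start}, "end": {end}, "value": "{value}"')

# The original annotation (59, 63) points past the end of the message,
# so the slice is an empty string.
print(repr(examples[0][0][59:63]))
```

For the data above this gives 33/39, 12/18, 43/48 and 34/39. The last line prints `''`, which is probably why the synonym warning mentions conflicting definitions for `''`: with out-of-range offsets, every annotated span is the same empty string while the values differ.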


Thanks. It's working after correcting the token positions.