RASA NLU Training Data - Entity Position

Hi Team,

While training RASA NLU, why do we need the start and end positions of the entities in the training data when we are already specifying the entity value?

{
  "text": "What's the weather in Berlin at the moment?",
  "intent": "inform",
  "entities": [{"start": 22,
                "end": 28,
                "value": "Berlin",
                "entity": "location"
               }]
}

As in the example above, we are already telling the model that the value is Berlin, so why do we need start and end here? Isn't that redundant?

Hello @gopeekrishnan. Sorry for the late response on this. The start and end positions are important because they tell the model which characters of the message to extract as the entity and train on. The value of the entity can differ from the text in the original message. For example, if you want to train your NLU model to know that New York City and NYC are the same, your training data would look like the example below:

[{
  "text": "I moved to New York City",
  "intent": "inform_relocation",
  "entities": [{"value": "nyc",
                "start": 11,
                "end": 24,
                "entity": "city"
               }]
},
{
  "text": "I got a new flat in NYC.",
  "intent": "inform_relocation",
  "entities": [{"value": "nyc",
                "start": 20,
                "end": 23,
                "entity": "city"
               }]
}]
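
If it helps, here is a minimal Python sketch of how the offsets relate to the text. It is only an illustration: `make_entity` is a hypothetical helper I made up for this post, not part of Rasa. It assumes, as in the examples above, that `start` is the index of the first character of the entity and `end` is one past the last character, so that `text[start:end]` gives back the span exactly as it appears in the message.

```python
def make_entity(text, surface, entity, value=None):
    """Build one entity annotation for `surface` as it appears in `text`.

    Hypothetical helper for illustration only; not a Rasa API.
    """
    start = text.find(surface)  # index of the first character of the span
    if start == -1:
        raise ValueError(f"{surface!r} does not occur in {text!r}")
    end = start + len(surface)  # exclusive, so text[start:end] == surface
    return {
        "start": start,
        "end": end,
        # The value defaults to the literal span but may be a normalized form.
        "value": surface if value is None else value,
        "entity": entity,
    }


text = "I moved to New York City"
entity = make_entity(text, "New York City", "city", value="nyc")

# The offsets point at the literal span; the value is the normalized form.
assert text[entity["start"]:entity["end"]] == "New York City"
assert entity["value"] == "nyc"
```

Note that `str.find` only locates the first occurrence, so a message that mentions the same string twice would need the offsets set by hand. A check like the `assert` above is also a cheap way to catch off-by-one mistakes in hand-written training data.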