Issue identifying multiple entities (model lacks generalization) using rasa/LaBSE

I am having trouble extracting multiple entities from one message using the DIET classifier with rasa/LaBSE features. My configuration is:

pipeline:
  - name: LanguageModelTokenizer
  - name: RegexFeaturizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "rasa/LaBSE"
  - name: DIETClassifier
    epochs: 50

The training examples are annotated in the following format:

I want to travel from [Bombay]{"entity": "srccity", "start": "22", "end": "28", "value": "Bombay"} to [Bangalore]{"entity": "destcity", "start": "32", "end": "41", "value": "Bangalore"}

The model does not generalise: both entities in a message get the same label, either source or destination. For example, for "I want to travel from Bangalore to London", the model tags both cities as the destination city. Am I missing anything? Any suggestions would be appreciated.

I doubt that LaBSE is the main culprit here. Just to check, how many training examples are you using? Also, did you consider using entity roles and groups for this feature?
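
In case it helps, here is a rough sketch of what the roles approach could look like: a single entity type with two roles instead of two separate entity types. The entity name `city`, the role names `source` and `destination`, and the intent name `book_travel` are placeholders I made up, and the layout assumes the YAML training data format (Rasa 2.0 or later), so adapt it to your own data and version.

```yaml
# data/nlu.yml (sketch; names below are hypothetical placeholders)
nlu:
- intent: book_travel
  examples: |
    - I want to travel from [Bombay]{"entity": "city", "role": "source"} to [Bangalore]{"entity": "city", "role": "destination"}
    - I want to travel from [Bangalore]{"entity": "city", "role": "source"} to [London]{"entity": "city", "role": "destination"}

# domain.yml (sketch): declare the roles on the entity
entities:
  - city:
      roles:
        - source
        - destination
```

The idea is that DIET only has to learn one "city" entity and then decides the role from the surrounding words ("from" vs. "to"). It tends to help if the same city names appear in both roles across your examples, so the model cannot simply memorise which city is a source and which is a destination.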

Also, 50 epochs is not a lot for DIET. Try increasing it to something like 200-300.
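
As a sketch, that would only change the last component of the pipeline you posted. The value 200 is just the lower end of the range above, not a tuned number, so treat it as a starting point.

```yaml
pipeline:
  - name: LanguageModelTokenizer
  - name: RegexFeaturizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "rasa/LaBSE"
  - name: DIETClassifier
    epochs: 200  # assumption: raised from 50; try values in the 200-300 range
```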