CRFEntityExtractor or DIETClassifier splits one entity into multiple words

I updated from RASA 1.10.2 to 1.10.14.

My pipeline is:

language: en
pipeline:
  - name: SpacyNLP
    model: en_core_web_md
  - name: ConveRTTokenizer
  - name: ConveRTFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CRFEntityExtractor
  - name: DIETClassifier
    epochs: 50
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: SpacyEntityExtractor
  - name: "DucklingHTTPExtractor"
    url: "http://0.0.0.0:8000"
    locale: "en_GB"
    timezone: "US/Pacific"
    timeout: 3

In RASA 1.10.2, CRFEntityExtractor was working fine, but in 1.10.14 it splits a single entity into multiple tokens.

Please help.

If more information about my NLU data is needed, I can provide it.

Thanks!

Looking for help, @RASA Team.

Check this post: "Introducing DIET: state-of-the-art architecture that outperforms fine-tuning BERT and is 6X faster to train"

I think there’s an issue with using spaCy together with ConveRT. They each come with their own tokenizer, and I can imagine that they’re not playing nicely with each other. I also wonder: is there a particular reason you’re using the spaCy entity extractor? Are there specific pretrained entities in en_core_web_md that you’re interested in?
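
If the pretrained spaCy entities turn out not to be needed, one way to test this idea is to train once without the spaCy components and see whether the entity still gets split. The config below is only a sketch of that experiment, not a confirmed fix. It is the pipeline from the original post with SpacyNLP and SpacyEntityExtractor removed and everything else left unchanged:

```yaml
language: en
pipeline:
  # SpacyNLP and SpacyEntityExtractor removed for this experiment,
  # so ConveRT is the only tokenizer/featurizer family in the pipeline.
  - name: ConveRTTokenizer
  - name: ConveRTFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CRFEntityExtractor
  - name: DIETClassifier
    epochs: 50
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: DucklingHTTPExtractor
    url: "http://0.0.0.0:8000"
    locale: "en_GB"
    timezone: "US/Pacific"
    timeout: 3
```

If the entity is still split with this config, the spaCy/ConveRT combination is probably not the cause, and sharing a few NLU training examples would help narrow it down.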