SpacyEntityExtractor also extracts special characters

Hi guys, I am using SpacyEntityExtractor to extract locations from text. This works very well, but if the entity is at the end of a sentence that ends with a special character, the special character is appended to the entity. I hope you can help me :smile:


pipeline:
  - name: SpacyNLP
    model: de_core_news_lg
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: SpacyEntityExtractor
    dimensions: ["PER", "LOC"]
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    entity_recognition: True
    epochs: 300
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 200
    retrieval_intent: faq
  - name: ResponseSelector
    epochs: 200
    retrieval_intent: selfservice
  - name: ResponseSelector
    epochs: 200
    retrieval_intent: chitchat
  - name: FallbackClassifier
    threshold: 0.5

policies:
  - name: MemoizationPolicy
  - name: RulePolicy
  - name: UnexpecTEDIntentPolicy
    max_history: 5
    epochs: 200
  - name: TEDPolicy
    max_history: 5
    epochs: 200
    constrain_similarities: true



    type: text
    mappings:
    - type: from_entity
      entity: LOC

Versions: Rasa 3.3.1, Python 3.8, OS: Linux

Ah, I got it. I forgot to annotate the examples in nlu.yml with the necessary entity.

My solution:

  examples: |
    - Wie wird das Wetter in [Berlin]{"entity": "LOC", "value": "Berlin"}
    - Wird es heute warm in [New York]{"entity": "LOC", "value": "New York"}
    - Was sagt das Wetter heute in [Peking]{"entity": "LOC", "value": "Peking"}
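
For anyone who wants to double-check their own annotations: here is a minimal Python sketch that pulls the annotated span, entity name, and value out of an example line like the ones above. The regex and the `parse_annotations` helper are my own illustration, not Rasa's actual parser, and it only assumes the `[text]{"entity": ..., "value": ...}` form where `value` is optional.

```python
import json
import re

# Hypothetical helper pattern (not part of Rasa): matches the inline form
# [surface text]{"entity": "...", "value": "..."} used in nlu.yml examples.
ANNOTATION = re.compile(r"\[([^\]]+)\](\{[^}]*\})")


def parse_annotations(example):
    """Return (surface_text, entity, value) for each annotated span in one example."""
    spans = []
    for text, raw in ANNOTATION.findall(example):
        meta = json.loads(raw)
        # "value" is optional; fall back to the surface text when it is missing.
        spans.append((text, meta["entity"], meta.get("value", text)))
    return spans


example = 'Wie wird das Wetter in [Berlin]{"entity": "LOC", "value": "Berlin"}'
print(parse_annotations(example))  # [('Berlin', 'LOC', 'Berlin')]
```

Comparing the first and third item of each tuple is a quick way to spot a value that was left over from copy and paste.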