CRFEntityExtractor or DIETClassifier splits one entity into multiple words

ridhimagarg · September 24, 2020, 1:11pm

I updated from RASA 1.10.2 to 1.10.14.

My pipeline is:

language: en
pipeline:
  - name: SpacyNLP
    model: en_core_web_md
  - name: ConveRTTokenizer
  - name: ConveRTFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CRFEntityExtractor
  - name: DIETClassifier
    epochs: 50
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: SpacyEntityExtractor
  - name: "DucklingHTTPExtractor"
    url: "http://0.0.0.0:8000"
    locale: "en_GB"
    timezone: "US/Pacific"
    timeout: 3

In RASA 1.10.2, CRFEntityExtractor was working fine, but in 1.10.14 it splits a single entity into multiple tokens.

Please help.

If more information about my NLU data is required, I will provide it.

Thanks!

ridhimagarg · September 25, 2020, 6:27am

Looking for help, @RASA Team.

UlisesVD · September 25, 2020, 5:52pm

Check this post: Introducing DIET: state-of-the-art architecture that outperforms fine-tuning BERT and is 6X faster to train

koaning · October 5, 2020, 7:49am

I think there’s an issue with using spaCy together with ConveRT. They each come with their own tokenizer, and I can imagine that they’re not playing nicely with each other. I also wonder: is there a particular reason you’re using the spaCy entity extractor? Are there specific pretrained entities in en_core_web_md that you’re interested in?
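
If the pretrained spaCy entities turn out not to be needed, one thing to try is a pipeline that leaves out the spaCy components, so that ConveRT's tokenizer is the only one involved. The following is only a sketch under that assumption: it is the pipeline from the original post with SpacyNLP and SpacyEntityExtractor removed and nothing else changed. It has not been tested against 1.10.14, so whether it actually stops the entity splitting is an open question.

```yaml
# Sketch only: the original pipeline minus the spaCy components.
# Assumption: no pretrained en_core_web_md entities are required.
language: en
pipeline:
  - name: ConveRTTokenizer          # now the only tokenizer in play
  - name: ConveRTFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CRFEntityExtractor
  - name: DIETClassifier
    epochs: 50
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: "DucklingHTTPExtractor"   # settings copied unchanged from the original post
    url: "http://0.0.0.0:8000"
    locale: "en_GB"
    timezone: "US/Pacific"
    timeout: 3
```

If the split entities disappear after retraining with this configuration, that would point to the spaCy/ConveRT combination as the cause. If they remain, the NLU training data would be the next thing to look at.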
