DIETClassifier splits one entity into a list of tokens

nadachaabani1 · November 25, 2020, 1:43pm

I am migrating from Rasa 1.5.3 to 1.10.14.

In the previous version, entity extraction worked fine, but now any slot value that contains a special character such as / or - is split into a list of tokens. For example, I want to extract IT / Engineering as a department entity, but it is extracted as the list ["it", "engineering"].

This is my NLU data

intent: inform

i am looking for IT / Engineering fields

i work for analyst / business strategy services industry in operations department

i am an engineer in Manufacturing / Aerospace

The department slot in the domain looks like:

slots:
  department:
    type: unfeaturized

And this is the pipeline I am using:

# Configuration for Rasa NLU.
language: en
pipeline:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  batch_strategy: sequence
  epochs: 90
  ranking_length: 5
- name: EntitySynonymMapper
- name: "ResponseSelector"
  retrieval_intent: zigzag
  scale_loss: false

I think the DIETClassifier component causes this, because when I remove it everything works fine. Which hyperparameter should I change to fix this problem, please?

Arjaan · December 1, 2020, 6:17pm

Hi @nadachaabani1,

Please see the explanation of this behavior, and possible solutions, in this forum post.

nadachaabani1 · December 7, 2020, 9:44am

Thank you so much @Arjaan. I'm now using SpacyTokenizer and everything works fine.
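
For anyone landing here with the same problem, the change described above might look roughly like the config below. This is only a sketch based on the pipeline posted earlier in the thread, with WhitespaceTokenizer swapped for SpacyTokenizer. The SpacyNLP entry and the model name en_core_web_md are assumptions (they are not given in the thread); SpacyTokenizer needs a spaCy model loaded earlier in the pipeline, and any installed English model should do.

```yaml
# Sketch only: the pipeline from the first post, with the tokenizer replaced.
language: en
pipeline:
- name: SpacyNLP               # loads the spaCy model; must come before SpacyTokenizer
  model: en_core_web_md        # assumption: replace with whichever English model is installed
- name: SpacyTokenizer         # replaces WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier         # settings unchanged from the original post
  batch_strategy: sequence
  epochs: 90
  ranking_length: 5
- name: EntitySynonymMapper
- name: ResponseSelector
  retrieval_intent: zigzag
  scale_loss: false
```

The model has to be retrained after changing the tokenizer, since the tokens seen during training change with it.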
