Pipeline components priority

Hello all,

I have a question regarding the pipeline components. I am using the DIETClassifier, DucklingEntityExtractor, and SpacyEntityExtractor in my pipeline.

The DucklingEntityExtractor is used to extract email addresses and phone numbers. The SpacyEntityExtractor is used to extract names of people. The DIETClassifier is used to extract other, custom entities, for example cuisine type.

However, the DIETClassifier also extracts names of people, email IDs, and phone numbers, and when it does, I get a list of values for these entities. Is there any way to resolve this so that the DIETClassifier only extracts custom entities such as cuisine type?

In your training data, you don’t need to tag entities that you don’t want DIET to extract. SpaCy and Duckling don’t use your training data; they work directly on the input message and extract entities based on patterns (Duckling) and pre-trained models (SpaCy).

You don’t need to “train” those components, so if you remove the annotations you have added for names, emails, and phone numbers from your training data and retrain, DIET will stop extracting them.
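
If there are a lot of examples to clean up, something like the following might help. It is only a rough sketch, not a Rasa utility: it assumes the annotations use the short `[text](entity)` form, and the entity names `name`, `email`, and `phone_number` are placeholders for whatever you actually called them. It does not handle the `[text]{"entity": "..."}` form.

```python
import re

# Matches the short annotation form: [entity text](entity_name)
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def strip_annotations(example, entities_to_drop):
    """Remove annotations for the given entity names, keeping the plain text."""

    def replace(match):
        text, entity = match.group(1), match.group(2)
        # Leave annotations for entities DIET should still learn untouched.
        return text if entity in entities_to_drop else match.group(0)

    return ANNOTATION.sub(replace, example)


# Hypothetical entity names; replace with the ones in your own data.
DROP = {"name", "email", "phone_number"}

example = (
    "book an [italian](cuisine) table for [John Smith](name), "
    "call [555-1234](phone_number)"
)
cleaned = strip_annotations(example, DROP)
print(cleaned)
# book an [italian](cuisine) table for John Smith, call 555-1234
```

The `cuisine` annotation is kept, so DIET still learns it, while names and phone numbers are left to SpaCy and Duckling.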