Mitie Entity Extraction Not working properly

I am testing how MitieEntityExtractor compares to Duckling or spaCy at identifying entities such as phone_no, time, date, home_address, email_address, amount_of_money and organisation, but there is very little documentation on this. (I am using Rasa version 2.6.)

  1. Firstly, I am unsure if I have set this up correctly. Can someone tell me if this is correct? This is my config.yml that has the MitieEntityExtractor:
pipeline:
# No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# If you'd like to customize it, uncomment and adjust the pipeline.
# See https://rasa.com/docs/rasa/tuning-your-model for more information.
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
  - name: "MitieNLP"
    # language model to load
    model: "data/total_word_feature_extractor.dat"
  - name: "MitieEntityExtractor"
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1

I found the total_word_feature_extractor.dat file at https://github.com/SigmaQuan/rasa_chatbot/tree/master/data.

In addition, my nlu.yml data has been labelled and they look a bit like this:

- intent: callback
  examples: |
    - I would like to arrange a callback [at 3](time)
    - I would like to arrange a callback
    - call me back tomorrow [at 1 pm](time)
    - arrange call [Monday the 13th](date) [at 13:20](time)


  2. Once I trained the bot, I get duplicate entity labels. It seems that the DIETClassifier and the MitieEntityExtractor each extract their own entities. Here is an example from the events tracker:
{"entity":"date","start":0,"end":8,"confidence_entity":0.8962544202804565,"value":"Thursday","extractor":"DIETClassifier"},
{"entity":"date","start":10,"end":25,"confidence_entity":0.46212172508239746,"value":"13th of January","extractor":"DIETClassifier"},

{"entity":"date","value":"Thursday 13th of January","start":0,"end":25,"confidence":null,"extractor":"MitieEntityExtractor"}]

Why does this happen? Is it because my configuration is incorrect?

Thank you in advance.

Hi,

You can remove the DIETClassifier from your config file. Try commenting out the lines below:

  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true

I just tried this configuration. The bot stops working: intents are no longer identified from user input.

@anne576 - you can set entity_recognition to false for DIET; this way DIET won’t try to predict entities:

- name: DIETClassifier
  epochs: 100
  constrain_similarities: true
  entity_recognition: false
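
For reference, here is a rough sketch of how the full pipeline from the question might look with this change applied. It assumes Rasa 2.x and reuses the component order and model path from the original config.yml; the only new line is entity_recognition on DIET, and the comments are mine:

```yaml
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
    # DIET keeps classifying intents but no longer predicts entities
    entity_recognition: false
  - name: MitieNLP
    # path taken from the question; adjust to wherever the .dat file lives
    model: "data/total_word_feature_extractor.dat"
  # MITIE is now the only trained entity extractor in the pipeline
  - name: MitieEntityExtractor
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1
```

After retraining, the tracker should only list entities whose "extractor" field is MitieEntityExtractor. Keeping DIET in the pipeline matters because it is the intent classifier here, which is why removing it entirely broke intent recognition.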

Souvik Ghosh is right. Another way is to remove the entity annotations from your training data:

- call me back tomorrow at 1 pm

In your original training data, you have told DIET how to extract the entity.