Lookup Table not working for DIET Classifier + RegexFeaturizer

Hey Folks,

I am trying to extract entity insurance_provider (name of insurance company) using DIET Classifier + RegexFeaturizer & lookup table. Entity extraction is failing for a lot of test examples, even though the entity value is present in the lookup table. :thinking:

I’ve added the config.yml file, lookup table, training & test examples here:

What I’ve observed is that if the test example has the same sentence structure as a training example, entity extraction works. For example, take this training example (where the entity is royal sundaram):

i renewed it from royal sundaram

If I then use the example below for testing, I am able to extract the entity (i.e. tata aig):

i renewed it from tata aig

But if the test example has a different sentence structure that was not seen in training (e.g. I chose tata aig), extraction fails. Please also note that this example, I chose tata aig, is classified to the right intent.

Any idea what I am missing?

@riya.shah Try using RegexEntityExtractor rather than RegexFeaturizer for annotated examples like the ones you have shown. Thanks

Any idea why RegexFeaturizer is not working, or what the advantage of using RegexEntityExtractor is? Since DIET Classifier is also an entity extractor, I fear the same entity value might get extracted twice with two extractors in the pipeline.

@riya.shah The Rasa team does not mention any substantial difference in the docs except this:

When using lookup tables with RegexFeaturizer, provide enough examples for the intent or entity you want to match so that the model can learn to use the generated regular expression as a feature.

When using lookup tables with RegexEntityExtractor, provide at least two annotated examples of the entity so that the NLU model can register it as an entity at training time.
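To make that difference concrete, here is a minimal sketch in plain Python. It is not Rasa's implementation; the lookup entries come from this thread, and the function names (`build_pattern`, `regex_feature`, `regex_extract`) are hypothetical. It only illustrates the idea that a featurizer emits a per-token flag that a downstream model must still learn to use, while an extractor turns every match into an entity regardless of sentence structure.

```python
import re

# Hypothetical lookup table entries, taken from the examples in this thread.
LOOKUP = ["royal sundaram", "tata aig", "tata", "icici", "lic", "hdfc ergo"]


def build_pattern(entries):
    """Compile lookup entries into one case-insensitive alternation.

    Longer entries go first so that "tata aig" wins over "tata".
    """
    ordered = sorted(entries, key=len, reverse=True)
    alternation = "|".join(re.escape(e) for e in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


PATTERN = build_pattern(LOOKUP)


def regex_feature(tokens):
    """Featurizer-style output: one 0/1 flag per token.

    The flag is only an input feature; a model still has to learn from
    annotated examples that flagged tokens tend to be entities.
    """
    text = " ".join(tokens)
    # Character span of each token inside the joined text.
    offsets, pos = [], 0
    for tok in tokens:
        offsets.append((pos, pos + len(tok)))
        pos += len(tok) + 1
    flags = [0] * len(tokens)
    for m in PATTERN.finditer(text):
        for i, (start, end) in enumerate(offsets):
            if start < m.end() and end > m.start():
                flags[i] = 1
    return flags


def regex_extract(text, entity="insurance_provider"):
    """Extractor-style output: every match becomes an entity directly."""
    return [
        {"entity": entity, "value": m.group(0), "start": m.start(), "end": m.end()}
        for m in PATTERN.finditer(text)
    ]


print(regex_feature("I chose tata aig".split()))
print(regex_extract("I chose tata aig"))
```

In this sketch the extractor-style function finds tata aig in "I chose tata aig" even though that sentence structure never appeared in training, whereas the featurizer-style flags would still depend on what the model learned from them.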

@riya.shah As I told you in the other thread, the model will only look at and respond to what you have mentioned in the training examples; beyond that it will not respond. I faced this issue when I tried to build the same thing for countries: when I mentioned a country that was not in the lookup table, it did not work.

But try giving some more examples, at least 10; it may then learn the entity as you have annotated it:

- intent: in_share_vendor_details
  examples: |
    - Insurance agent from [ICICI](insurance_provider) had called me
    - its from [LIC](insurance_provider)
    - i can't recall , probably through [icici](insurance_provider)
    - I bought a policy from [Tata](insurance_provider)
    - [HDFC Ergo](insurance_provider)

Also delete the older trained model and re-train.

Hey @nik202 Yeah, I have given around 40 annotated examples for this intent :sweat_smile:. I just shared a few examples here so that others can get an idea.

@riya.shah Honestly :face_with_monocle: try deleting the old model and re-training it. I can see you will not stop. Also check this repo: rasa-demo/data/nlu/lookups at main · RasaHQ/rasa-demo · GitHub

As long as you give an example from the lookup table, it responds; as soon as you go out of scope of the lookup table, there is no response :frowning: Right? Is this what is happening?

See this also :slight_smile: Entity extraction with the new lookup table feature in Rasa NLU

@riya.shah Read this as well: python - How rasa_nlu using lookup_tables for entity extraction? - Stack Overflow

For exact or partial matching, you can switch to fuzzywuzzy.
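As a rough illustration of the fuzzy-matching idea, here is a small sketch. Note that it does not use fuzzywuzzy (a third-party package); it swaps in `difflib` from the Python standard library so it runs without extra installs. The `closest_provider` helper, the lookup entries, and the 0.8 cutoff are assumptions made for this example only.

```python
from difflib import get_close_matches

# Hypothetical lookup entries, taken from the examples in this thread.
LOOKUP = ["royal sundaram", "tata aig", "icici", "hdfc ergo"]


def closest_provider(candidate, entries=LOOKUP, cutoff=0.8):
    """Return the lookup entry most similar to `candidate`, or None.

    `cutoff` is the minimum similarity ratio (0..1) for a match; 0.8 is an
    arbitrary choice for this sketch, not a recommended value.
    """
    matches = get_close_matches(candidate.lower(), entries, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(closest_provider("Royal Sundram"))  # misspelled provider name
print(closest_provider("xyz"))            # nothing similar in the lookup
```

A helper like this could map a misspelled or partially typed provider name back onto a lookup entry after extraction; whether that is appropriate depends on how strict the matching needs to be.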

Hey @nik202 No, I have all the examples in lookup, let’s say I have “Tata” as an example in the lookup table. Now if I give this message: “I have renewed it from Tata” then it is able to extract Tata. But if I give this message: “I have chosen Tata”, then there are no entities extracted.

@riya.shah right !

I bought a policy from [Tata](insurance_provider)

What are Tata and insurance_provider here?

Tata is the entity value, insurance_provider is the entity type.

@riya.shah Yes, that’s why it is always able to extract Tata, as it is mentioned in the lookup table.