I am trying to extract the entity insurance_provider (the name of an insurance company) using DIETClassifier + RegexFeaturizer and a lookup table. Entity extraction fails for many test examples, even though the entity value is present in the lookup table.
I’ve added the config.yml file, lookup table, training & test examples here:
What I’ve observed is that entity extraction works if the test example keeps the same sentence structure as a training example. For example, take this training example (where the entity is royal sundaram):
i renewed it from royal sundaram
If I use the example below for testing, the entity (i.e. tata aig) is extracted:
i renewed it from tata aig
But if the test example has a sentence structure that was not seen in training (e.g. I chose tata aig), extraction fails. Please note that this example, I chose tata aig, is still classified with the right intent.
Any idea why RegexFeaturizer is not working, or whether there is any advantage to using RegexEntityExtractor instead?
Because DIETClassifier is also an entity extractor, I fear the same entity value might get extracted twice if there are 2 extractors in the pipeline.
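One way the duplicate-extraction concern could be handled is to let only one component extract entities. The fragment below is a minimal sketch, not the config from this thread: the tokenizer, the featurizers and the epochs value are assumptions, and the option names should be checked against the docs for your Rasa version.

```yaml
# config.yml (sketch under assumptions; not the original pipeline)
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
    case_sensitive: false        # assumption: match "icici" and "ICICI" alike
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
    epochs: 100                  # assumed value
    entity_recognition: false    # DIET predicts intents only in this sketch
  - name: RegexEntityExtractor
    case_sensitive: false
    use_lookup_tables: true      # extract values listed in lookup tables
    use_regexes: false
```

Note that turning off entity_recognition stops DIET from extracting any entity, so this only fits if every entity you need can come from lookup tables or regexes.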
@riya.shah There is no solid difference that the Rasa team mentions in the docs, except this:
When using lookup tables with RegexFeaturizer, provide enough examples for the intent or entity you want to match so that the model can learn to use the generated regular expression as a feature.
When using lookup tables with RegexEntityExtractor, provide at least two annotated examples of the entity so that the NLU model can register it as an entity at training time.
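For reference, a lookup table in the NLU training data could look roughly like the fragment below. This is only a sketch: the entries are the provider names mentioned in this thread, not the actual lookup file.

```yaml
# nlu.yml (sketch; entries taken from the examples in this thread)
nlu:
  - lookup: insurance_provider
    examples: |
      - royal sundaram
      - tata aig
      - Tata
      - ICICI
      - LIC
      - HDFC Ergo
```

With RegexFeaturizer these entries only become a feature that the model has to learn to use, while RegexEntityExtractor matches them directly in the message text.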
@riya.shah As I told you in the other thread, the model will only pick up and respond to what you have provided in the training examples; beyond that it will not respond. I faced this issue when I tried to build this for countries: when I mentioned a country that was not in the lookup table, it did not work.
But try adding some more examples, at least 10; it may then learn the entity as you annotated it:
- intent: in_share_vendor_details
  examples: |
    - Insurance agent from [ICICI](insurance_provider) had called me
    - its from [LIC](insurance_provider)
    - i can't recall , probably through [icici](insurance_provider)
    - I bought a policy from [Tata](insurance_provider)
    - [HDFC Ergo](insurance_provider)
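Since extraction seems to depend on the sentence structure, the extra examples could also vary the wording around the entity. The fragment below is a sketch: the first two sentences are the failing test messages from this thread, and the last two are hypothetical additions.

```yaml
# Sketch: more varied sentence structures for the same intent.
# The last two examples are hypothetical, not from the original data.
- intent: in_share_vendor_details
  examples: |
    - I chose [tata aig](insurance_provider)
    - I have chosen [Tata](insurance_provider)
    - we went with [royal sundaram](insurance_provider) this time
    - my insurer is [LIC](insurance_provider)
```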
Hey @nik202 No, I have all the examples in the lookup table. Let’s say I have “Tata” as an entry in the lookup table. If I send the message “I have renewed it from Tata”, Tata is extracted. But if I send the message “I have chosen Tata”, no entities are extracted.