Custom Entity and Relation Extraction

sourayb · August 22, 2018, 1:30pm

Hello everyone, I have a specific requirement as below:

I have a document containing the text "The named insurer is ABC and his Date of Birth is 1/01/2001."

For the above text, my training_data.json contains:

{
  "rasa_nlu_data": {
    "common_examples": [
      {
        "text": "The named insurer is ABC and his Date of Birth is 1/01/2001. ",
        "intent": "WhoIsPolicyHolder",
        "entities": [
          { "start": , "end": , "value": "ABC", "entity": "NAMED INSURED" }
        ]
      },
      {
        "text": "The named insurer is ABC and his Date of Birth is 1/01/2001. ",
        "intent": "WhatIsDOBofInsurer",
        "entities": [
          { "start": , "end": , "value": "1/01/2001", "entity": "DOB" }
        ]
      }
    ]
  }
}

I have trained a model with 10 examples for each of the intents WhoIsPolicyHolder and WhatIsDOBofInsurer. This is the output I got:

{
  'intent': {'name': 'WhatIsDOBOfPolicyHolder', 'confidence': 0.5965509303964964},
  'entities': [],
  'intent_ranking': [
    {'name': 'WhatIsDOBOfPolicyHolder', 'confidence': 0.5965509303964964},
    {'name': 'WhoIsPolicyHolder', 'confidence': 0.4034490696035035}
  ],
  'text': 'The named insurer is XYZ and his Date of Birth is 2/02/2002.'
}

Can we get output that includes the WhoIsPolicyHolder and WhatIsDOBofInsurer relations (the entity and its value from the test text) instead of only the intent with its confidence score?

deepshet · August 22, 2018, 2:26pm

You seem to have multiple intents for the same text, so to pick that up you need the approach described here: https://blog.rasa.com/how-to-handle-multiple-intents-per-input-using-rasa-nlu-tensorflow-pipeline/

Your second problem is that your entities aren't being picked up, which might indicate an issue with your data. In the response you can see that your entities array is empty (also be careful with dates!).
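A minimal sketch of how to read both pieces of information out of a parse result, assuming the dictionary shape shown in the output above. The helper name `summarize` is hypothetical, and the second result is an invented illustration of what a successful extraction might look like, not real model output.

```python
def summarize(parse_result):
    """Return (ranked intents, entity -> value mapping) from a parse result."""
    intents = [
        (ranking["name"], round(ranking["confidence"], 3))
        for ranking in parse_result.get("intent_ranking", [])
    ]
    entities = {
        entity["entity"]: entity["value"]
        for entity in parse_result.get("entities", [])
    }
    return intents, entities


# The result posted in the question: the entities list is empty.
posted = {
    "intent": {"name": "WhatIsDOBOfPolicyHolder", "confidence": 0.5965509303964964},
    "entities": [],
    "intent_ranking": [
        {"name": "WhatIsDOBOfPolicyHolder", "confidence": 0.5965509303964964},
        {"name": "WhoIsPolicyHolder", "confidence": 0.4034490696035035},
    ],
    "text": "The named insurer is XYZ and his Date of Birth is 2/02/2002.",
}

# Hypothetical result, for illustration only, with both entities extracted.
hypothetical = dict(posted, entities=[
    {"start": 21, "end": 24, "value": "XYZ", "entity": "NAMED INSURED"},
    {"start": 50, "end": 59, "value": "2/02/2002", "entity": "DOB"},
])

posted_intents, posted_entities = summarize(posted)
hypo_intents, hypo_entities = summarize(hypothetical)

print(posted_intents)
print(posted_entities or "no entities extracted")
print(hypo_entities)
```

If the entity mapping comes back empty, as it does for the posted result, the problem lies in entity extraction (training data or pipeline) rather than in how the output is read.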

sourayb · August 22, 2018, 3:00pm

Hi Deepak, you are right, my entities are not picked up in the output. Can you please help me with what I can do to fix that?

deepshet · August 22, 2018, 4:03pm

It's hard to say without your input data (and your pipeline etc.). It's usually incorrect or too little training data. If you can share your data and pipeline file, someone might take a look. If you can't, then reduce the problem (to, say, one entity and one intent), train just that, and see.
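One way training data can end up incorrect is missing or wrong character offsets: in the JSON from the first post, "start" and "end" are blank. A minimal sketch for computing them, assuming the Rasa NLU convention that text[start:end] equals the entity value. The helper name `annotate` is hypothetical, and it only locates the first occurrence of each value, so repeated values would need extra handling.

```python
import json


def annotate(text, intent, entities):
    """Build one training example, computing start/end for each entity value.

    `entities` maps an entity name to the literal value expected in `text`.
    """
    annotated = []
    for entity, value in entities.items():
        start = text.find(value)  # first occurrence only
        if start == -1:
            raise ValueError(f"{value!r} not found in {text!r}")
        annotated.append({
            "start": start,
            "end": start + len(value),  # exclusive end offset
            "value": value,
            "entity": entity,
        })
    return {"text": text, "intent": intent, "entities": annotated}


example = annotate(
    "The named insurer is ABC and his Date of Birth is 1/01/2001.",
    "WhoIsPolicyHolder",
    {"NAMED INSURED": "ABC", "DOB": "1/01/2001"},
)
print(json.dumps(example, indent=2))
```

Annotating both entities on the same example, under a single intent, also matches the reduced single-intent setup suggested above.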

sourayb · August 23, 2018, 4:09am

Hi Deepak,

My input data is: --some text-- The named insurer is ABC and his Date of Birth is 1/01/2001 --some text-- (repeated multiple times with different names and DOBs).

My pipeline is:

language: "en"
pipeline:
- name: "nlp_spacy"
  model: "en"
- name: "tokenizer_spacy"
- name: "ner_crf"
- name: "intent_featurizer_spacy"
- name: "intent_classifier_sklearn"

deepshet · August 23, 2018, 5:33am

I tried with spacy_sklearn and it worked for me (single intent though; as before, if you want multiple intents you need tensorflow_embedding). Your pipeline has an incorrect order and is missing some components if you were intending to use spacy_sklearn; see https://rasa.com/docs/nlu/pipeline/#section-pipeline

language: en
pipeline: spacy_sklearn
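A small sketch of what "incorrect order and missing some components" means in practice, comparing the pipeline posted earlier with the component list of the spacy_sklearn template. The template contents below are an assumption, recalled from the Rasa NLU documentation of that period; check the pipeline page linked above for the version you have installed.

```python
# Assumption: components of the "spacy_sklearn" template, recalled from the
# Rasa NLU docs of that era; verify against your installed version.
SPACY_SKLEARN = [
    "nlp_spacy",
    "tokenizer_spacy",
    "intent_entity_featurizer_regex",
    "intent_featurizer_spacy",
    "ner_crf",
    "ner_synonyms",
    "intent_classifier_sklearn",
]

# The pipeline as posted in the question.
posted = [
    "nlp_spacy",
    "tokenizer_spacy",
    "ner_crf",
    "intent_featurizer_spacy",
    "intent_classifier_sklearn",
]

# Components the template has that the posted pipeline lacks.
missing = [name for name in SPACY_SKLEARN if name not in posted]

# Shared components, listed in the order the template uses.
template_order = [name for name in SPACY_SKLEARN if name in posted]
same_order = template_order == posted

print("missing:", missing)
print("same relative order as template:", same_order)
```

Using the template name, as in the two-line config above, sidesteps both differences.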

Here is the chatito file I used to generate data for the NLU

%[greet]
    Hi
    Hello
    Howdy

%[WhoIsPolicyHolder]('training': '100')
    The named insurer is @[NAMEDINSURED] and his Date of Birth is @[DOB]

@[NAMEDINSURED]
    ABC
    DEF
    GHI
    TEXT
    abdfgh
    bghtery
    qwerty
    singte
    AsdF
    BlahBlah

@[DOB]
    01/01/2001
    04/12/2013
    03/11/1987
    02/12/1987
    01/18/1964
    11/23/1945
    12/12/2012
    07/11/1999
    03/14/2017
    07/07/2007
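For readers without Chatito, a rough Python sketch of the same idea: fill the slots of the sentence template with every combination of values and record each entity's offsets as the text is built. This is not Chatito and does not reproduce its sampling ('training': '100'); the function names `fill` and `expand` are hypothetical, and only a few of the slot values above are used to keep the output short.

```python
import itertools
import re

TEMPLATE = "The named insurer is @[NAMEDINSURED] and his Date of Birth is @[DOB]"
SLOTS = {
    "NAMEDINSURED": ["ABC", "DEF", "GHI"],
    "DOB": ["01/01/2001", "04/12/2013"],
}
SLOT_RE = re.compile(r"@\[(\w+)\]")


def fill(template, values, intent):
    """Substitute slot values into the template, tracking entity offsets."""
    text, entities, pos = "", [], 0
    for match in SLOT_RE.finditer(template):
        text += template[pos:match.start()]  # literal text before the slot
        value = values[match.group(1)]
        entities.append({
            "start": len(text),
            "end": len(text) + len(value),  # exclusive end offset
            "value": value,
            "entity": match.group(1),
        })
        text += value
        pos = match.end()
    text += template[pos:]  # literal text after the last slot
    return {"text": text, "intent": intent, "entities": entities}


def expand(template, slots, intent):
    """Generate one example per combination of slot values."""
    names = list(slots)
    combos = itertools.product(*(slots[name] for name in names))
    return [fill(template, dict(zip(names, combo)), intent) for combo in combos]


examples = expand(TEMPLATE, SLOTS, "WhoIsPolicyHolder")
training_data = {"rasa_nlu_data": {"common_examples": examples}}

print(len(examples))
print(examples[0])
```

Because the offsets are computed while the sentence is assembled, every generated example carries both entities with consistent start and end values.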

sourayb · August 23, 2018, 6:09am

Thanks Deepak, let me try spacy_sklearn and see the output.

Also, if I put the following in my config.yml:

language: "en"
pipeline:
- name: "tensorflow_embedding"

it throws this error:

Exception: Failed to find component class for 'tensorflow_embedding'. Unknown component name. Check your configured pipeline and make sure the mentioned component is not misspelled. If you are creating your own component, make sure it is either listed as part of the component_classes in rasa_nlu.registry.py or is a proper name of a class in a module.

Any idea how to resolve this?

sourayb · August 23, 2018, 6:29am

Just changed config.yml to:

language: "en"
pipeline: "tensorflow_embedding"

and it works now.
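The likely reason for the difference: tensorflow_embedding is a pipeline template name, not a component name, so it is only understood when the pipeline value is a plain string. A toy model of that behaviour, not Rasa's actual code; the function `resolve_pipeline` is hypothetical, and the template contents are an assumption recalled from the Rasa NLU docs of that period.

```python
# Assumption: contents of the "tensorflow_embedding" template, recalled from
# the Rasa NLU docs of that era; other versions may differ.
TEMPLATES = {
    "tensorflow_embedding": [
        "tokenizer_whitespace",
        "ner_crf",
        "ner_synonyms",
        "intent_featurizer_count_vectors",
        "intent_classifier_tensorflow_embedding",
    ],
}
COMPONENT_NAMES = {name for names in TEMPLATES.values() for name in names}


def resolve_pipeline(pipeline):
    """Return the list of component names a pipeline setting stands for."""
    if isinstance(pipeline, str):
        # A string is treated as a template name and expanded.
        if pipeline not in TEMPLATES:
            raise ValueError(f"Unknown pipeline template '{pipeline}'")
        return list(TEMPLATES[pipeline])
    # A list is treated as explicit components; each name must be a component.
    names = [component["name"] for component in pipeline]
    for name in names:
        if name not in COMPONENT_NAMES:
            raise ValueError(f"Failed to find component class for '{name}'")
    return names


print(resolve_pipeline("tensorflow_embedding"))

try:
    resolve_pipeline([{"name": "tensorflow_embedding"}])
except ValueError as err:
    print(err)
```

In this model the string form expands to the full component list, while the list form fails because no single component is called tensorflow_embedding, which mirrors the error reported above.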
