Name entity not extracting

sibbsnb · March 23, 2019, 12:43am

language: "en"
pipeline: spacy_sklearn

The starter pack training data for getting the name doesn’t work for random names that are not listed in the training data. It should pick up any name, right?

Training data:

## intent:name
- My name is Juste <!-- Square brackets contain the value of entity while the text in parentheses is a label of the entity -->
- I am Josh
- I’m Lucy
- People call me Greg
- It’s David
- Usually people call me Amy
- My name is John
- You can call me Sam
- Please call me Linda
- Name name is Tom
- I am Richard
- I’m Tracy
- Call me Sally
- I am Philipp
- I am Charlie
- I am Charlie
- I am Ben
- Call me Susan
- Lucy
- Peter
- Mark
- Joseph
- Tan
- Pete
- Elon
- Penny
- name is Andrew
- I Lora
- Stan is my name
- Susan is the name
- Ross is my first name
- Bing is my last name
- Few call me as Angelina
- Some call me Julia
- Everyone calls me Laura
- I am Ganesh
- My name is Mike
- just call me Monika
- Few call Dan
- You can always call me Suraj
- Some will call me Andrew
- My name is Ajay
- I call Ding
- I’m Partia
- Please call me Leo
- name is Pari
- name Sanjay

srikar_1996 · March 23, 2019, 5:48am

Hi,

No, it won’t pick up all names, because there are millions of names and spaCy won’t be able to recognise them all. The best way to deal with names is to have a lookup table with a list of all the names. Take a look at this: lookup tables

sibbsnb · March 23, 2019, 6:44am

What’s the point of deep learning if names have to be looked up? There are models that do this, but it looks like this one does not.

srikar_1996 · March 23, 2019, 7:29am

We can’t expect it to work with millions of names. spaCy’s PERSON entity does a fairly good job with names, but I noticed that it does not work well with Indian names.

netcarver · March 23, 2019, 11:14am

@sibbsnb I’ve had pretty good name recognition when using ner_crf in the NLU pipeline. I had to provide quite a lot of training data though (about 200 examples.)

Slightly off topic - @srikar_1996 If you are dealing mainly with Indian languages - have you seen the chatbot_ner project? They have support for entity recognition in English, Hindi, Gujarati, Marathi, Bengali and Tamil. I do not know if there is a Rasa Integration though.

srikar_1996 · March 23, 2019, 1:52pm

@netcarver I am using English itself. But it gives me issues with Indian names. I did not provide as many examples as you said. I might have given like 30~40 examples.

mauricedoepke · March 23, 2019, 2:06pm

Check out this link: displaCy Named Entity Visualizer · Demos · Explosion AI and try the name recognition there. If it works there, then there is a problem in your pipeline. If it doesn’t work there, then your names are too exotic for spaCy and you need to train ner_crf yourself.

srikar_1996 · March 25, 2019, 5:02am

Yep, I always use this. I’ve to provide more examples to my crf I guess.
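For anyone who needs to get from 30~40 examples to the roughly 200 mentioned above, one option is to generate them by crossing a few sentence templates with a list of names. The sketch below prints lines in the bracket/parenthesis annotation style described in the training data comment. The templates, the names, and the entity label `name` are all assumptions for illustration; use whatever your own domain defines.

```python
import itertools

# Hypothetical sentence templates and names; replace with your own.
TEMPLATES = [
    "My name is {name}",
    "I am {name}",
    "Please call me {name}",
    "{name} is my name",
    "{name}",
]
NAMES = ["Pushpa", "Getsy", "Suraj", "Ganesh", "Linda", "Tom"]


def make_examples(templates, names, entity="name"):
    """Yield one annotated training line per (template, name) pair."""
    for template, name in itertools.product(templates, names):
        # [value](label): brackets hold the entity value, parentheses the label.
        annotated = "[{}]({})".format(name, entity)
        yield "- " + template.format(name=annotated)


if __name__ == "__main__":
    print("## intent:name")
    for line in make_examples(TEMPLATES, NAMES):
        print(line)
```

Five templates and six names give 30 lines; a few dozen names would reach the 200 range. Generated data is quite uniform, so it is probably worth mixing in real user phrasings as well.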

netcarver · March 25, 2019, 8:14am

@srikar_1996 as far as I know, spacy is pre-trained for entity recognition. Have you tried seeing if there are any differences between the small, medium and large models when you put your example names into the demo @mauricedoepke posted above?

For example, this text…

Pushpa went to the market with Getsy.

… only one name is recognised using the small model, while the medium model locates both.

You may get better recognition with a larger model.
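The same comparison can be run locally instead of through the demo page. This is a minimal sketch, assuming spaCy is installed and the three English models `en_core_web_sm`, `en_core_web_md` and `en_core_web_lg` have been downloaded; the helper names `person_names` and `compare_models` are made up for this example.

```python
def person_names(doc):
    """Return the text of every entity labelled PERSON in a spaCy Doc."""
    return [ent.text for ent in doc.ents if ent.label_ == "PERSON"]


def compare_models(text, model_names=("en_core_web_sm",
                                      "en_core_web_md",
                                      "en_core_web_lg")):
    """Run the same text through several spaCy models and collect PERSON hits."""
    import spacy  # imported here so person_names() can be used without spaCy

    results = {}
    for name in model_names:
        nlp = spacy.load(name)  # raises OSError if the model is not installed
        results[name] = person_names(nlp(text))
    return results


# Example call (needs the models to be installed):
# compare_models("Pushpa went to the market with Getsy.")
```

The result is a dict keyed by model name, which makes it easy to see which model misses which name. Results can differ between spaCy versions, so treat the demo page and a local run as two separate data points.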

sibbsnb · March 25, 2019, 3:18pm

Stanford CoreNLP does a great job on the person entity. spaCy doesn’t seem to detect it.

srikar_1996 · March 26, 2019, 2:50am

Hi, I’ve tried the small, medium and large models. They don’t always detect the names. For example, sm detects the first name, md doesn’t detect anything and lg detects the second name. Earlier you mentioned that you provided close to 200 examples for your CRF. I was wondering: isn’t providing that many examples as good as having a lookup table?

netcarver · March 26, 2019, 10:14am

@srikar_1996 You probably need feedback from someone with more experience using lookup tables/regexes than I have. However, my gut feel is that they are not the same.

The documentation states that lookup tables are only usable by ner_crf and that the entries in the table are combined to form one large, case insensitive, regex pattern that is then applied to the input text. It sounds like your recognition may be limited to just the example names if you were to go down that route - though I am not certain of that.
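To make the "one large, case insensitive regex" idea concrete, here is a rough sketch of the matching step only. It is not Rasa's code, and it does not show how the match is then used by ner_crf; the function names and the lookup contents are hypothetical.

```python
import re


def build_lookup_pattern(entries):
    """Combine lookup entries into one case-insensitive, whole-word regex."""
    # Longest entries first so a longer entry wins over a shorter prefix of it;
    # re.escape keeps any special characters in an entry literal.
    alternatives = sorted((re.escape(e) for e in entries), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def find_names(pattern, text):
    """Return every substring of text that matches a lookup entry."""
    return [m.group(0) for m in pattern.finditer(text)]


LOOKUP = ["Pushpa", "Getsy", "Suraj"]  # hypothetical lookup-table contents
pattern = build_lookup_pattern(LOOKUP)

hits = find_names(pattern, "pushpa went to the market with Getsy.")
misses = find_names(pattern, "My name is Umbrella")
```

Here `hits` contains both names regardless of case, while `misses` is empty because "Umbrella" is not in the table. That is the limitation in question: a pure pattern match can only ever fire on entries that are listed.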

Last time I checked, there are way more names possible in English than I used in the ~200 examples I trained on.

Overall, it may be worth you doing a little experiment to see which method gives you better recognition.

srikar_1996 · March 27, 2019, 4:24am

Yea, I saw how lookup tables work. And yes, you’re right. Lookup tables are limited to the examples in the file. As of now, for my use case, lookup tables seem to do what I need. I’ll try out some other pipelines as well to see what works best.

ikenti · June 14, 2019, 6:46am

Continuing the discussion from Name entity not extracting:

Same problem here. As Sibish Basheer said: “what is the point with deep learning if names has to be a look?” When I say to the bot “My name is [John] (PERSON)”, with PERSON as a slot, I’d like the robot to answer “nice to meet you [John]”, whatever name I put between brackets. It should work like a function: for any data between [ ], answer: nice to meet you [data]. Same thing for translation. If I tell the bot: translate [this] (translation) to Spanish, with a custom action to translate the slot “translation”, I hope the robot will translate any data between the brackets. Is there a way to force the bot to do that? Thanks.

srikar_1996 · June 14, 2019, 7:11am

Hi,

This can be done using slots. Your template must have something like this:

utter_greet:
  - text: "nice to meet you {PERSON}"

{PERSON} will be replaced with the slot value, which is John in this case.

Alternatively, you can use custom actions to do the same.

ikenti · June 14, 2019, 8:31am

Yes, that is exactly what I did. But the problem is that for many names, the bot doesn’t identify them as “names” for the slot. So the bot answers: Nice to meet you “None”, instead of: Nice to meet you “the name”. I don’t know how to force the bot to repeat the [name], be it strange or uncommon.

srikar_1996 · June 14, 2019, 10:51am

If the bot is returning None, it means the NLU was unable to extract the name. You can check the logs to see whether the entity was identified and whether the slot was filled. Names are sometimes difficult to identify because there can be millions of possibilities. In this case, provide the NLU with more examples, or use a lookup table.
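The "None" in the reply is simply what an unfilled slot looks like once it is substituted into the text. The sketch below imitates that substitution with plain Python string formatting. It is not how Rasa renders templates internally; `render`, `greet` and the fallback wording are made up for illustration.

```python
TEMPLATE = "Nice to meet you {PERSON}"


def render(template, slots):
    """Fill {slot_name} placeholders from a dict of slot values."""
    return template.format(**slots)


def greet(slots):
    """Greet by name, or ask again if the PERSON slot was never filled."""
    if slots.get("PERSON") is None:
        return "Sorry, I didn't catch your name."
    return render(TEMPLATE, slots)


filled = render(TEMPLATE, {"PERSON": "John"})   # entity extracted, slot set
unfilled = render(TEMPLATE, {"PERSON": None})   # entity missed, slot empty
```

`filled` is "Nice to meet you John" and `unfilled` is "Nice to meet you None". A check like the one in `greet` is the kind of thing a custom action could do so that the user never sees the literal "None".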

ikenti · June 14, 2019, 12:29pm

Yes, I will try that out as you suggest. But wouldn’t it be possible to force the bot to accept any data between the brackets as a name to repeat? So if I say: my name is umbrella, I would like the bot to answer: Nice to meet you “umbrella”, instead of nice to meet you “None”, even if umbrella is not a given name. Why is it so difficult to get that result? Did I miss something?

srikar_1996 · June 15, 2019, 11:29am

Actually, if the sentences are similar in structure, like My name is xxxxxx, then the bot will pick it up even if it’s not a PERSON entity, because there is no way the algorithm would know that it isn’t a human name. Maybe this has something to do with spaCy’s PERSON entity.

Try using your own ner_crf instead of PERSON and see what happens, I’m guessing it should work.
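If the goal really is "whatever follows my name is", independent of any trained model, a plain pattern match gives that behaviour. The sketch below is only an illustration of the idea, for example as a fallback inside a custom action when the slot comes back empty. The carrier phrases and the name `extract_name` are assumptions, not part of Rasa.

```python
import re

# Hypothetical carrier phrases; extend to match how your users introduce themselves.
NAME_PATTERN = re.compile(
    r"\b(?:my name is|i am|i['’]m|call me)\s+([^\s.,!?]+)",
    re.IGNORECASE,
)


def extract_name(message):
    """Return the token right after a known carrier phrase, or None."""
    match = NAME_PATTERN.search(message)
    return match.group(1) if match else None


name = extract_name("my name is umbrella")
```

Here `name` is "umbrella". The trade-off is the obvious one: the pattern has no idea what a name is, so "I am hungry" yields "hungry", and it only captures a single token.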

twittmin · August 5, 2019, 6:09pm

Does a ‘name’ have to occur at least once in the training data for CRF to be able to recognize it?
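As far as I understand CRF-based extractors in general, they score each token using features of the token and its neighbours rather than the exact word alone. The sketch below shows features of that general kind; `token_features` and the specific feature names are hypothetical and are not taken from Rasa's ner_crf.

```python
def token_features(tokens, i):
    """Features for tokens[i], drawn from the token itself and its neighbours."""
    word = tokens[i]
    return {
        "bias": 1.0,
        "low": word.lower(),            # the word itself (unseen for new names)
        "title": word.istitle(),        # starts with a capital letter
        "upper": word.isupper(),
        "suffix3": word[-3:].lower(),
        "prev_low": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_low": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


seen = token_features("My name is John".split(), 3)
unseen = token_features("My name is Pushpa".split(), 3)
```

`seen` and `unseen` differ in the word-specific features but share the context features: both are title-cased, both follow "is", and both end the sentence. That shared context is why a name that never appeared in training can still be tagged, although nothing guarantees it, and more varied examples should help.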
