Similar Entity Extraction

ashukrishna100 · September 17, 2018, 11:13am

I am having trouble extracting entities for the following use case:

User: Hi
Bot: Hello! How may I help you?
User: I want my tax receipt
Bot: Sure! Could you provide me your ID?
User: 59 (he can also say “it is 59”, “ID is 59”, “ID=59” or “my ID is 59”)
Bot: Your user number?
User: it is 9154
Bot: Your transaction number?
User: 745

I want to extract 59, 9154 and 745. How should I proceed? I have trained the model, but when the ID and the user number have the same length, the entity is not extracted. However, intent classification works fine and the conversation follows the story.
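To illustrate the problem: when two values look identical (same length, digits only), one way to tell them apart is to use what the bot asked last. The following is a minimal sketch in plain Python, not a Rasa API; the slot names and the `fill_slots` helper are hypothetical.

```python
import re

NUMBER = re.compile(r"\d+")


def extract_number(message):
    """Return the first run of digits in the message, or None."""
    match = NUMBER.search(message)
    return match.group() if match else None


def fill_slots(turns):
    """turns: list of (slot the bot just asked for, user reply) pairs."""
    slots = {}
    for requested_slot, message in turns:
        value = extract_number(message)
        if value is not None:
            slots[requested_slot] = value
    return slots


conversation = [
    ("id", "my ID is 59"),
    ("user_number", "it is 9154"),
    ("transaction_number", "745"),
]
print(fill_slots(conversation))
# {'id': '59', 'user_number': '9154', 'transaction_number': '745'}
```

The number itself is found the same way in every turn; only the question that preceded it decides which slot it belongs to.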

akelad · September 17, 2018, 11:15am

have you tried using the ner_duckling_http entity extractor?

ashukrishna100 · September 17, 2018, 11:17am

I am using ner_crf container

akelad · September 18, 2018, 3:57pm

for numbers, please try the ner_duckling_http extractor

ashukrishna100 · September 18, 2018, 4:11pm

Actually, installing Duckling is a tedious task. I'm getting too many errors while installing it.

akelad · September 19, 2018, 4:19pm

you don’t need to install it. you can run it with docker https://rasa.com/docs/nlu/master/components/#ner-duckling-http
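Once the container is running, the server can be queried over HTTP. The sketch below assumes the server listens on `http://localhost:8000/parse` and accepts form fields `text` and `locale`, and that each item in the JSON response carries a `dim` and a nested `value`; check these against your Duckling version. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address where the Duckling container is exposed.
DUCKLING_URL = "http://localhost:8000/parse"


def build_request(text, locale="en_US"):
    """Build a form-encoded POST request for the Duckling server."""
    data = urllib.parse.urlencode({"text": text, "locale": locale}).encode("utf-8")
    return urllib.request.Request(DUCKLING_URL, data=data, method="POST")


def numbers_from_response(payload):
    """Keep only the values of the 'number' dimension from a decoded response."""
    return [item["value"]["value"] for item in payload if item.get("dim") == "number"]


def parse_numbers(text):
    """Send the text to Duckling and return the numbers it found."""
    with urllib.request.urlopen(build_request(text)) as response:
        return numbers_from_response(json.load(response))
```

With the server up, `parse_numbers("it is 9154")` would be the call to try; nothing is sent until that function is invoked.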

andrewbain · September 24, 2018, 10:26am

Hi there, I have the duckling server running in docker and it seems to be working, however it seems to have slowed my training down to a snail's pace. I am also unsure how to enter training data, as duckling creates its own entities. Do I just enter the intent and add the duckling entity names to my rasa_core config?

akelad · September 26, 2018, 10:00am

hmmm, duckling shouldn’t affect your training at all. and duckling just extracts the entities, no need to label them

ashukrishna100 · September 26, 2018, 2:50pm

What if the inputs are alphanumeric, e.g. “my ID is NP_45680780”? And how do I handle the case where the user replies in a different order? E.g. Bot: Can I have your ID? User: my user number is NA_56098

One more question: how do I extract the desired entity when a message contains several? For example: “my transaction number is QR%56873578 for user number 987_AT”. I have to extract QR%56873578.
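One way to pick a specific code out of a message that contains several is to anchor on the label that precedes it. This is a rough sketch in plain Python rather than a Rasa component; it assumes a code is a run of non-space characters right after the label, and `extract_labelled` is a hypothetical helper.

```python
import re

# Assumption: the code follows the label, optionally separated by "is", "=" or ":".
LABELLED_CODE = r"{label}\s*(?:is|=|:)?\s*(\S+)"


def extract_labelled(text, label):
    """Return the token that follows the given label, or None if absent."""
    pattern = re.compile(LABELLED_CODE.format(label=re.escape(label)), re.IGNORECASE)
    match = pattern.search(text)
    # Strip sentence punctuation that may trail the code.
    return match.group(1).rstrip(".,") if match else None


message = "my transaction number is QR%56873578 for user number 987_AT."
print(extract_labelled(message, "transaction number"))  # QR%56873578
print(extract_labelled(message, "user number"))         # 987_AT
```

The same helper also covers the out-of-order reply: the label in the user's message, not the bot's question, decides which entity the code belongs to.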

ccelotto · October 1, 2018, 6:49pm

I’m wondering the same thing here. It doesn’t seem like Duckling is the solution for custom entities such as alphanumeric codes like you and I are working with.

Does anyone have any suggestions for extracting custom alphanumeric codes? Duckling and entity lookup tables haven’t seemed to work after some testing. Maybe creating some Regex patterns would work?

Thanks!

souvikg10 · October 1, 2018, 8:28pm

At the bottom of this page there is a description of regex patterns.

ccelotto · October 1, 2018, 8:37pm

Thank you. I’m familiar with the Regex documentation, but zip codes and phone numbers seem to be a bit different from custom alphanumeric codes, since zip codes and phone numbers always have the same format while unique alphanumeric codes do not.

I was mainly looking for thoughts on whether it is possible, and how effective it is, to use Regex patterns in a situation such as this.

souvikg10 · October 2, 2018, 7:38am

you can also have a regex pattern for alphanumeric codes as well

Take the example of a user ID.

let’s say it is 6 characters long and starts with G

G([1-9]\d{4})

Then you should provide training examples such as

my id is [G15367](ID) …
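The pattern can be sanity-checked in plain Python before it goes into the training data. In this sketch the capture group is dropped and word boundaries (`\b`) are added so that longer codes are not partially matched; both changes are assumptions, not part of the pattern above.

```python
import re

# "G" followed by five digits, the first of which is 1-9.
USER_ID = re.compile(r"\bG[1-9]\d{4}\b")

for text in ["my id is G15367", "G01234", "id G153678"]:
    print(text, "->", USER_ID.findall(text))
# my id is G15367 -> ['G15367']
# G01234 -> []
# id G153678 -> []
```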

Alternatively, you can add a regex entity extractor that takes a regular expression pattern as a rule and finds entities in a given token (similar to duckling).
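The idea of a rule-based regex extractor can be sketched as follows. This is standalone Python, not an actual Rasa component; the entity names, patterns and output keys are chosen for illustration only.

```python
import re

# Hypothetical entity names mapped to the patterns that define them.
PATTERNS = {
    "ID": re.compile(r"\bG[1-9]\d{4}\b"),
    "number": re.compile(r"\b\d+\b"),
}


def extract_entities(text, patterns=PATTERNS):
    """Return every pattern match as an entity dict, ordered by position."""
    entities = []
    for name, pattern in patterns.items():
        for match in pattern.finditer(text):
            entities.append(
                {
                    "entity": name,
                    "value": match.group(),
                    "start": match.start(),
                    "end": match.end(),
                }
            )
    return sorted(entities, key=lambda e: e["start"])


for entity in extract_entities("my id is G15367 and my user number is 9154"):
    print(entity["entity"], entity["value"])
```

Unlike the regex featurizer, an extractor of this kind needs no labelled examples: whatever matches the rule is returned as an entity.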

also FYI, in duckling you can add custom rules if you have the hang of Haskell. They have recently added a feature for adding custom dimensions.

ccelotto · October 2, 2018, 4:56pm

Awesome! Thank you. I can confirm for @ashukrishna100 that this does indeed work. I used a few guides online that provided regex variable charts to come up with the regex pattern suited for our use case. Performance is great after implementing a regex pattern!

ashukrishna100 · October 9, 2018, 5:52am

Yeah it works and with optimum performance for sure. Thank you so much @ccelotto

ashukrishna100 · October 9, 2018, 5:56am

Very informative. No doubt you are a Rasa star. I have tried the regex entity extractor and it works fine. Thanks @souvikg10

neerajb1 · October 24, 2018, 10:17am

Hi,

Could you please share how it works? I am also using it, but with no luck.

NLU: "regex_features": [ { "name": "Transaction_ID", "pattern": "^[a-zA-Z0-9]+$" } ]

pipeline:

- name: "tokenizer_whitespace"
- name: "intent_entity_featurizer_regex"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

Thanks

ccelotto · October 25, 2018, 2:37am

This was super helpful for me in creating the regex for my use case http://www.cbs.dtu.dk/courses/27610/regular-expressions-cheat-sheet-v2.pdf

From looking at your regex pattern, it seems like it may be missing some parentheses/brackets.

This is mine: (([A-z]{1})([0-9]{6,7}))

It is used for picking up on alphanumeric codes similar to “A123456”, “G493024”, “F4930294”.

So breaking it down: the first inner group, ([A-z]{1}), states that the first character {1} will be a letter [A-z]. (Strictly speaking, [A-z] also matches the few punctuation characters that sit between Z and a in ASCII; [A-Za-z] matches letters only.)

The second inner group, ([0-9]{6,7}), states that the next 6-7 characters {6,7} will be digits 0 through 9 [0-9].
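The pattern can be checked against the three sample codes in plain Python. The sketch also compares it with a stricter variant, since `[A-z]` spans a few ASCII punctuation characters such as `_` in addition to letters; the `STRICTER` pattern is a suggestion, not the one used above.

```python
import re

ORIGINAL = re.compile(r"(([A-z]{1})([0-9]{6,7}))")
# Assumption: letters only, with word boundaries around the code.
STRICTER = re.compile(r"\b[A-Za-z][0-9]{6,7}\b")

for code in ["A123456", "G493024", "F4930294"]:
    print(code, bool(ORIGINAL.fullmatch(code)), bool(STRICTER.fullmatch(code)))
# A123456 True True
# G493024 True True
# F4930294 True True

print(bool(ORIGINAL.fullmatch("_123456")))  # True
print(bool(STRICTER.fullmatch("_123456")))  # False
```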

Hopefully you can use this as a reference! If not, tell me what you’re trying to accomplish, and I’ll do my best to help you craft it.

P.S. don’t forget (as mentioned in the Rasa docs for regex) to provide some training examples that contain the pattern, or the model won’t know to pick up on it.
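As a rough illustration of what that means, the snippet below builds a small training file in the Rasa NLU JSON layout, with one regex feature and a few labelled examples. The intent name `inform`, the example sentences and the `make_example` helper are made up for this sketch; the entity name and pattern are borrowed from the posts above.

```python
import json


def make_example(text, intent, value, entity):
    """Label the first occurrence of `value` in `text` as an entity."""
    start = text.index(value)
    return {
        "text": text,
        "intent": intent,
        "entities": [
            {"start": start, "end": start + len(value), "value": value, "entity": entity}
        ],
    }


training_data = {
    "rasa_nlu_data": {
        "regex_features": [
            {"name": "Transaction_ID", "pattern": "[A-Za-z][0-9]{6,7}"}
        ],
        "common_examples": [
            make_example("my transaction id is A123456", "inform", "A123456", "Transaction_ID"),
            make_example("it is G493024", "inform", "G493024", "Transaction_ID"),
            make_example("F4930294", "inform", "F4930294", "Transaction_ID"),
        ],
    }
}

print(json.dumps(training_data, indent=2))
```

The regex feature only gives ner_crf a hint; the labelled examples are what teach it that tokens matching the pattern are the entity.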

neerajb1 · October 26, 2018, 4:33am

Hi Christopher,

Thank you very much for the detailed explanation. It worked for me. I think the problem was that I forgot to include intent_entity_featurizer_regex in the config file.
