Regex entity names

Hello guys,

I’m working on capturing people’s names. I wrote 50+ example names and phrasings in my nlu.md file for training and added a regex as well, but when I give a new name the NLU engine still doesn’t recognise it. This is how my nlu.md file looks:

regex:name

  • ^[A-Za-z ]+$

intent:get_name

I’m using Rasa 1.0.7 with pipeline: "pretrained_embeddings_spacy".

Could someone please guide me on what I must do to fix this?

Thanks in advance

Not sure, but have you tried not using Regex? It might be featurizing other words that are not names.
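Also, I think the regex itself is a problem here. The character class [A-Za-z ] already includes the space, so with ^ and $ the pattern only matches when the whole message consists of letters and spaces. That means it matches any message made of plain words, not just names, and it cannot pick out a name inside a longer sentence that contains punctuation or digits.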
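A quick way to check what the pattern from the question actually matches is to run it through Python’s standard `re` module. This is only a sketch of the pattern’s behaviour on made-up sample messages; it does not show how Rasa’s RegexFeaturizer turns a match into features.

```python
import re

# The pattern from the question: letters and spaces only, anchored at both ends.
NAME_PATTERN = re.compile(r"^[A-Za-z ]+$")

# Made-up sample messages (assumptions for illustration, not from the thread).
samples = [
    "Sasha Petrov",        # a bare name
    "book a flight",       # not a name at all
    "I'm Sasha Petrov.",   # a name inside a sentence with punctuation
]


def matches_whole_message(text):
    """Return True if the anchored pattern matches the entire message."""
    return NAME_PATTERN.search(text) is not None


for text in samples:
    print(f"{text!r}: {matches_whole_message(text)}")
```

The first two messages match and the third does not, so the pattern says nothing about whether a message is a name; it only says the message contains nothing but letters and spaces.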

Thanks for your reply Akshay2000

I started without regex but to no avail. As you mentioned, I will fix the regex now and test it.

Nope, still doesn’t work. The moment I provide a new name, the entity (name) doesn’t get filled.

As @akshay2000 mentioned, regex is not a good idea for names. Have you tried using a custom CRF and spaCy’s built-in recognizer?

Also, you could look at lookup tables. They work best for names.
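For context on what a lookup table can and cannot do: as far as I understand, Rasa turns a lookup table into one large case-insensitive regex over the listed entries and uses it as a regex feature for the entity extractor. The sketch below imitates that idea in plain Python. The helper `build_lookup_regex` and the sample names are assumptions for illustration, not Rasa’s actual code.

```python
import re


def build_lookup_regex(entries):
    """Build one case-insensitive pattern that matches any listed entry as a whole word."""
    escaped = [re.escape(entry) for entry in entries]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", flags=re.IGNORECASE)


# Hypothetical lookup entries; a real table would hold many more names.
known_names = ["sasha", "rahul", "akshay"]
lookup = build_lookup_regex(known_names)


def find_known_names(text):
    """Return the listed names found in the text, in order of appearance."""
    return [m.group(0) for m in lookup.finditer(text)]


print(find_known_names("my name is Sasha"))   # a listed name is found
print(find_known_names("my name is Dmitri"))  # an unlisted name is not
```

So a lookup table only gives a strong hint for names it lists. New names still depend on the extractor generalising from the context in the training examples.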

Hey @rmiiitb, as suggested by @srikar_1996, you can use the CRF entity extractor since you are using custom entities. You can read about it here:

Thanks srikar, JG! Really appreciate the pointers, will work on them.

Hey guys, I did this -

regex:name

  • ^[a-zA-Z]+(([’,. -][a-zA-Z ])?[a-zA-Z]*)$

pipeline:

  • name: "SpacyNLP"
  • name: "SpacyTokenizer"
  • name: "SpacyFeaturizer"
  • name: "SklearnIntentClassifier"
  • name: "EntitySynonymMapper"
  • name: "RegexFeaturizer"
  • name: "CRFEntityExtractor"
    features: [["low", "title", "upper"], ["bias", "low", "prefix5", "prefix2", "suffix5", "suffix3", "suffix2", "upper", "title", "digit", "pattern"], ["low", "title", "upper"]]

While running, I gave the name "sasha petrov" and got the debug output below:

2019-06-20 10:34:25 DEBUG rasa_sdk.forms - Validating user input '{'intent': {'name': 'get_name', 'confidence': 0.9944517383711481}, 'entities': [], 'intent_ranking': [{'name': 'get_name', 'confidence': 0.9944517383711481}, {'name': 'affirm', 'confidence': 0.0008237922520789642}, . . {'name': 'finance_module', 'confidence': 0.00017543870013493048}], 'text': 'sasha petrov'}'
2019-06-20 10:34:25 DEBUG rasa_sdk.forms - Trying to extract requested slot 'name' ...
2019-06-20 10:34:25 DEBUG rasa_sdk.forms - Got mapping '{'type': 'from_entity', 'entity': 'name', 'intent': ['get_name'], 'not_intent': []}'
2019-06-20 10:34:25 DEBUG rasa_sdk.forms - Failed to extract requested slot 'name'
2019-06-20 10:34:25 ERROR rasa_sdk.endpoint - Failed to extract slot name with action initial_info_form

I will now try without regex.
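The log above shows an empty entities list in the parsed message, so a from_entity mapping has nothing to fill the slot with, even though the intent was classified confidently. One possible workaround is to fall back to the raw message text when the intent is get_name; rasa_sdk’s FormAction offers a from_text mapping for that purpose. The sketch below only imitates that decision in plain Python on a dict shaped like the logged message. The function `extract_name` is hypothetical and is not part of rasa_sdk.

```python
def extract_name(parsed, entity="name", intent="get_name"):
    """Pick a slot value from a parsed message, preferring the entity over raw text."""
    # First choice: an extracted entity of the requested type.
    for found in parsed.get("entities", []):
        if found.get("entity") == entity:
            return found.get("value")
    # Fallback: use the whole message when the intent matches.
    if parsed.get("intent", {}).get("name") == intent:
        return parsed.get("text")
    return None


# Shaped like the message in the debug log (intent ranking omitted).
parsed_message = {
    "intent": {"name": "get_name", "confidence": 0.9944517383711481},
    "entities": [],
    "text": "sasha petrov",
}

print(extract_name(parsed_message))
```

The trade-off is that the whole message becomes the slot value, so "my name is Sasha" would be stored verbatim. The fallback suits prompts where users usually answer with just the name.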

@rmiiitb did you get anywhere with this? I’m having a similar problem. Perhaps the regex recognizer needs to be configured?

It seems you would need to add it to the pipeline in your config.yml.

Hi Rahul,

If you want to extract a person’s name, you can simply use the spaCy entity extractor. For that you need to have spaCy installed.

Add the following to your config file:

  • name: "SpacyNLP"
  • name: "SpacyEntityExtractor"
    dimensions: ["PERSON", "LOC", "ORG", "PRODUCT"]

In the slot mapping, use PERSON as the entity: self.from_entity(entity="PERSON")
