How to design Rasa NLU training data for extracting human names

I am using Rasa Open Source version 3.1.

In my project I need to extract the user’s name from the user’s message and store it in a slot of type text. The issue is that, to extract a wide variety of names, I have had to put a huge number of training examples, around 40,000, under that one intent, and this is now breaking my story flows. Is there any way to optimize this training data and solve the issue?

It would be a great help if anyone could answer. Thanks in advance.

Hi @chowdhuryshakur

You can reframe your training data following this https://rasa.com/docs/rasa/nlu-training-data#regular-expressions-for-rule-based-entity-extraction

Thanks @anoopshrma. I have used regex for number-type slots such as phone number and age. But how can I use it for a name slot? Can you give an example?

- intent: user_name
  examples: |
    - my name is [John Doe]{"entity": "name"}
    - You can call me [IronMan]{"entity": "name"}

You haven’t used any regex in your example.

However, my examples look like this:

- intent: user_name
  examples: |
    - [John Doe](name)
    - [IronMan](name)

Sorry for the mix-up, I shared the wrong link with you. Extracting a name can be done without adding any regex.

This way you can extract the name from the text. For more info you can refer to this https://rasa.com/docs/rasa/training-data-format#entities

Also, don’t add training data that consists only of names, as in your example.

One-word examples alone are not a good way to train an intent.

If the user writes only their name instead of ‘my name is Anoop’, and ‘Anoop’ is not in the NLU data, will it work? Can it extract the name Anoop?

It can, @chowdhuryshakur. Cover all the combinations that can occur in your nlu.yml, which means:

- intent: user_name
  examples: |
    - my name is [John Doe]{"entity": "name"}
    - You can call me [IronMan]{"entity": "name"}
    - [Rahul]{"entity": "name"}

The reason I asked you not to add only single-word training data is that your training data contained nothing but names. The approach above can handle a name on its own as well as a name inside a sentence.
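One way to cover these combinations without writing every line by hand is to generate them from a few sentence templates and a modest list of names. This is only a rough sketch: the templates, the names, and the helper `build_examples` are placeholders made up for illustration, not something Rasa provides.

```python
import json
import random
from typing import List

# Hypothetical sentence patterns and sample names; replace with your own.
TEMPLATES = ["my name is {name}", "You can call me {name}", "I am {name}", "{name}"]
NAMES = ["John Doe", "IronMan", "Rahul", "Justine Kuper"]


def build_examples(templates: List[str], names: List[str], limit: int, seed: int = 0) -> List[str]:
    """Combine templates and names into annotated NLU example lines."""
    annotation = json.dumps({"entity": "name"})  # -> {"entity": "name"}
    combos = [
        "- " + template.format(name=f"[{name}]{annotation}")
        for template in templates
        for name in names
    ]
    # Shuffle with a fixed seed so the selection is reproducible, then cap it.
    random.Random(seed).shuffle(combos)
    return combos[:limit]


if __name__ == "__main__":
    # Print a block that can be pasted into nlu.yml.
    print("- intent: user_name")
    print("  examples: |")
    for line in build_examples(TEMPLATES, NAMES, limit=8):
        print("    " + line)
```

The `limit` argument keeps the intent at a size comparable to your other intents, mixing bare names with names inside sentences.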

This is my nlu data

- intent: tell_name
  examples: |
    - My name is [Justine Kuper]{"entity": "name"}.
    - I am [John]{"entity": "name"}.
    - I am [Anaz Abdul Karim]{"entity": "name"}.
    - [Jack]{"entity": "name"}.
    - [Borhan]{"entity": "name"}.

But when I type Anoop, it can’t extract the name.

Sorry for mentioning you, @nik202. Can you help me anyway? I saw that you have answered many questions in the forum.

@chowdhuryshakur no worries. You need to check a few links: https://learning.rasa.com/ and NLU Training Data. Also search the Rasa YouTube channel for entity extraction; you will find a lot of use-case examples.

Hope this will solve your issue. Good Luck!

@nik202 Thanks for replying.

My issue is not that straightforward. I have an intent, tell_name, from which I have to extract any kind of name the user gives. I tried training my model with 40,000+ examples under the tell_name intent. After that, the model was able to extract most names, but my other story flows were broken.

Should I avoid such a huge number of examples under a single intent? Is there any way to extract human names (names from anywhere in the world)?

Hello,

Have you tried adding a standard entity value in your NLU training data?

- intent: tell_name
  examples: |
    - My name is [Justine Kuper]{"entity": "name", "value": "user_name"}.
    - I am [John]{"entity": "name", "value": "user_name"}.
    - I am [Anaz Abdul Karim]{"entity": "name", "value": "user_name"}.
    - [Jack]{"entity": "name", "value": "user_name"}.
    - [Borhan]{"entity": "name", "value": "user_name"}.

That will allow your NLU model to extract the name entity with synonym mapping, which avoids exploding your stories.

By doing that, you won’t be able to get the real name directly from the parsed message, as the values are all mapped to “user_name”, but I guess you can still recover it using the start/end indices of the extracted name entity.
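The start/end idea could look roughly like this. It is a minimal sketch, assuming the parsed message is a dict with a `text` field and an `entities` list whose items carry `entity`, `start`, `end` and `value` keys; the helper name `original_entity_text` and the sample message are made up for illustration.

```python
from typing import Any, Dict, List


def original_entity_text(parsed: Dict[str, Any], entity_name: str) -> List[str]:
    """Return the raw text span of every entity called `entity_name`."""
    text = parsed.get("text", "")
    spans = []
    for ent in parsed.get("entities", []):
        if ent.get("entity") != entity_name:
            continue
        start, end = ent.get("start"), ent.get("end")
        # Skip entities without character offsets.
        if isinstance(start, int) and isinstance(end, int):
            spans.append(text[start:end])
    return spans


# Hypothetical parsed message: the value is the mapped synonym,
# while start/end still point at what the user actually typed.
sample = {
    "text": "My name is Justine Kuper.",
    "entities": [
        {"entity": "name", "start": 11, "end": 24, "value": "user_name"},
    ],
}

if __name__ == "__main__":
    print(original_entity_text(sample, "name"))
```

In a custom action, the same kind of dict should be available from the tracker's latest message, so the recovered string could then be written to the text slot.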
