Recognise entities that contain more than one word

Nexemics · March 20, 2019, 7:32pm

Hi, I would like to know how I can make the bot recognize entities that consist of more than one word, i.e. entities that have a space in them. For example, if my entity is city and my input is New York, the bot detects the input as [New] (city) [York] (city), but I want it to detect [New York] (city). I also trained my bot with a lookup table that consists of all cities, but I am unsure how to make the bot ignore the space within the city names.

M1ns · March 21, 2019, 6:12pm

It worked for me when I annotate it like this, for example: I love [New York](city). If you write the entity as [New York] and then (city) directly after it, without any space between them, it should work.

Nexemics · March 21, 2019, 6:14pm

Yes, I do that manually while training it, but I’m not sure if that’s helpful during the dialogue management phase. The bot needs to detect the entity automatically, without any explicit intervention. I need it to understand New York as one entity while ignoring the space in between. This problem only occurs when the city name consists of more than one word, like New York, New Jersey or New Orleans. I don’t have a problem if the city name is a single word, like Chicago, Paris or London.

pwessel · April 3, 2019, 3:15pm

If your NLU training data has some examples like [New York](city), where the city name consists of two words, the cities should be picked up correctly. Entities are not limited to one word.
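To make the point concrete, here is a minimal sketch of how an annotated example such as I love [New York](city) can be read as one entity span. It is not Rasa's parser; the regex and the `parse_example` helper are assumptions made up for illustration.

```python
import re

# Matches the Markdown-style annotation [entity text](entity_name).
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_example(line):
    """Return the plain text and the entity spans of one annotated line."""
    plain = ""
    entities = []
    last = 0
    for m in ANNOTATION.finditer(line):
        plain += line[last:m.start()]   # text before the annotation
        start = len(plain)
        plain += m.group(1)             # entity text, spaces included
        entities.append({
            "start": start,
            "end": len(plain),
            "value": m.group(1),
            "entity": m.group(2),
        })
        last = m.end()
    plain += line[last:]                # text after the last annotation
    return plain, entities


text, entities = parse_example("I love [New York](city)")
print(text)
print(entities)
```

The whole bracketed text, space included, ends up as a single value with one start and one end offset, so a two-word city is one entity in the training data.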

Nexemics · April 4, 2019, 3:33pm

I have a list of cities stored in lookup tables, but I still don’t understand why it won’t detect entities properly. Is there any way to check whether the entity extraction is actually able to read the lookup tables?

pwessel · April 4, 2019, 5:37pm

It should be able to detect them then. There might be something wrong with the formatting of your data file, so that those entities are not recognized in training.
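One formatting slip that is easy to check for is a gap between the closing bracket and the label, as in [New Orleans] (city). The following is a rough sketch of such a check, not a Rasa tool; the `find_detached_labels` helper and the sample lines are hypothetical.

```python
import re

# "]" followed by whitespace and then "(" means the label is detached
# from the bracketed text, so it is not a valid annotation.
DETACHED = re.compile(r"\[[^\]]+\]\s+\([^)]+\)")


def find_detached_labels(lines):
    """Return (line_number, line) pairs that have a gap between ']' and '('."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if DETACHED.search(line)
    ]


# Hypothetical training examples; in practice, pass the lines of your data file.
sample = [
    "- I love [New York](city)",
    "- flights to [New Orleans] (city)",
]

for number, line in find_detached_labels(sample):
    print(number, line)
```

To run it on a real file, open the file and pass the file object, for example `find_detached_labels(open("data/nlu.md", encoding="utf-8"))`, where the path is an assumption about your project layout.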

xames3 · August 15, 2019, 1:45pm

Hello @Nexemics!

If the issue is solved, then that’s fine and all the best for your future endeavors!

If not, try this:

Check your training data. It must be diverse and should contain a lot of examples for each intent so that Rasa generalizes over the data and entities gracefully.
Create a lookup table for cities (make sure the names are correctly formatted; lookup tables are case sensitive) and add it to the nlu.md training data.
Train the model and then check whether it works (ideally it should).

Make sure the lookup table is simple and is just a list of texts, one per line, e.g.:

Mumbai
Delhi
New York
New Orleans
Chicago
…

Save it with a name like “city.txt” and add it at the very end of the nlu.md file like this:

## lookup:city
data/lookup_tables/city.txt

Here, lookup_tables is the folder that I created for storing all lookup txt files.
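As far as I understand, the lookup entries are matched against the message text as patterns, and the matches are given to the entity extractor as extra features. The sketch below is not Rasa's code; it only illustrates that an entry containing a space can match as one span. The `build_lookup_pattern` helper is hypothetical, and the entries are the ones from the example list above.

```python
import re


def build_lookup_pattern(entries):
    """Build one regex that matches any lookup entry as a whole phrase."""
    cleaned = [entry.strip() for entry in entries if entry.strip()]
    # Longest entries first, so a longer name wins over a shorter prefix.
    cleaned.sort(key=len, reverse=True)
    alternatives = "|".join(r"\b" + re.escape(entry) + r"\b" for entry in cleaned)
    return re.compile(alternatives)


# In practice these would be the lines of city.txt.
entries = ["Mumbai", "Delhi", "New York", "New Orleans", "Chicago"]
pattern = build_lookup_pattern(entries)

match = pattern.search("I want to fly to New York tomorrow")
print(match.group(0), match.span())
```

This sketch matches case-sensitively, so "new york" would not match the entry "New York". Stray whitespace or misspellings in the file cause the same kind of miss, which is why the formatting of the lookup file is worth checking first.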

Train the model as you normally would and test it out. If this doesn’t work, could you please share your repo’s link? I’ll be happy to help you out in that case.
