Recognise entities that contain more than one word

Hi, I would like to know how I can make the bot recognize entities that consist of more than one word, i.e. entities that contain a space. For example, if my entity is city and my input is New York, the bot detects the input as [New] (city) [York] (city), but I want it to detect [New York] (city). I also trained my bot with a lookup table that consists of all cities, but I am unsure how to make the bot ignore the space within the city names.

What worked for me was annotating the example like this: I love [New York](city). If you write the entity as [New York] followed directly by (city), without any space between them, it should work.

Yes, I do that manually while training it, but I’m not sure that helps during the dialogue management phase. The bot needs to detect the entity automatically, without any explicit intervention. I need it to understand New York as a single entity despite the space in between. This problem only occurs when the city name consists of more than one word, like New York, New Jersey or New Orleans. I don’t have a problem if the city name is a single word, like Chicago, Paris or London.

If your NLU training data contains some examples like [New York](city), where the city name consists of two words, the cities should be picked up correctly. Entities are not limited to one word.
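To make the annotation format concrete, here is a minimal sketch in plain Python that reads the `[value](entity)` markup and reports character spans. It is not Rasa's own parser; the names `ANNOTATION` and `extract_entities` are made up for this illustration. It only shows that one annotation can cover several words.

```python
import re

# Matches inline annotations such as [New York](city).
# Hypothetical helper for illustration, not Rasa's own parser.
ANNOTATION = re.compile(r"\[(?P<value>[^\]]+)\]\((?P<entity>[^)]+)\)")


def extract_entities(example):
    """Return the plain text and the character spans of annotated entities."""
    plain = ""
    entities = []
    cursor = 0
    for match in ANNOTATION.finditer(example):
        # Copy the text before the annotation unchanged.
        plain += example[cursor:match.start()]
        start = len(plain)
        # Keep only the entity value, dropping the brackets and the label.
        plain += match.group("value")
        entities.append({
            "entity": match.group("entity"),
            "value": match.group("value"),
            "start": start,
            "end": len(plain),
        })
        cursor = match.end()
    plain += example[cursor:]
    return plain, entities


text, found = extract_entities("I love [New York](city) and [Chicago](city)")
print(text)
for item in found:
    print(item)
```

The span for "New York" covers both words and the space between them, which is what the entity extractor is trained to reproduce.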

I have a list of cities stored in lookup tables, but I still don’t understand why the bot won’t detect entities properly. Is there any way to check whether the entity extractor is actually able to read the lookup tables?

It should be able to detect them, then. There might be something wrong with the formatting of your data file, so that those entities are not recognized in training.
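One formatting problem that is easy to overlook is a space between the closing bracket and the label, as in `[New Orleans] (city)`. Assuming, as suggested earlier in this thread, that such a space stops the annotation from being read as an entity, a rough check over the training lines could look like this. The names `SPACED` and `find_spaced_annotations` are hypothetical, and the sample lines are made up.

```python
import re

# A bracketed value, then whitespace, then a parenthesised label,
# e.g. "[New York] (city)". Assumption: the whitespace makes it invalid.
SPACED = re.compile(r"\[[^\]]+\]\s+\([^)]+\)")


def find_spaced_annotations(lines):
    """Yield (line_number, snippet) for every suspicious annotation."""
    for number, line in enumerate(lines, start=1):
        for match in SPACED.finditer(line):
            yield number, match.group(0)


# Made-up training lines; in practice you would read them from nlu.md.
sample = [
    "- I want to fly to [New York](city)",
    "- book a hotel in [New Orleans] (city)",
]
problems = list(find_spaced_annotations(sample))
print(problems)
```

Any line it reports is worth fixing by hand before retraining.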

Hello @Nexemics!

If the issue is solved, then that’s fine and all the best for your future endeavors!

If not, try this:

  1. Check your training data. It must be diverse and should contain a lot of examples for each intent so that Rasa generalizes over data/entities gracefully.
  2. Create a lookup table for cities (make sure the names are correctly formatted; lookup tables are case sensitive) and add it to the nlu.md training data.
  3. Train the model and then check if it works (ideally it should).
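To see why case and multi-word names matter in step 2, here is a small sketch of how a lookup table can be turned into a single word-boundary pattern. This is my own approximation in plain Python, not Rasa's code, and `build_lookup_pattern` is a hypothetical name. As far as I know, the matches serve as extra features for the entity extractor rather than as final entities, so annotated training examples are still needed.

```python
import re


def build_lookup_pattern(entries, case_sensitive=True):
    """Compile lookup entries into one alternation with word boundaries."""
    # Try longer entries first so a full name wins over a shorter prefix.
    ordered = sorted(entries, key=len, reverse=True)
    alternation = "|".join(r"\b" + re.escape(e) + r"\b" for e in ordered)
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(alternation, flags)


cities = ["Mumbai", "Delhi", "New York", "New Orleans", "Chicago"]
pattern = build_lookup_pattern(cities)

# Correct casing: the multi-word name is matched as a single unit.
print(pattern.findall("flights from New York to Chicago"))
# Lower-case input: nothing matches while the pattern is case sensitive.
print(pattern.findall("flights from new york to chicago"))
```

The space inside "New York" is just another character in the pattern, so multi-word names are not a problem in themselves. A mismatch in casing, however, silently produces no match.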

Make sure the lookup table is simple and contains just a list of texts, one entry per line, e.g.:

Mumbai
Delhi
New York
New Orleans
Chicago
…

Save it with a name like “city.txt” and add it at the very end of the nlu.md file like this:

## lookup:city
data/lookup_tables/city.txt

Here, lookup_tables is the folder that I created for storing all lookup txt files.
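If you want to rule out formatting problems in the lookup file itself, a quick sanity check along these lines might help. It is only a sketch: `check_lookup_file` is a hypothetical helper, and the temporary file below stands in for your real data/lookup_tables/city.txt.

```python
import tempfile
from pathlib import Path


def check_lookup_file(path):
    """Return a list of human-readable warnings for a lookup table file."""
    warnings = []
    seen = set()
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for number, raw in enumerate(lines, start=1):
        entry = raw.strip()
        if not entry:
            warnings.append(f"line {number}: blank line")
            continue
        if raw != entry:
            warnings.append(f"line {number}: leading or trailing whitespace")
        if entry in seen:
            warnings.append(f"line {number}: duplicate entry {entry!r}")
        seen.add(entry)
    return warnings


# Demo on a throwaway file with deliberate mistakes in it.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "city.txt"
    sample.write_text("Mumbai\nNew York \n\nMumbai\n", encoding="utf-8")
    report = check_lookup_file(sample)

for warning in report:
    print(warning)
```

An empty list means the file is a clean one-entry-per-line list. It does not prove that the model uses the table, only that the file is not the obvious culprit.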

Train the model as you normally would and test it out. If this doesn’t work, could you please share your repo’s link? I’ll be happy to help you out in that case. :)
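When testing, it helps to check the parsed entities programmatically instead of eyeballing them. The sketch below assumes the parse result is a dictionary with an "entities" list whose items carry "entity" and "value" keys. Both dictionaries are hand-written for illustration and are not real model output; `has_entity` is a hypothetical helper.

```python
def has_entity(parse_result, entity, value):
    """Return True if the parse result contains the given entity and value."""
    return any(
        item.get("entity") == entity and item.get("value") == value
        for item in parse_result.get("entities", [])
    )


# Hand-written example of the desired outcome (not real model output).
good_result = {
    "text": "I love New York",
    "entities": [
        {"entity": "city", "value": "New York", "start": 7, "end": 15},
    ],
}

# Hand-written example of the problem described in this thread.
split_result = {
    "text": "I love New York",
    "entities": [
        {"entity": "city", "value": "New", "start": 7, "end": 10},
        {"entity": "city", "value": "York", "start": 11, "end": 15},
    ],
}

print(has_entity(good_result, "city", "New York"))
print(has_entity(split_result, "city", "New York"))
```

Running a handful of multi-word city names through a check like this after each training run makes it easy to see whether a change to the data actually helped.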
