Do we have to include all examples of an entity while training the NLU Model?

Hi Guys,

I am building a bot to handle travel-related queries. It will have intents for booking and getting information about hotels, flights, trains, etc. The queries will contain city names as an entity. Do I have to add training examples covering every city name, or is there a way to set up a processing pipeline where I add only a few cities to the training data and the pipeline automatically recognizes city names outside of it? The queries will look something like this:

  • Book a flight from london to new york
  • Show me hotels in miami
  • Train tickets for delhi

@akelad please help

@sainimohit23 You don’t need to cover every entity value in your training examples, as that list can be too long. But you should definitely try to cover all the “styles”/“patterns” in which your users would express that entity, just like the examples you have shown. For reference, check out the city entity here in the training data of our open-source assistant, Carbon Bot.
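To illustrate the point above, here is a minimal sketch of generating annotated examples by crossing a few phrasing patterns with a small sample of cities. The patterns, the city list, and the helper name `make_examples` are made up for illustration, and the `[value](entity)` annotation assumes Rasa's Markdown-style training data format; adapt it to whatever format your version uses.

```python
from itertools import product

# Hypothetical phrasing patterns; "{city}" marks where the entity goes.
PATTERNS = [
    "Book a flight to {city}",
    "Show me hotels in {city}",
    "Train tickets for {city}",
]

# A small sample of cities, deliberately not an exhaustive list.
CITIES = ["london", "new york", "miami", "delhi"]


def make_examples(patterns, cities, entity="city"):
    """Return one annotated example per (pattern, city) pair."""
    return [
        pattern.format(city=f"[{city}]({entity})")
        for pattern, city in product(patterns, cities)
    ]


examples = make_examples(PATTERNS, CITIES)
for line in examples:
    print(f"- {line}")
```

The idea is that variety in the sentence patterns matters more than the number of city values, so a handful of cities per pattern is usually a reasonable starting point.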

Thanks for the clarification. Is there any specific pipeline component that I have to include to detect the city entities? Will it be able to pick up Indian cities? If the entity classifier fails to detect them, is there any way to create some sort of matcher for the undetected cities?

I have another question regarding dates. How do I handle dates and times, given that they come in various forms?