If I want to include a ‘search’ intent (which will query the store’s internal semantic search engine via API), I have to include all the objects from the store’s catalog in the training data. Sometimes I define them as individual words (‘entities’). There are only a few phrasings for asking about an item in the store (e.g. ‘I would like to buy x’), but there are many objects (entities). A store can have 10,000 items. I want the assistant to know that when a user writes ‘I’m looking for a wardrobe’, it should recognize the search intent, and when the user writes ‘I’m looking for the God’, it should recognize the bot challenge intent. If I expand the search intent training data using the shop catalog, the intent classes will be very unbalanced. How should I deal with this? Is it a good idea to treat the names of items in the store as synonyms? From a business perspective, I have to be sure the assistant reliably tells the search intent apart from the other intents.
If you have a semantic engine that can identify products from a sentence, then I would not use a CRF or lookup table. Personally, I feel that adding products to the training data would create an imbalanced dataset, especially if you need to add 10k products.
Instead, I would approach it using FormActions or a CustomAction: use the message tokens to look for the products mentioned and fill in a slot. Alternatively, create an NLU component that looks up the database to find product keywords mentioned in the message and fills in the entity.
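To make the idea concrete, here is a minimal sketch of the lookup step in plain Python, independent of any Rasa API. The catalog contents, the slot name `product`, and the function names `find_products` and `fill_product_slot` are all made up for illustration; in a real setup the catalog would come from your database or search engine, and the result would be returned from your custom action or NLU component.

```python
import re

# Hypothetical catalog; in practice this would be loaded from the
# store's database or queried through the search API.
CATALOG = {"wardrobe", "coffee table", "desk lamp"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def find_products(message, catalog=CATALOG, max_words=3):
    """Return the catalog items mentioned in the message, in order."""
    tokens = tokenize(message)
    found = []
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest n-gram first so multi-word names match as a whole.
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in catalog:
                found.append(candidate)
                match_len = n
                break
        # Skip past the matched words, or advance by one token if no match.
        i += match_len or 1
    return found


def fill_product_slot(message):
    """Return a slot mapping for the first product found, or an empty dict."""
    products = find_products(message)
    return {"product": products[0]} if products else {}
```

With this kind of lookup, the training data only needs the handful of phrasings for the search intent, and the 10k product names stay out of the NLU data. A message with no catalog match (such as ‘I’m looking for the God’) leaves the slot empty, which you could use as an extra signal before calling the search API.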
Souvik, that makes sense to me. Thank you for the help.