Best method for detecting product name entities?

Product names on e-commerce sites are typically very long, e.g. “EYEBOGLER V-Neck Shawl Collar Stylish Men’s Solid T-Shirt”.

Users might not give the entire product name in a conversation, and typos are to be expected as well.

Option 1: Use lots of training examples for the model to learn from. The problem is that the model might overfit to the programmatically generated examples.

Option 2: Use lookup tables that list all product names, matched with a regex. The problem here is that once we factor in the ways a user might utter a product name (only some parts of the name, typos, etc.), the list can grow very large.
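To give a rough sense of how quickly such a lookup list grows, here is a minimal sketch that counts the partial forms of the single example name above, before any typos are considered. The helper names `contiguous_phrases` and `word_subsets` are hypothetical and only for illustration.

```python
from itertools import combinations


def contiguous_phrases(name):
    """Every run of consecutive words from the product name."""
    words = name.split()
    return {
        " ".join(words[i:j])
        for i in range(len(words))
        for j in range(i + 1, len(words) + 1)
    }


def word_subsets(name):
    """Every in-order selection of words, allowing the user to skip words."""
    words = name.split()
    return {
        " ".join(combo)
        for r in range(1, len(words) + 1)
        for combo in combinations(words, r)
    }


name = "EYEBOGLER V-Neck Shawl Collar Stylish Men's Solid T-Shirt"

print(len(contiguous_phrases(name)))  # 36
print(len(word_subsets(name)))        # 255

# A phrase that skips the middle words is only covered by the larger set.
print("V-Neck Shawl Collar T-Shirt" in contiguous_phrases(name))  # False
print("V-Neck Shawl Collar T-Shirt" in word_subsets(name))        # True
```

So one eight-word name already needs up to 255 lookup entries just to cover skipped words, and that number is multiplied again by every typo variant and every product in the catalogue.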

Which option is better, and is there any other way of solving this?

Hi @dingusagar

Looks like an interesting problem. May I know the scenario in which a user might utter the product names?

What does the conversation flow look like?

The user asks, for example: “what is the price for V-Neck Shawl Collar T-Shirt in store xyz”

Another query: “how many reviews for V-Neck Shawl Collar T-Shirt”

Hi @dingusagar

I would suggest breaking the product name down into multiple entities, as below:

  • collar_type
  • shirt_type
  • neck_type

Once you have extracted these entities, perform a keyword search over your product name list to narrow it down to the exact product.

Sometimes you might end up with more than one product, in which case you can confirm with the user by listing each product name as a button.
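A minimal sketch of that keyword search, assuming the entities have already been extracted into a dictionary. The catalogue below is hypothetical (only the first name comes from this thread), and `keyword_search` is an illustrative helper, not part of any library.

```python
def keyword_search(product_names, keywords):
    """Return the product names that contain every keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [
        name for name in product_names
        if all(k in name.lower() for k in wanted)
    ]


# Hypothetical catalogue; only the first entry appears in the thread.
PRODUCTS = [
    "EYEBOGLER V-Neck Shawl Collar Stylish Men's Solid T-Shirt",
    "ACME V-Neck Shawl Collar Casual Men's Striped T-Shirt",
    "ACME Round Neck Women's Printed T-Shirt",
]

# Values an entity extractor might return for
# "what is the price for V-Neck Shawl Collar T-Shirt in store xyz".
entities = {
    "neck_type": "V-Neck",
    "collar_type": "Shawl Collar",
    "shirt_type": "T-Shirt",
}

matches = keyword_search(PRODUCTS, entities.values())

if len(matches) == 1:
    print("Found:", matches[0])
else:
    # Several candidates: list them so the user can pick one (e.g. as buttons).
    for i, name in enumerate(matches, 1):
        print(f"{i}. {name}")
```

With this catalogue the query matches the first two products, so the bot would list both and ask the user to choose.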

Hi @siriusraja, thanks for the reply.

Breaking the product name down into different entities works for a particular category like T-shirts. But how do we make it scale to all sorts of product categories, for example smartphones, earphones, clothing, etc.?

Hi @dingusagar

How many categories are there? And can you share the list of product names so that I can think of a more efficient way?

Hi, sorry for the late reply.

I am looking for a general entity recognition solution for e-commerce product names across various categories. For example, let's consider the Flipkart products available in this public dataset: Flipkart Products | Kaggle

Hi @dingusagar

In that case, I would build a common set of entities and look for them in the user utterance:

  • color
  • gender
  • product
  • product_model
  • brand_name

(EYEBOGLER)[brand_name] (V-Neck)[product_model] Shawl Collar Stylish (Men’s)[gender] Solid (T-Shirt)[product]

Based on the extracted values, perform a keyword search on the product table. If the search returns multiple product names, show them to the user and ask for confirmation.
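Since the user may give only some of these entities and may misspell them, one possible way to do the search is to rank products by how many extracted values they approximately contain. The sketch below is an assumption on my side, not something prescribed above: it uses `difflib.get_close_matches` from the Python standard library for typo tolerance, the `score` and `search` helpers and the 0.85 cutoff are hypothetical, and the catalogue is made up apart from the first name.

```python
import difflib


def score(product_name, entities, cutoff=0.85):
    """Count the entity values that (approximately) appear in the product name."""
    tokens = product_name.lower().split()
    hits = 0
    for value in entities.values():
        words = value.lower().split()
        # Every word of the value needs a close match among the name's tokens.
        if all(difflib.get_close_matches(w, tokens, n=1, cutoff=cutoff) for w in words):
            hits += 1
    return hits


def search(product_names, entities):
    """Return the product names sharing the highest non-zero score."""
    scored = [(score(name, entities), name) for name in product_names]
    best = max((s for s, _ in scored), default=0)
    return [name for s, name in scored if s == best and s > 0]


# Hypothetical product table; only the first entry appears in the thread.
PRODUCTS = [
    "EYEBOGLER V-Neck Shawl Collar Stylish Men's Solid T-Shirt",
    "ACME V-Neck Shawl Collar Casual Men's Striped T-Shirt",
    "ACME Round Neck Women's Printed T-Shirt",
]

# Brand name typed with a typo: still narrows down to a single product.
with_brand = {"brand_name": "EYEBOGLAR", "product_model": "V-Neck", "product": "T-Shirt"}
print(search(PRODUCTS, with_brand))

# No brand given: two candidates remain, so ask the user to confirm.
without_brand = {"product_model": "V-Neck", "product": "tshirt"}
candidates = search(PRODUCTS, without_brand)
if len(candidates) > 1:
    print("Which one did you mean?")
    for i, name in enumerate(candidates, 1):
        print(f"{i}. {name}")
```

The cutoff is a trade-off: at 0.85 "EYEBOGLAR" still matches "EYEBOGLER", while "V-Neck" no longer matches the "Neck" in "Round Neck". It would need tuning against real product names.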