Unable to classify multiple examples of the same entity. Please help

shubhamnatraj · May 14, 2020, 12:50pm

This is an example of what my training data looks like

## intent:ask_ingredients

- my ingredients are [tomato](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredints are [beans](ingredients)  and [baguette](ingredients) 

- i have [banana](ingredients)  [baking bar](ingredients)  and [beer](ingredients) 

- in my fridge there is [chia seeds](ingredients)  and [chestnuts](ingredients) 

- i have ingredients that are [cheese](ingredients)  [cereal](ingredients) 

- my ingredients are [yogurt](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredients are [wine](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredients are [walnuts](ingredients)  [raw shrimp](ingredients)  [berries](ingredients) 

- my ingredients are [turkey](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredients are [toast](ingredients)  [red wine](ingredients)  [berries](ingredients) 

- my ingredients are [steaks](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredients are [sugar](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredients are [spaghetti](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredients are [seeds](ingredients)  [pumpkin](ingredients)  [berries](ingredients) 

- my ingredients are [salsa](ingredients)  [bacon](ingredients)  [pistachios](ingredients) 

- my ingredients are [wine](ingredients)  [bacon](ingredients)  [berries](ingredients)

- my ingredients are [walnuts](ingredients)  [raw shrimp](ingredients)  [berries](ingredients) 

- my ingredients are [turkey](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredients are [toast](ingredients)  [red wine](ingredients)  [fish](ingredients) 

- my ingredients are [steaks](ingredients)  [bacon](ingredients)  [garlic](ingredients) 

- my ingredients are [sugar](ingredients)  [bacon](ingredients)  [ginger](ingredients) 

- my ingredients are [spaghetti](ingredients)  [bacon](ingredients)  [berries](ingredients) 

- my ingredients are [seeds](ingredients)  [pumpkin](ingredients)  [berries](ingredients) 

- my ingredients are [lemon](ingredients)  [ketchup](ingredients)  [juice](ingredients)

This is an example of the output I am getting

my ingredients are bacon berries cheese
{
  "intent": {
    "name": "ask_ingredients",
    "confidence": 0.9999998807907104
  },
  "entities": [
    {
      "entity": "ingredients",
      "start": 19,
      "end": 39,
      "value": "bacon berries cheese",
      "extractor": "DIETClassifier"
    },
    {
      "entity": "ingredients",
      "start": 19,
      "end": 39,
      "confidence_entity": 0.7563967162518996,
      "value": "bacon berries cheese",
      "extractor": "CRFEntityExtractor"
    }
  ],

I tried both DIET and CRF, but both give me the same result. Why are my entities not being recognized separately as three ingredients?

What could I do so that they are classified correctly?

koaning · May 18, 2020, 6:56am

I can’t say I know what is happening internally. I observe that there are two spaces in your training data instead of one, so that might be causing some confusion. But this does look like strange behavior.

I can point you towards something that might help in the meantime though: lookup tables!

Here’s an example from the pokedex demo.

## intent:confirm_exists
- is [bulbasaur](pokemon_name) a pokemon
- does [ninetails](pokemon_name) exist
- ever heard of [pikachu](pokemon_name)

## lookup:pokemon_name
  data/pokenames.txt

The idea is that the textfile contains a long list of things to match against and this may make it easier to detect the right ingredients.
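As a rough illustration of the matching idea, here is a minimal Python sketch that compiles a list of entries into one case-insensitive pattern. The helper `build_lookup_pattern` and the ingredient list are hypothetical, and this is not how Rasa is implemented internally; as far as I know, Rasa feeds lookup matches to the entity extractor as features rather than using them as final answers.

```python
import re
from typing import List


def build_lookup_pattern(entries: List[str]) -> re.Pattern:
    """Compile lookup entries into a single case-insensitive regex (hypothetical helper)."""
    # Put longer entries first so "red wine" is preferred over "wine".
    ordered = sorted(entries, key=len, reverse=True)
    alternatives = "|".join(re.escape(entry) for entry in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


# Illustrative entries; a real lookup file would hold one entry per line.
pattern = build_lookup_pattern(["bacon", "berries", "cheese", "wine", "red wine"])
matches = [m.group() for m in pattern.finditer("my ingredients are bacon berries red wine")]
print(matches)  # ['bacon', 'berries', 'red wine']
```

Each match is found on its own here, even without a separator word between the ingredients.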

shubhamnatraj · May 18, 2020, 2:11pm

Thanks for the response @koaning

I am definitely using lookup tables in places where I have entities that have a discrete set of values they can take. I didn’t think of using it here but I will look into the possibility.

Could you tell me whether this behavior is expected, or should the model be able to pick up each entity separately?

@akelad @Tanja would you have any ideas about this?

koaning · May 18, 2020, 2:20pm

In demos that I have done with that pokedex bot, I’ve seen that it picks up the entities separately, albeit with the word “and” in between as a separator. This does strike me as unexpected behavior, but I cannot immediately pinpoint what is causing it.

Tanja · May 18, 2020, 2:26pm

This behaviour is actually expected. If you have multiple tokens next to each other and they all have the same predicted entity type, we assume that they belong together. For example, if you have an entity extractor that extracts cities, you want “San Francisco” to be classified as one city and not as two cities, “San” and “Francisco”. So, if you add the word “and” to your sentence (in between the ingredients), the ingredients are picked up separately.
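The merging rule described above can be sketched in a few lines of Python. This is only an illustration of the idea under my own assumptions, not Rasa's actual code; `merge_adjacent` and the token/label pairs are made up for the example.

```python
from typing import List, Optional, Tuple


def merge_adjacent(tokens: List[Tuple[str, Optional[str]]]) -> List[Tuple[str, str]]:
    """Join neighbouring tokens that share an entity label into one entity."""
    merged: List[Tuple[str, str]] = []
    previous_label: Optional[str] = None
    for text, label in tokens:
        if label is not None and label == previous_label:
            # Same label as the token right before: extend the current entity.
            last_text, _ = merged[-1]
            merged[-1] = (f"{last_text} {text}", label)
        elif label is not None:
            # A labelled token after an unlabelled one starts a new entity.
            merged.append((text, label))
        previous_label = label
    return merged


no_separator = [("my", None), ("ingredients", None), ("are", None),
                ("bacon", "ingredients"), ("berries", "ingredients"),
                ("cheese", "ingredients")]
with_and = [("i", None), ("have", None), ("bacon", "ingredients"),
            ("and", None), ("cheese", "ingredients")]

print(merge_adjacent(no_separator))  # [('bacon berries cheese', 'ingredients')]
print(merge_adjacent(with_and))      # [('bacon', 'ingredients'), ('cheese', 'ingredients')]
```

The unlabelled “and” token breaks the run of identical labels, which is why it keeps the ingredients apart.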

shubhamnatraj · July 11, 2020, 11:59am

@noman here’s the post I was referring to. It is expected behavior.

noman · July 11, 2020, 12:08pm

Thanks @shubhamnatraj for pointing me to this. @Tanja, this default behavior does not seem to handle the following scenario:
Received user message 'please add buffalo, ranch,mustard, barbeque sauces' with intent '{'name': 'inform', 'confidence': 0.8117992877960205}' and entities '[{'entity': 'sauces', 'start': 11, 'end': 43, 'value': 'buffalo, ranch,mustard, barbeque', 'extractor': 'DIETClassifier'}]'

Here we want to get them as four separate “sauces” entities, i.e. ‘buffalo’, ‘ranch’, ‘mustard’ and ‘barbeque’, instead of one single entity ‘buffalo, ranch,mustard, barbeque’.

How can we achieve this? Thanks

Tanja · July 13, 2020, 7:48am

@noman You are right, separating entities by comma does not work at the moment. I will create a fix for it soon. For now, please use “and”.
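Until a fix lands, one possible stopgap is to split the merged entity yourself after extraction, for instance inside a custom action. The sketch below is an assumption on my part, not a Rasa feature: `split_entity` is a hypothetical helper that splits the value on commas and recomputes the character offsets.

```python
import re
from typing import Any, Dict, List


def split_entity(entity: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Split one comma-separated entity into several (hypothetical helper)."""
    parts: List[Dict[str, Any]] = []
    for match in re.finditer(r"[^,]+", entity["value"]):
        raw = match.group()
        value = raw.strip()
        if not value:
            continue  # skip empty pieces such as the gap in "a, ,b"
        # Shift the start past any leading whitespace in this piece.
        leading = len(raw) - len(raw.lstrip())
        start = entity["start"] + match.start() + leading
        parts.append({**entity, "value": value, "start": start, "end": start + len(value)})
    return parts


message = "please add buffalo, ranch,mustard, barbeque sauces"
merged = {"entity": "sauces", "start": 11, "end": 43,
          "value": "buffalo, ranch,mustard, barbeque", "extractor": "DIETClassifier"}

sauces = split_entity(merged)
print([s["value"] for s in sauces])  # ['buffalo', 'ranch', 'mustard', 'barbeque']
```

Note that this would also split a genuine single entity that happens to contain a comma, so it only suits entity types where that cannot occur.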
