Difficulty extracting multiple values of the same entity as separate entities in a single message

Received user message 'please add buffalo, ranch,mustard, barbeque sauces' with intent '{'name': 'inform', 'confidence': 0.8117992877960205}' and entities '[{'entity': 'sauces', 'start': 11, 'end': 43, 'value': 'buffalo, ranch,mustard, barbeque', 'extractor': 'DIETClassifier'}]'

How can we get them as four separate “sauces” entities, i.e. ‘buffalo’, ‘ranch’, ‘mustard’ and ‘barbeque’, instead of one single entity, ‘buffalo, ranch,mustard, barbeque’?

On the other hand, when the sauces are separated with “and”, they are correctly extracted as different entities.

Received user message 'please add mustard and ranch sauce' with intent '{'name': 'inform', 'confidence': 0.45487943291664124}' and entities '[{'entity': 'sauces', 'start': 11, 'end': 18, 'value': 'mustard', 'extractor': 'DIETClassifier'}, {'entity': 'sauces', 'start': 23, 'end': 28, 'value': 'ranch', 'extractor': 'DIETClassifier'}]'

Thanks.

Hi Noman,

Just for my understanding, could you share an example of your training data as well as your configuration file? It’s hard for me to point you in the right direction if I don’t know how your pipeline currently generates this output.

Hi @koaning, here is the config file:

language: en

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 50
  - name: EntitySynonymMapper
  - name: DucklingHTTPExtractor
    url: http://duckling:7000
    dimensions:
      - sys_time
      - sys_day
      - sys_month
      - sys_year
      - phone-number
      - number
      - amount-of-money
      - distance
      - duration
      - volume
      - ordinal
      - temperature
      - email
      - url
      - time

policies:
  - name: KerasPolicy
    epochs: 300
    batch_size: 20
    max_training_samples: 300
  - name: TwoStageFallbackPolicy
    nlu_threshold: 0.58
    core_threshold: 0.3
    fallback_core_action_name: action_default_fallback
    fallback_nlu_action_name: action_default_fallback
    deny_suggestion_intent_name: out_of_scope
  - name: MemoizationPolicy
  - name: FormPolicy
  - name: MappingPolicy

My training data for the “inform” intent is:

And for the “order” intent:

Just to check. How many examples do you have for the order intent with entities? Did you share all the examples or just these two?

I have more than 20, but I have shared only a few. The post “How to extract multiple values for one slot in same intent?” will give you some context on this problem and should make it easier to answer. Thanks.

OK clear. Then it means that we can start thinking about ways to help out our entity detection algorithm.

One thing that applies here is that the “sauces” entities are probably enumerable; that is, they come from a fixed set. This suggests that you might be able to get better results if you add a lookup table. The idea is that you can generate a feature for the pipeline to detect if there’s a match between a predefined list of sauces. This should help the entity detection algorithm a fair bit.

Have you tried this? It might help.

@koaning I understand part of your last comment, but I am lost from here onwards:

The idea is that you can generate a feature for the pipeline to detect if there’s a match between a predefined list of sauces. This should help the entity detection algorithm a fair bit.

Ah my bad. I’ll rephrase it to keep it simpler then. Do you use lookup tables? If you use lookup tables, you’ll be generating features for the machine learning pipeline that help it detect entities. It could be the help you need.

If it turns out to be helpful, I’ll expand on how these lookup tables work with some drawings.
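In the meantime, here is a minimal sketch of the idea in plain Python. It is not Rasa's implementation (in the pipeline above, lookup tables would normally be picked up by the RegexFeaturizer); the `SAUCES` list, the `lookup_features` helper and the simplified tokenization are all assumptions made for illustration. It only shows what "a feature that flags a match against a predefined list" means.

```python
# Hypothetical lookup list; in a real project this would come from
# the lookup table in the training data.
SAUCES = {"buffalo", "ranch", "mustard", "barbeque"}


def lookup_features(message, lookup=SAUCES):
    """Return (token, flag) pairs, where flag is 1 if the token is in the lookup list."""
    # Simplified tokenization (an assumption): treat commas as whitespace.
    tokens = message.replace(",", " ").split()
    return [(token, int(token.lower() in lookup)) for token in tokens]


for token, flag in lookup_features("please add mustard and ranch sauce"):
    print(token, flag)
```

A per-token flag like this tells the model which tokens look like sauces. It does not, by itself, say where one entity ends and the next one begins.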

@koaning Oh, I got you. Yes, I do use lookup tables, and yes, they do generate features for the pipeline. But I believe that only helps with extracting the entity. How can it help with not merging several values of the same entity into one single entity?

As @Tanja suggested in “Unable to classify multiple examples of the same entity. Please help”, we can handle this scenario with “and” for now. So I am thinking of preprocessing the user message and replacing the ‘,’ or space between the sauces with “and” before it reaches the bot, so that separate entities get extracted. But I think it could hurt intent prediction. What do you think?
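For reference, the comma part of that preprocessing could look roughly like the sketch below. The `commas_to_and` name is hypothetical, and it only handles commas; replacing a bare space between two sauces would additionally need a list of known sauce names.

```python
import re


def commas_to_and(message: str) -> str:
    """Replace every comma (plus surrounding whitespace) with ' and '."""
    return re.sub(r"\s*,\s*", " and ", message)


print(commas_to_and("please add buffalo, ranch,mustard, barbeque sauces"))
```

Note that this rewrites every comma in the message, not just the ones between sauces, so the text the NLU model sees can differ from what it was trained on.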

I learned something today: I wasn’t aware that we merge labelled tokens together as a single entity.

What do you do with the entities once they are detected? Are they used in a custom action? If so, you might also be able to put some logic there to separate the sauces.
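A rough sketch of what that logic could look like, written as a plain function over the entity dicts shown in the logs above. The `split_merged_entity` name and the separator rule (commas or the word "and") are assumptions, and it presumes that `value` is the exact text between `start` and `end`, which may not hold if a synonym has been mapped.

```python
import re

# Separators between values: a comma or the word "and", with optional whitespace.
SEPARATOR = re.compile(r"\s*(?:,|\band\b)\s*")


def split_merged_entity(entity):
    """Split one merged entity dict into one dict per separated value."""
    value = entity["value"]
    # Collect the (begin, end) spans of the text between separators.
    spans, cursor = [], 0
    for sep in SEPARATOR.finditer(value):
        spans.append((cursor, sep.start()))
        cursor = sep.end()
    spans.append((cursor, len(value)))

    result = []
    for begin, end in spans:
        text = value[begin:end]
        stripped = text.strip()
        if not stripped:
            continue  # skip empty pieces, e.g. between "," and "and"
        # Shift the offset past any leading whitespace in this piece.
        start = entity["start"] + begin + (len(text) - len(text.lstrip()))
        result.append({**entity, "start": start,
                       "end": start + len(stripped), "value": stripped})
    return result


merged = {"entity": "sauces", "start": 11, "end": 43,
          "value": "buffalo, ranch,mustard, barbeque",
          "extractor": "DIETClassifier"}
for piece in split_merged_entity(merged):
    print(piece["value"], piece["start"], piece["end"])
```

Inside a custom action, the same function could be applied to each “sauces” entity from the latest message before the values are stored.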

@koaning I also didn’t know about it 😛 I searched the forum and that is how I found out. Anyway, I think it’s a good idea to separate the entities after the user message has been processed by the NLU, rather than adding an “and” beforehand and then passing the message to the bot. Thanks 🙂