How to fix entity detection error in a form

wyq · July 21, 2022, 10:06am

In a form, every time the user types something, an intent is assigned to it. Most of the time this is not a problem, because I don’t need the intent or the entity; the slot is filled from the user's text. But sometimes I get a ValueError from NLU, “Entities have identical start but different end positions”, because one word is detected twice, the second time as part of another word. This error breaks the conversation.

My question is: how do I ignore this error inside a form that only fills the slot through a text mapping? I don't need the entities.

ltfschoen · December 4, 2022, 11:46am

I had the same issue when using DIETClassifier and RegexEntityExtractor at the same time. In the formbot example, when it asked “What cuisine?” and I answered “italian please”, I got:

ValueError("Entities '{'entity': 'feedback', 'start': 0, 'end': 14, 'value': 'italian please', 'extractor': 'RegexEntityExtractor'}' and '{'entity': 'cuisine', 'start': 0, 'end': 7, 'confidence_entity': 0.9998438358306885, 'value': 'italian', 'extractor': 'DIETClassifier'}' have identical start but different end positions")
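The condition named in that message can be illustrated with a small standalone sketch. This is not Rasa's actual implementation: `find_conflicting_entities` is a hypothetical helper that only mirrors the wording of the error (same start, different end), applied to the two entities from the message, with the confidence field left out for brevity.

```python
from itertools import combinations
from typing import Any, Dict, List, Tuple

Entity = Dict[str, Any]


def find_conflicting_entities(entities: List[Entity]) -> List[Tuple[Entity, Entity]]:
    """Return pairs of entities that share a start offset but end at different offsets.

    Hypothetical helper for illustration only; it is not part of Rasa.
    """
    conflicts = []
    for first, second in combinations(entities, 2):
        if first["start"] == second["start"] and first["end"] != second["end"]:
            conflicts.append((first, second))
    return conflicts


# The two entities reported in the error message.
entities = [
    {"entity": "feedback", "start": 0, "end": 14, "value": "italian please",
     "extractor": "RegexEntityExtractor"},
    {"entity": "cuisine", "start": 0, "end": 7, "value": "italian",
     "extractor": "DIETClassifier"},
]

conflicts = find_conflicting_entities(entities)
for first, second in conflicts:
    print(f"{first['extractor']} ({first['entity']}) overlaps "
          f"{second['extractor']} ({second['entity']})")
```

Both extractors start at character 0 but stop at different positions (14 and 7), which is exactly the situation the error describes.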

What’s confusing is that the Rasa docs here https://rasa.com/docs/rasa/nlu-training-data#regular-expressions-for-rule-based-entity-extraction say:

when using lookup tables with RegexEntityExtractor, provide at least two annotated examples of the entity so that the NLU model can register it as an entity at training time

[...] regular expression and at least two annotated examples in your training data

whereas here, on the Components page, it says:

Make sure to annotate at least one example per entity

So I checked my nlu.yml file and found it was missing an intent for the entity “feedback”, which was what was capturing my response, so I added it:

- intent: feedback
  examples: |
    - yes please
    - italian please

But elsewhere, on the Components page, the docs warn that using both DIETClassifier and RegexEntityExtractor may lead to duplicate or overlapping extractions, and they say:

If you use multiple entity extractors, we advise that each extractor targets an exclusive set of entity types

and they also provide two options:

ltfschoen · December 4, 2022, 12:10pm

I was able to resolve the error by following their Option 2 suggestion: I added more annotations to nlu.yml and updated my pipeline to keep RegexFeaturizer but remove RegexEntityExtractor, using only DIETClassifier to avoid the conflict between the two.

I still got a conflict between DIETClassifier and DucklingEntityExtractor when I responded to “How many people?” with “5 people”, which I resolved by removing DucklingEntityExtractor. But now it tries to capture slot{"num_people": "5 people"}, whereas the goal is to capture slot{"num_people": "5"}, so it sets the slot value to null and gives the error "Number of people should be a positive integer, please try again".
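One possible way around that last problem is to pull the number out of the raw text before checking it. This is a minimal sketch, assuming the slot is validated in custom Python code; `parse_num_people` is a hypothetical helper, not part of Rasa, and where it would be called from (for example a `validate_num_people` method in the form's validation action) is an assumption about the setup.

```python
import re
from typing import Optional


def parse_num_people(text: str) -> Optional[int]:
    """Return the first run of digits in the text as a positive integer, or None.

    Hypothetical helper for illustration; spelled-out numbers such as
    "five" are not handled.
    """
    match = re.search(r"\d+", text)
    if match is None:
        return None
    number = int(match.group())
    # Reject zero so that only positive integers are accepted.
    return number if number > 0 else None


num_people = parse_num_people("5 people")
print(num_people)
```

With a helper like this, the validation step could store the parsed integer in the slot instead of rejecting the full text "5 people".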
