Slot unexpectedly filled with an array when an entity is detected / extracted multiple times

digitalWestie · October 21, 2020, 11:16am

I have Rasa 2.0 (from RasaHQ Docker image) set up with the following pipeline in config:

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en
pipeline:
  - name: SpacyNLP
    model: "en_core_web_sm"
  - name: SpacyTokenizer
  - name: custom_components.SimpleNameExtractor
  - name: SpacyEntityExtractor
    dimensions: ["PERSON"] #https://spacy.io/api/annotation#section-named-entities
  - name: SpacyFeaturizer
    pooling: mean
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

I’m using Spacy in combo with my own custom component to extract names. My custom component catches the unusual names Spacy doesn’t manage to get. Everything works as expected.

At times both components extract the same PERSON entity. That's useful confirmation, but the name slot then gets filled with an array rather than a single name string. For example:

Input: “My name is Albert”

Gives output:

{
  "text": "My name is Albert",
  "intent": {
    "id": -4074871007728436796,
    "name": "inform",
    "confidence": 0.99996018409729
  },
  "entities": [
    {
      "value": "albert",
      "confidence_entity": 0.6000000000000001,
      "entity": "PERSON",
      "start": 11,
      "end": 17,
      "extractor": "simple_name_extractor"
    },
    {
      "entity": "PERSON",
      "value": "Albert",
      "start": 11,
      "confidence": null,
      "end": 17,
      "extractor": "SpacyEntityExtractor"
    }

...

When I use interactive story builder, it fills my name slot like so:

show_name_form 1.00
      active_loop{"name": "show_name_form"}
      slot{"requested_slot": "name"}
      What's your name? Or nickname if you prefer?
      slot{"name": ["albert", "Albert"]}
      slot{"requested_slot": null}
      active_loop{"name": null}
      utter_greet_name 1.00
      Nice to meet you ['albert', 'Albert']!

I suppose I am missing a step that decides which entity should be selected when more than one component identifies it. I'm not sure where to do this. Could someone point me in the right direction?

Edit: For further clarification, the name slot is set in the form like so:

forms:
  show_name_form:
    name:
    - type: from_entity
      entity: PERSON
      not_intent:
      - greet
      - out_of_scope
      - clarification

digitalWestie · October 21, 2020, 11:39am

I had a read through the code. It looks like this is the logic that fills the slot with an array of entities, a single value, or None if none are found:

https://github.com/RasaHQ/rasa/blob/0f94645f1a84c745e6131674a9c9fc3404608544/rasa/core/actions/forms.py#L251


        role: optional entity role of interest
        group: optional entity group of interest

    Returns:
        Value of entity.
    """
    # list is used to cover the case of list slot type
    value = list(
        tracker.get_latest_entity_values(name, entity_group=group, entity_role=role)
    )
    if len(value) == 0:
        value = None
    elif len(value) == 1:
        value = value[0]
    return value

def extract_other_slots(
    self, tracker: DialogueStateTracker, domain: Domain
) -> Dict[Text, Any]:
    """Extract the values of the other slots
    if they are set by corresponding entities from the user input

I guess I didn’t expect this. I’m assuming I can override this in my own form?
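
To make the behaviour concrete, here is a minimal standalone sketch of the same branching, using plain lists instead of a tracker. The function name collapse_entity_values is made up for illustration and is not part of Rasa; only the three-way branching is taken from the excerpt above.

```python
from typing import Any, List, Optional


def collapse_entity_values(values: List[Any]) -> Optional[Any]:
    """Mimic the excerpt: None for no values, the value itself for one, the list otherwise."""
    if len(values) == 0:
        return None
    if len(values) == 1:
        return values[0]
    # Two or more extracted values are passed through as a list.
    return values


print(collapse_entity_values([]))
print(collapse_entity_values(["Albert"]))
print(collapse_entity_values(["albert", "Albert"]))
```

Since two extractors each report a PERSON entity for "My name is Albert", the third case applies and the slot receives the whole list.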

digitalWestie · October 21, 2020, 1:17pm

Success! I ended up adding a custom validation action to select a single value. Ideally it would select the value with the highest confidence score found in tracker.latest_message['entities'], but for now it simply takes the last value in the list.

This is the first time I've created a custom action and a component. Does this approach make sense to everyone?

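import json
import logging
from typing import Any, Dict, Text

from rasa_sdk import FormValidationAction, Tracker
from rasa_sdk.executor import CollectingDispatcher
from rasa_sdk.types import DomainDict

logger = logging.getLogger(__name__)


class ValidateShowNameForm(FormValidationAction):
    def name(self) -> Text:
        return "validate_show_name_form"

    def validate_name(
        self,
        slot_value: Any,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: DomainDict,
    ) -> Dict[Text, Any]:
        # When several extractors find the entity, the slot arrives as a list.
        if isinstance(slot_value, list):
            # Keep only the last extracted value.
            slot_value = slot_value[-1]
            logger.info("Multiple possible values extracted for name value, using last in pipeline")
            logger.info(json.dumps(tracker.latest_message["entities"]))

        return {"name": slot_value}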

Juste · November 9, 2020, 1:25pm

Hi @digitalWestie. Thank you for sharing your solution here, I am sure it will be very useful for other community members.

harshit-sysquo · March 16, 2021, 11:19am

I used @digitalWestie's approach and added a piece of code to pick the entity with the highest confidence:

        maxConf = 0.0
        val = ""
        if type(slot_value) is list:
            for entity in tracker.latest_message['entities']:
                if entity['confidence_entity'] >= maxConf:
                    maxConf = entity['confidence_entity']
                    val = entity['value']
            slot_value = val

Hope this helps.

Vin · August 3, 2021, 8:14am

@harshit-sysquo Thanks! It does not work, though, when one of the extractors is RegexEntityExtractor (lookup tables), which provides no confidence values.
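
One possible way around the missing confidence values is sketched below. It is only a sketch under stated assumptions: the helper names entity_confidence and pick_entity_value are hypothetical, a missing or null confidence is treated as 0.0, and ties go to the later entity. It works on the plain list of entity dicts shown in the first post, so it can be tried without Rasa installed.

```python
from typing import Any, Dict, List, Optional


def entity_confidence(entity: Dict[str, Any]) -> float:
    """Return the reported confidence, or 0.0 if the extractor gave none (assumption)."""
    for key in ("confidence_entity", "confidence"):
        score = entity.get(key)
        if isinstance(score, (int, float)):
            return float(score)
    return 0.0


def pick_entity_value(entities: List[Dict[str, Any]], entity_name: str) -> Optional[Any]:
    """Pick the value of the highest-confidence entity of the given type."""
    best_value = None
    best_score = -1.0
    for entity in entities:
        # Ignore entities of other types that appear in the same message.
        if entity.get("entity") != entity_name:
            continue
        score = entity_confidence(entity)
        if score >= best_score:  # on a tie, the later entity wins
            best_score = score
            best_value = entity.get("value")
    return best_value


entities = [
    {"entity": "PERSON", "value": "albert", "confidence_entity": 0.6,
     "extractor": "simple_name_extractor"},
    {"entity": "PERSON", "value": "Albert", "confidence": None,
     "extractor": "SpacyEntityExtractor"},
]
print(pick_entity_value(entities, "PERSON"))
```

Inside validate_name, this could be called as pick_entity_value(tracker.latest_message["entities"], "PERSON") whenever slot_value is a list. Whether a score of 0.0 is the right default for extractors without confidence is a judgment call; ranking by extractor name instead would be another option.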

digitalWestie · August 6, 2021, 9:52am

Nice one
