Wrong Mapping: Right intent but wrong entities

Hey guys,

I am using only rasa_nlu, with approximately 20 training examples for each intent and entity. These examples are primarily the output of Linux shell commands.

My issue is that the model classifies the correct intent, say “intent1”, but maps it to a different entity name, even though the entity value is correct. For example, suppose there are two intents, “intent1” and “intent2”, with respective entities “entity1” and “entity2” and values “1234” and “5432”. The correct output should be

    intent1:{entities:["entity1"="1234"]}
    intent2:{entities:["entity2"="5432"]}

but instead it comes out as

    intent1:{entities:["entity1"="5432"]}

or

    intent1:{entities:["entity2"="1234"]}

How can I resolve this? I have looked into domain.yaml, but that is for Rasa Core. Could this also be resolved with a custom component?

This is my pipeline:

    language: "en_core_web_md"
    pipeline:
      - name: "SpacyNLP"
      - name: "SpacyTokenizer"
      - name: "SpacyFeaturizer"
      - name: "CRFEntityExtractor"
      - name: "SklearnIntentClassifier"

Hi @GTroxx,

It could be that your entities are too similar. There is a discussion and some suggestions in this forum post: Similar Entity Extraction

Would you be able to share an example of your training data on the forum or perhaps through github? That would allow for a more detailed evaluation of your issue.

Here is my sample training data for get disk:

    "text": "77      Virtual HD                                                                                                   Healthy              Online                       500 GB MBR       ",
    "intent": "getDisks",
    "entities": [
      {
        "start": 0,
        "end": 2,
        "value": "77",
        "entity": "diskNumber"
      },
      {
        "start": 8,
        "end": 18,
        "value": "Virtual HD",
        "entity": "friendlyName"
      },
      {
        "start": 117,
        "end": 124,
        "value": "Healthy",
        "entity": "diskHealthStatus"
      },
      {
        "start": 138,
        "end": 144,
        "value": "Online",
        "entity": "diskOperationalStatus"
      },
      {
        "start": 167,
        "end": 173,
        "value": "500 GB",
        "entity": "diskSize"
      },
      {
        "start": 174,
        "end": 177,
        "value": "MBR",
        "entity": "partitionType"
      }
    ]
  },

And for get process:

    "text": "2532      94  0.03125  0.001608712 conhost",
    "intent": "getProcess",
    "entities": [
      {
        "start": 0,
        "end": 4,
        "value": "2532",
        "entity": "processID"
      },
      {
        "start": 10,
        "end": 12,
        "value": "94",
        "entity": "processHandles"
      },
      {
        "start": 14,
        "end": 21,
        "value": "0.03125",
        "entity": "cpu(%)"
      },
      {
        "start": 23,
        "end": 34,
        "value": "0.001608712",
        "entity": "memory(%)"
      },
      {
        "start": 35,
        "end": 42,
        "value": "conhost",
        "entity": "processName"
      }
    ]
  },

The model confuses diskNumber and processID.

Hey, so I have already shared my training data. I was wondering whether I could solve this problem with the domain.yml of Rasa Core, where I specify which entities to use for which intent, so that the mapping of entity values to entity names depends on the intent. For example, if I put this in my domain:

intents:
  - getProcess:
      use_entities:
        - processID
  - getDisk:
      use_entities:
        - diskNumber

Can I use Rasa Core only for custom outputs? I don’t need the dialogue engine.

Hi @GTroxx,

I was thinking that for your use case it might be better not to use entity extraction to get these values.

Instead, I recommend using a custom action that extracts the values from the string and then stores the extracted values in slots.
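
Because these command outputs have a fixed column layout, the extraction suggested above can be done with plain string handling once the intent is known. The following is only a minimal sketch: it assumes the disk columns are separated by two or more spaces and the process columns by whitespace, the function names `parse_disk_row` and `parse_process_row` are hypothetical, and the field names are the entity names from the training data in this thread.

```python
import re

# Field names copied from the training data shared in this thread.
DISK_FIELDS = ["diskNumber", "friendlyName", "diskHealthStatus",
               "diskOperationalStatus", "diskSize", "partitionType"]
PROCESS_FIELDS = ["processID", "processHandles", "cpu(%)", "memory(%)",
                  "processName"]


def parse_disk_row(row):
    """Parse one disk row; assumes columns are separated by 2+ spaces."""
    columns = re.split(r"\s{2,}", row.strip())
    if len(columns) != 5:
        raise ValueError(f"unexpected disk row: {row!r}")
    # Assumption: size and partition type share the last column ("500 GB MBR").
    size, partition = columns[4].rsplit(" ", 1)
    return dict(zip(DISK_FIELDS, columns[:4] + [size, partition]))


def parse_process_row(row):
    """Parse one process row; assumes whitespace-separated columns."""
    # At most 5 parts, so a process name containing spaces stays in one piece.
    columns = row.strip().split(None, 4)
    if len(columns) != 5:
        raise ValueError(f"unexpected process row: {row!r}")
    return dict(zip(PROCESS_FIELDS, columns))
```

The predicted intent (getDisks or getProcess) would decide which parser to call, and the returned dictionary could then be written to slots by the custom action.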

Hi! I ran into the same problem. Until now I assumed that if an intent has an entity with a unique name, the entity extractor would try to extract only the entities of the predicted intent (and then fill the slot with the same name). But as I realized, and as I read in @GTroxx's post, the entity extractor can extract an entity that is not even present in the NLU training data of that intent. In my case, I have:

NLU training data:

## intent:send_message
- send message to [Will Smith](recipient)
- etc..

## intent:play_artist
- play [Bon Jovi](artist)
- etc..

When the user input is “play Johnny Cash”, the intent play_artist is predicted, but the entity recipient is extracted and filled into the slot recipient. I’m worried that in this case extracting the value from the string in a custom action is not possible, since the user input may vary, e.g. “please play artist Johnny Cash”, “I wanna play songs by Johnny Cash” etc. The entity extractor, on the other hand, handles these variations well; unfortunately, it assigns the value to the wrong entity.

@Arjaan, is there another way to solve this? E.g. force the entity extractor to extract only entities that belong to the predicted intent?

Hi @lukasch, in your case I would not annotate the entities, but use spaCy to extract the entity PERSON.

Your NLU training data just looks like this:

## intent:send_message
- send message to Will Smith
- etc..

## intent:play_artist
- play Bon Jovi
- etc..

You can specify this in your config.yml

language: en
pipeline:
  ...
  - name: "SpacyNLP"
    model: "en_core_web_md"
  - name: "SpacyEntityExtractor"
    dimensions: ["PERSON"]

The entity PERSON will be extracted in both cases, and you then use that entity in your responses or in your custom actions. For example, you can create two separate responses in your domain.yml:

responses:
  utter_send_message_to_recipient:
  - text: Hi {PERSON}, I will send you a message
  utter_play_song_of_artist:
  - text: playing song of {PERSON}

Thank you for your advice, @Arjaan! Using spaCy with PERSON sounds great for the send_message intent and I will definitely try it. But I’m afraid that I can’t use it for the play_artist intent, since the artist’s name doesn’t have to be a person’s name (e.g. Queen, Rage Against the Machine etc.).

Hi @lukasch,

That is a great point.

You can use both spaCy and annotated sentences for DIETClassifier. This works because multiple entity extractors can run at the same time.

I see 2 possible solutions:

Option 1: use only the entity/slot PERSON

Just use the entity/slot PERSON for both real person names and band names. You can rely on spaCy to extract real person names, and you must annotate sentences for band names, like:

## intent:play_artist
- play [Rage against the machine](PERSON)
- etc..

Option 2: use an entity/slot PERSON and an entity/slot artist

You must annotate sentences for band names, like:

## intent:play_artist
- play [Rage against the machine](artist)
- etc..

You can still rely on spaCy to extract PERSON entities as well, and use a custom action to map the PERSON entity onto the artist slot.
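
The mapping step in Option 2 could look roughly like the sketch below. It shows only the selection logic, written as a plain function over the entity list in the shape Rasa NLU returns (dictionaries with `entity` and `value` keys). The name `pick_artist` is hypothetical, and in a real custom action the result would be written to the artist slot.

```python
def pick_artist(entities):
    """Return the artist name from the extracted entities, or None.

    Prefers an annotated `artist` entity and falls back to spaCy's `PERSON`.
    """
    for wanted in ("artist", "PERSON"):
        for ent in entities:
            if ent.get("entity") == wanted:
                return ent.get("value")
    return None
```

Checking `artist` first means an explicitly annotated band name wins when both extractors fire on the same message.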

@lukasch,

A great third option would be to use a form to handle the play_artist intent.

What if your user asks your bot “can you play some music?” The bot then first needs to ask which artist to play, and maybe also which song or album. And, for Rage Against the Machine, at what volume!

Forms are designed to handle this type of dialogue, where you need to collect information from the user.

Also, forms have a nice built-in mechanism for mapping different entity types onto a slot.

Thanks a lot for your suggestions, @Arjaan! It seems that Option 2 will fit my situation best.

I was also thinking of using a form, but I first tried to make the bot more “clever”, so that it recognizes the artist from the initial user input. A form is still a great fallback when no entity is extracted, e.g. when the user asks “can you play some music?”, as you mentioned.

One last question, just out of curiosity. If the spaCy entity PERSON were used to annotate bands in the play_artist intent:

## intent:play_artist
- play [Rage against the machine](PERSON)

then it would probably also extract Rage against the machine as a PERSON entity from the user input “send message to Rage against the machine”, right?

Hi @lukasch,

Yes, it would always be extracted, no matter what the intent is.
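
Since extraction is independent of the predicted intent, any per-intent restriction has to be applied after parsing. Below is a minimal sketch of such a post-processing step. It assumes the parse result carries the intent under `intent.name` and the entities as a list of dictionaries with an `entity` key; `ALLOWED_ENTITIES` and `filter_entities` are hypothetical names and not part of Rasa.

```python
# Hypothetical mapping from intent name to the entity names it may carry.
ALLOWED_ENTITIES = {
    "send_message": {"recipient"},
    "play_artist": {"artist"},
}


def filter_entities(parse_result, allowed=ALLOWED_ENTITIES):
    """Drop entities that are not expected for the predicted intent."""
    intent = (parse_result.get("intent") or {}).get("name")
    entities = parse_result.get("entities", [])
    permitted = allowed.get(intent)
    if permitted is None:
        # Intent not listed in the mapping: leave the entities untouched.
        return list(entities)
    return [ent for ent in entities if ent.get("entity") in permitted]
```

Note that this only discards an unexpected entity; it does not relabel it. If the value is still needed, a form or a mapping step would have to recover it.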