Only handle specific entity values

Let’s say a user can ask my bot where a room is. There is a specific list of available rooms. The room a user is asking for should be recognized as an entity and saved as a slot if the room is available. If the room isn’t available, the bot should tell the user so.

How should I train the bot and what pipeline should I use?
Should the entity room be a lookup table and the slot type text? Or should it be a categorical slot with values of the lookup table? Should unavailable rooms in utterances in the nlu data also be marked as a room entity or only the values from the lookup table?

In the end I only want to have a room slot with a valid value (available room) and the bot should be able to react appropriately if the room isn’t available.

Hi @leondroidgeeks, this is something you could handle in a custom action after extracting the entity regardless of availability. You could use DIETClassifier to extract the room entities (if it is a custom entity), or, if they’re all numbers like room 123, you could let DucklingEntityExtractor handle it. If it is a custom entity, a lookup table or regex feature could certainly help in extracting the value reliably, but this wouldn’t have any relation to whether or not the room was available.

Then your first custom action after receiving a “where is room” intent could be one that checks the availability of the room based on the entity value. If the room is available, it returns a SlotSet event like [SlotSet("room_number",<extracted_number>)]; if the room is not available, it returns [SlotSet("room_number",None)].
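As a rough illustration of the check described above, here is a minimal sketch of the validation logic on its own, written as a plain function so it can be tried without a running bot. The room list, the function name `resolve_room`, and the case-insensitive matching are assumptions for the example, not something from this thread.

```python
from typing import Iterable, Optional

# Hypothetical list of rooms that exist in the building.
VALID_ROOMS = {"Berlin", "Paris", "Tokyo"}


def resolve_room(
    extracted: Optional[str], valid_rooms: Iterable[str] = VALID_ROOMS
) -> Optional[str]:
    """Return the canonical room name if the extracted value is a known room, else None."""
    if not extracted:
        # Nothing was extracted, so there is nothing to validate.
        return None
    # Compare case-insensitively (an assumption), but return the canonical spelling.
    canonical = {room.casefold(): room for room in valid_rooms}
    return canonical.get(extracted.strip().casefold())


# Inside the custom action's run() method, the result would be wrapped in the
# SlotSet event mentioned above, roughly like this (assuming the entity is
# called "room"):
#   value = next(tracker.get_latest_entity_values("room"), None)
#   return [SlotSet("room_number", resolve_room(value))]
```

With this, "berlin" resolves to "Berlin" and "gibberish" resolves to None, so the slot ends up either set to a real room or unset.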

Whether this is a categorical or Text slot depends on what you want to do with it from there; it sounds like making it a Text slot would make sense, since then it would only matter whether the slot was set or not, not what value (i.e. room number) it is.

Hi Melina, thank you for your answer! By “available” I meant whether the room exists in that building, not whether the room is available for booking etc.
So a lookup table is not enough to recognize only specific room entities? I still need to check in a custom action whether that room exists?

For example:

I want this sentence to trigger the following story:
“Where is room Berlin”

    * find_room{"room": "Berlin"}
        - slot{"room": "Berlin"}
        - utter_show_room

⟹ utter_show_room: “The room Berlin is here”
Whereas the following sentence with a nonexistent room (= entity not recognized) should trigger a “Room not available” response:
“Where is room gibberish”

    * find_room
        - utter_no_room

It should not trigger this story:

    * find_room{"room": "gibberish"}
        - slot{"room": "gibberish"}
        - utter_show_room

⟹ utter_show_room: “The room gibberish is here”

The principle here is that NLU is for extracting what the user meant - and they meant that Gibberish was a room. Here’s some explanation:

If your training data does not contain any examples of some room, and it isn’t in your lookup table, then it’s not that likely that it will be extracted as an entity.

However, entity extraction relies on both the context and the word itself. So a word like “Gibberish” that appears in exactly the spot where a legitimate room name usually appears is more likely to be extracted as an entity.

In addition, if you have multiple buildings with different sets of valid rooms, the training data for each would be very similar, except the values of the room names. In this case it would be very unreliable to use entity extraction to differentiate, since the contexts would be identical.

Even if you had only one building, and only one set of valid rooms, it would be much more reliable to check programmatically whether a room is present than to rely on entity extraction alone. Then your stories would look like this:

    * find_room{"room": "Berlin"}
        - slot{"room": "Berlin"}
        - action_check_room <!-- the action checks the room and sets the "valid_room" slot because the room is valid -->
        - slot{"valid_room": "Berlin"}
        - utter_show_room

    * find_room <!-- no room value was extracted at all -->
        - utter_no_room <!-- this can stay the same, since no slot is set at all -->

    * find_room{"room": "gibberish"} <!-- an invalid room value was extracted -->
        - slot{"room": "gibberish"}
        - action_check_room <!-- the action checks the room and sets "valid_room" to None because the room is invalid -->
        - slot{"valid_room": null}
        - utter_no_room

Thank you very much for your advice, Melinda! I think I got it.

One last question:
In a sentence like “Where is room gibberish”, should I mark gibberish as entity room or should I delete this classification when going through my NLU Inbox? And do I even need a lookup table anymore?

You don’t need to annotate things that aren’t rooms at all, only legitimate ones. As to whether to keep some specific value when annotating, it’s a question of whether you want to encourage the model to predict it as an entity in the future.

The lookup table could still help your model predict entities since it is just providing an extra feature to the classifier that says, “A word in the lookup table for this entity was found here”.
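To make the “extra feature” idea concrete, here is a simplified sketch of what such a lookup flag could look like per token. This is only an illustration of the concept, not how Rasa implements it internally; the lookup entries, the function name `lookup_flags`, and the whitespace tokenization are assumptions.

```python
import re
from typing import List

# Hypothetical lookup table entries for the "room" entity.
ROOM_LOOKUP = ["Berlin", "Paris", "Tokyo"]


def lookup_flags(text: str, lookup: List[str] = ROOM_LOOKUP) -> List[int]:
    """Return one flag per whitespace token: 1 if it overlaps a lookup entry, else 0."""
    # Build one case-insensitive pattern that matches any lookup entry as a whole word.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(entry) for entry in lookup) + r")\b",
        re.IGNORECASE,
    )
    spans = [match.span() for match in pattern.finditer(text)]
    flags = []
    for token in re.finditer(r"\S+", text):
        # A token is flagged when its character range overlaps a lookup match.
        hit = any(token.start() < end and start < token.end() for start, end in spans)
        flags.append(int(hit))
    return flags


print(lookup_flags("Where is room Berlin"))
print(lookup_flags("Where is room gibberish"))
```

The flag is only a hint for the classifier. A word with no flag can still be extracted because of its context, which is why the check in the custom action remains necessary.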
