I have a slot called city, which could take on over 100 values. My goal is for the chatbot to take an input like “What is the weather in Phoenix” and output the weather in Phoenix. From what I understand, to cover 100+ cities like Phoenix, I would have to create an intent for each city. Each intent would contain a line like “What is the weather in slot_value?” (replacing slot_value with a city name, e.g. Dallas, Los Angeles, New York, etc.). As you can see, the trigger words are the same, and I don’t want to repeat the same trigger phrase across 100+ cities. Additionally, I would have to create 100+ stories, each containing the unique intent for the input city name, even though my action is the same regardless of the city. Is my understanding correct, or is there a simpler way so that I don’t have to define 100+ stories and 100+ intents full of redundant information? Perhaps a way to define all the city values in one place and write the intent phrases and stories only once? I appreciate any help, and thanks.
This blog post might help you.
Once you have a well-defined lookup table, you won’t need 100+ examples: the model can learn from a smaller sample set (~20 unique examples is ideal, but it can be less) and use the lookup table for reference.
I believe the only case where you’d need to have each value mapped is with synonyms, described here.
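As a rough illustration of the lookup-table idea, here is a minimal Python sketch that turns a list of city names into an inline lookup-table section. It assumes the Markdown training-data format used elsewhere in this thread, where a lookup table is written as a `## lookup:<entity>` section; the helper name `render_lookup_section` and the sample city list are made up for this example, so check the exact syntax against the docs for your Rasa version.

```python
from typing import Iterable


def render_lookup_section(entity: str, values: Iterable[str]) -> str:
    """Render an inline lookup table for one entity (assumed Markdown format)."""
    seen = set()
    lines = [f"## lookup:{entity}"]
    for value in values:
        name = value.strip()
        # Skip blank entries and case-insensitive duplicates, keeping input order.
        if name and name.lower() not in seen:
            seen.add(name.lower())
            lines.append(f"- {name}")
    return "\n".join(lines) + "\n"


# Hypothetical city list; in practice this would hold all 100+ names.
cities = ["Phoenix", "Dallas", "Los Angeles", "New York", "Dallas"]
print(render_lookup_section("city", cities))
```

The printed section could be pasted into the NLU file next to the handful of annotated examples, so the full city list lives in one place instead of being repeated in every sentence.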
You only need one intent, e.g. get_weather. The city name must be an entity/slot. Based on the slot value you can output the weather; that’s the whole point. The intent stays the same, and what you output depends on the extracted slot.
Your nlu file must contain a few examples like this:
- What is the weather in [Phoenix](city)
- I would like to know the weather in [Chicago](city)
- Weather in [Boston](city)
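To make the `[value](entity)` annotation concrete, the following sketch shows what each annotated line carries: the plain sentence plus the entity span inside it. This is a simplified stand-in written for illustration, not Rasa's own parser, and the function name `parse_example` is hypothetical.

```python
import re

# Matches the inline annotation syntax [value](entity).
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_example(line: str):
    """Return (plain_text, entities) for one annotated training example."""
    pieces = []    # fragments of the de-annotated sentence
    entities = []  # one dict per annotated span
    cursor = 0
    for match in ANNOTATION.finditer(line):
        pieces.append(line[cursor:match.start()])
        # Character offset of the value within the plain text built so far.
        start = sum(len(p) for p in pieces)
        value, entity = match.group(1), match.group(2)
        pieces.append(value)
        entities.append(
            {"entity": entity, "value": value, "start": start, "end": start + len(value)}
        )
        cursor = match.end()
    pieces.append(line[cursor:])
    return "".join(pieces), entities


text, found = parse_example("What is the weather in [Phoenix](city)")
print(text)   # What is the weather in Phoenix
print(found)  # [{'entity': 'city', 'value': 'Phoenix', 'start': 23, 'end': 30}]
```

All three examples map to the same intent; only the value of the city entity differs, which is why one intent is enough.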
Keep in mind that with over 100 possible values for the slot, it is tempting to generate the sample sentences automatically by substituting each entity value line by line. I would recommend a lookup table over automatically generated sentences, since models tend to overfit on the latter.
I made a recent experiment with ~2500 sentences covering ~25 entities of type “project” and the evaluation didn’t look that good.
Just to let you know.
Then for the stories, I would have to define 100+ stories, each of which fills a slot for one extracted entity. This still seems like a lot of automatic generation that could lead to overfitting on the data.
Thanks for letting me know. I will definitely try a lookup table. It sounds like a more viable option than automatically generated content.
Thank you for the references.
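You don’t need 100+ stories. Use spaCy’s built-in entity extraction (in the standard English models, cities are usually tagged GPE or LOC rather than PLACE) and map the extracted entity to a slot. You can then trigger a custom action and do what you want.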
I still don’t get why you need so many stories?
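The reason one story is enough is that the action only reads whatever city ended up in the slot. Here is a minimal sketch of that logic in plain Python, deliberately kept free of the Rasa SDK so it runs on its own; `weather_reply`, `fake_fetch`, and the canned data are hypothetical names invented for this example, and a real custom action would call an actual weather service instead.

```python
from typing import Callable, Dict, Optional


def weather_reply(
    slots: Dict[str, Optional[str]],
    fetch_weather: Callable[[str], str],
) -> str:
    """Build the bot's answer from whatever city was extracted into the slot."""
    city = slots.get("city")
    if not city:
        # No city was extracted, so ask for it instead of guessing.
        return "Which city would you like the weather for?"
    return f"The weather in {city} is {fetch_weather(city)}."


def fake_fetch(city: str) -> str:
    """Stand-in for a real weather API call (assumption for this sketch)."""
    canned = {"Phoenix": "sunny", "Boston": "rainy"}
    return canned.get(city, "unknown")


print(weather_reply({"city": "Phoenix"}, fake_fetch))
print(weather_reply({}, fake_fetch))
```

The same function handles every city, so the story only needs to say "get_weather intent, then run the weather action" once.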
You’re right. There’s no need for so many stories. Thanks for the help.