I want to use a lookup table to extract entities, but I cannot get it working. Only the entities that appear in my training data are extracted, not those that are only available in the lookup table. Here is a simple example I created to illustrate the issue. This is the content of nlu.md (with the lookup table):
## intent:greet
- Hello
- Hi
- Good afternoon
## intent:inform_user_name
- My name is [Mike](user_name)
- I'm [Sarah](user_name)
- Call me [Bob](user_name)
## lookup:user_name
data/lookup_tables/user_names.txt
This is the content of user_names.txt:
Robert
Arnold
Stephanie
Caroline
Mike
Sarah
Bob
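To be explicit about what I expect from the lookup table, here is a minimal sketch in plain Python. It assumes the entries are combined into one case-insensitive pattern with word boundaries. `build_lookup_pattern` and `find_lookup_matches` are hypothetical helper names of mine, not part of Rasa, and this is not necessarily how Rasa handles the table internally.

```python
import re
from typing import List

# Entries copied from user_names.txt above.
USER_NAMES = ["Robert", "Arnold", "Stephanie", "Caroline", "Mike", "Sarah", "Bob"]


def build_lookup_pattern(entries: List[str]) -> "re.Pattern":
    """Hypothetical helper: combine lookup entries into one regex."""
    # Escape each entry and wrap it in word boundaries so that
    # "Bob" does not match inside "Bobby".
    alternatives = [r"\b" + re.escape(entry) + r"\b" for entry in entries]
    return re.compile("|".join(alternatives), flags=re.IGNORECASE)


def find_lookup_matches(text: str, entries: List[str]) -> List[str]:
    """Hypothetical helper: return every lookup entry found in the text."""
    pattern = build_lookup_pattern(entries)
    return [match.group(0) for match in pattern.finditer(text)]


print(find_lookup_matches("My name is Arnold", USER_NAMES))  # → ['Arnold']
```

With this kind of matching, "Arnold" is found in the message even though it never appears in an annotated training example, which is the behaviour I was hoping to get from the bot.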
The bot has a simple UserNameForm to extract the name of the user:
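from typing import Any, Dict, List, Text

from rasa_sdk import Tracker
from rasa_sdk.events import EventType
from rasa_sdk.executor import CollectingDispatcher
from rasa_sdk.forms import FormAction

class UserNameForm(FormAction):
    def name(self) -> Text:
        return "name_form"

    @staticmethod
    def required_slots(tracker: Tracker) -> List[Text]:
        # The form is complete once the user_name slot is filled.
        return ["user_name"]

    def submit(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[EventType]:
        # Nothing to do on submit; the story utters the greeting afterwards.
        return []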
This is my stories.md:
## happy path
* greet
    - name_form
    - form{"name": "name_form"}
    - form{"name": null}
    - utter_greet_user
domain.yml looks like this:
intents:
  - greet
entities:
  - user_name
slots:
  user_name:
    type: unfeaturized
    auto_fill: false
forms:
  - name_form
responses:
  utter_greet_user:
    - text: "Hey {user_name}! How are you?"
  utter_ask_user_name:
    - text: "What is your name?"
session_config:
  session_expiration_time: 60
  carry_over_slots_to_new_session: true
And finally this is config.yml:
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
  - name: FormPolicy
If you run it and type “My name is Mike”, Mike is extracted without a problem because it appears in the training data. But when you type “My name is Arnold”, Arnold is not extracted: it does not appear in the training data, even though it is in the lookup table.
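To pin down which names are affected, here is a small sketch that compares the annotated training examples with the lookup table. It is plain Python with the data copied inline from the files above. The helper names `annotated_values` and `lookup_only` are hypothetical, and the regex only assumes the `[value](entity)` annotation style used in my nlu.md.

```python
import re
from typing import List, Set

# The annotated examples from nlu.md above, copied inline.
NLU_MD = """\
## intent:inform_user_name
- My name is [Mike](user_name)
- I'm [Sarah](user_name)
- Call me [Bob](user_name)
"""

# Entries copied from user_names.txt above.
LOOKUP = ["Robert", "Arnold", "Stephanie", "Caroline", "Mike", "Sarah", "Bob"]

# Matches annotations of the form [value](entity_name).
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def annotated_values(nlu_md: str, entity: str) -> Set[str]:
    """Hypothetical helper: values annotated with the given entity."""
    return {value for value, name in ANNOTATION.findall(nlu_md) if name == entity}


def lookup_only(entries: List[str], nlu_md: str, entity: str) -> List[str]:
    """Hypothetical helper: lookup entries never seen in training examples."""
    seen = annotated_values(nlu_md, entity)
    return [entry for entry in entries if entry not in seen]


print(lookup_only(LOOKUP, NLU_MD, "user_name"))
# → ['Robert', 'Arnold', 'Stephanie', 'Caroline']
```

These four names exist only in the lookup table. Arnold is the one I tested and it is not extracted, so I would expect the other three to behave the same way.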
Can anyone please tell me what I did wrong? I suspect that I might need to change my configuration, maybe by adding different pipeline components, but I’m not sure.