I am having trouble with lookup tables. I am using the Python (3.6.5) API. It seems as if the model is either not finding the lookup table files or not able to read their values. When I look into the model folder after training, there are no files corresponding to the lookup tables or any sign of the lookup table values. The model picks up entities that are included in the training data, but it does not pick up the entities that are only in the lookup tables.
My nlu_config.yml file contains the following configuration:
language: "en"
pipeline:
- name: "tokenizer_whitespace"
- name: "intent_featurizer_count_vectors"
- name: "intent_entity_featurizer_regex"
- name: "ner_crf"
  features: [
      ["low", "title", "upper"],
      ["bias", "low", "prefix5", "prefix2", "suffix5", "suffix3",
       "suffix2", "upper", "title", "digit", "pattern"],
      ["low", "title", "upper"]
    ]
- name: "intent_classifier_tensorflow_embedding"
I have a data folder that contains the data.md file and the lookup table text file (newline separated) with 21 different locations. I’ve included 8 of the 21 locations in the data.md file. I’ve referenced the lookup table in the data.md file with the following lines:
## lookup : location
data/location.txt
When I train the model with the following Python code (after installing the packages):
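from rasa_nlu import config
from rasa_nlu.model import Trainer
from rasa_nlu.training_data import load_data

training_data = load_data('data/data.md')
trainer = Trainer(config.load("nlu_config.yml"))
trainer.train(training_data)
model_directory = trainer.persist('projects/')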
the resulting model does not pick up on any of the location entities that are in the lookup table.
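To rule out a plain path or file-format problem, here is a minimal sketch of a sanity check that uses only the Python standard library. The helper names `read_lookup_entries` and `entries_missing_from_training` are hypothetical and not part of Rasa NLU, and the check says nothing about how Rasa itself loads the table; it only confirms that the file is readable from the working directory and which entries never occur in the training data.

```python
from pathlib import Path


def read_lookup_entries(path):
    """Return the non-empty, stripped lines of a newline-separated lookup file."""
    # Raises FileNotFoundError if the path does not resolve from the
    # current working directory, which is the first thing to rule out.
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def entries_missing_from_training(entries, training_text):
    """Return lookup entries that never appear (case-insensitively) in the training text."""
    haystack = training_text.lower()
    return [entry for entry in entries if entry.lower() not in haystack]
```

Called from the directory I train in, `read_lookup_entries('data/location.txt')` should return 21 entries, and passing those entries together with the contents of data/data.md to `entries_missing_from_training` should return the 13 locations that exist only in the lookup table.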
I seem to be out of ideas to make this work. Can anyone spot anything?