Pattern for lookup tables

Hi @lgabs, if you have the RegexFeaturizer in your pipeline and the assistant trains without any errors, the lookup tables are being used. We don’t log when they are used successfully; we only log if something went wrong.

Regarding the usage: As pointed out in our documentation, you need to include some examples in your training data that use an entry from the lookup table. The entries in the lookup tables are converted into features. Let’s take a look at an example:

“Hello my name is Tanja.”

Assume you have a lookup table for names that includes the word “Tanja”. The RegexFeaturizer checks whether it can find an entry from the lookup table in the text. If it finds a match, it sets a feature that basically says “Tanja” is listed in a lookup table. The features are then passed on to the model, for example DIETClassifier, and the model then hopefully learns the correlation between “Tanja” being listed in the lookup table and “Tanja” being an entity of type name. If you don’t include any examples in the training data that contain an entry from the lookup table, the model cannot learn this correlation, because it never appears in the data. So it is important that you include some entries from your lookup table in the training data and mark them as entities.
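To make the idea more concrete, here is a minimal Python sketch of how lookup table entries could be turned into a per-token feature. This is not Rasa's actual RegexFeaturizer code; the function names `build_lookup_pattern` and `lookup_features`, the whitespace tokenization, and the case-insensitive matching are all assumptions made for illustration only.

```python
import re


def build_lookup_pattern(entries):
    """Combine lookup table entries into one regex (hypothetical helper)."""
    # Put longer entries first so they are preferred over shorter ones
    # that share the same prefix.
    ordered = sorted(entries, key=len, reverse=True)
    alternatives = "|".join(re.escape(entry) for entry in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def lookup_features(text, pattern):
    """Return (token, feature) pairs; feature is 1 if the token overlaps a match."""
    # Simple whitespace tokenization with character offsets (an assumption;
    # a real pipeline would use its configured tokenizer).
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    matches = [m.span() for m in pattern.finditer(text)]
    features = []
    for token, start, end in tokens:
        hit = any(start < m_end and m_start < end for m_start, m_end in matches)
        features.append((token, 1 if hit else 0))
    return features


names = ["Tanja", "Maria", "John"]  # illustrative lookup table for names
pattern = build_lookup_pattern(names)
features = lookup_features("Hello my name is Tanja.", pattern)
print(features)
```

The feature only says "this token matched the lookup table". It does not say "this token is a name", which is why the model still needs annotated training examples to connect the two.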
