Pattern for lookup tables


It looks like I have the same problem.

Using the RegexFeaturizer with DIETClassifier gives really bad results for my entities in lookup tables. Even though I have a lot of training sentences containing examples that are in the lookup tables (more than 200), the entities that do not appear in the training sentences are mostly not recognized.
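To make the issue concrete, here is a rough sketch of how I understand lookup tables to be used: the entries are turned into one regex alternation, and each token only gets a "matched / not matched" flag as an extra feature. The extractor still has to learn from the training data how much to trust that flag, which may be why unseen entries are missed. This is not Rasa's actual code; the function names `lookup_to_pattern` and `token_match_flags`, the whitespace tokenization, and the case-insensitive matching are all my own assumptions for illustration.

```python
import re


def lookup_to_pattern(entries):
    """Build one alternation pattern from lookup table entries (hypothetical helper)."""
    # Escape each entry so characters like "." or "+" are matched literally.
    escaped = [re.escape(entry) for entry in entries]
    # Wrap every entry in word boundaries and join them into a single group.
    # Case-insensitive matching is an assumption made for this sketch.
    return re.compile(r"(\b" + r"\b|\b".join(escaped) + r"\b)", re.IGNORECASE)


def token_match_flags(text, pattern):
    """Return (token, matched) pairs for whitespace tokens (hypothetical helper)."""
    # Whitespace tokens with their character offsets.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    # Character spans where any lookup entry matched.
    spans = [m.span() for m in pattern.finditer(text)]
    flags = []
    for token, start, end in tokens:
        # A token is flagged when it overlaps any matched span.
        matched = any(start < s_end and s_start < end for s_start, s_end in spans)
        flags.append((token, matched))
    return flags


if __name__ == "__main__":
    pattern = lookup_to_pattern(["paris", "new york"])
    for token, matched in token_match_flags("fly me to New York tomorrow", pattern):
        print(token, matched)
```

If this picture is right, the flag is just one more input feature, so whether an unseen lookup entry gets extracted depends on how strongly the model learned to rely on it during training.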

I opened a topic about that (here) but never got a relevant answer.

I can share with you 200 tests comparing RegexFeaturizer + CRFEntityExtractor with RegexFeaturizer + DIETClassifier.