After reading the blog post about improving entity recognition via lookup tables and fuzzy matching / character n-grams, I am a bit confused about how to actually use these positive and negative influencer n-grams.
Based on my understanding, we can use the documented example scripts to generate these n-grams from our lookup tables, and then add them as separate lookup tables. My concerns are:
How would the training data change based on these n-grams? Would the training data need annotations marking the n-grams?
Even if an n-gram is matched, how is it linked to the actual entity we wanted to match in the first place? For example, if the matched n-gram is “inc”, how does that lead to matching “apple inc”?
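For reference, here is my current understanding of the n-gram generation step as a hypothetical sketch (the function name and the sample entries are my own, not from the blog):

```python
# Hypothetical sketch: take each entry from a lookup table and emit its
# character n-grams, which would then go into a separate lookup table.
def char_ngrams(text, n=3):
    """Return all character n-grams of length n from text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

company_entries = ["apple inc", "apple computers"]  # example lookup entries
ngrams = set()
for entry in company_entries:
    ngrams.update(char_ngrams(entry, n=3))

# "inc" would appear among the generated n-grams for "apple inc"
print(sorted(ngrams))
```

Is this roughly what the scripts produce, and if so, how do the resulting n-grams feed back into training?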