Look up tables are not working on unseen data samples

The following is the pipeline I am using to train my NER model.

Configuration Pipeline:

language: "en"
pipeline:
  - name: "SpacyNLP"
  - name: "tokenizer_spacy"
  - name: "ner_crf"
  - name: "ner_synonyms"
  - name: "CRFEntityExtractor"
  - name: "intent_entity_featurizer_regex"

This is how I add my lookup table of technical skills, which is a text file containing the skill terms. Training data:

{
    "rasa_nlu_data": {
        "common_examples": [{[... ],
        "lookup_tables": [{
            "name": "technical_skills",
            "elements": "data/tech_skills_lookup/technical_skills.txt"
        }]

Regex patterns stored after training, in component_5_RegexFeaturizer.pkl:

[
    {
        "name": "technical_skills",
        "pattern": "(?i)(\\bpytorch\\b|\\br\\b|\\bmachine\\ learning\\ frameworks\\b|\\bsentiment\\ analysis\\b|\\bdata\\ structures\\b|\\bn\\-grams\\b|\\bpython\\b|\\btext\\ representation\\ techniques\\b|\\bjava\\b|\\bkeras\\b|\\bbag\\ of\\ words\\b|\\bsemantic\\ extraction\\ techniques\\b|\\bmodeling\\b|\\bbig\\ data\\b|\\bibm\\ cloud\\b|\\bamazon\\ alexa\\b|\\bmicrosoft\\b|\\bc\\#net\\b|\\bnodejs\\b|\\bgoogle\\ dialogflow\\b|\\bibms\\ watson\\ conversation\\ service\\b|\\bamazon\\b|\\bazure\\b|\\bpython\\b|\\b\\b|\\bscala\\b|\\bapache\\ lucene\\b|\\bapache\\ spark\\b|\\bnumpy\\b|\\bcorenlp\\b|\\bcomputer\\ science\\b|\\bapache\\ opennlp\\b|\\btextblob\\b|\\bmllib\\b|\\bspacy\\b|\\bscikit\\-learn\\b|\\bgensim\\b|\\bsolr\\b|\\bpandas\\b|\\bpython\\b|\\bnltk\\b|\\bscipy\\b|\\bglove\\b|\\bkeras\\ pytorch\\b|\\bmachine\\ learning\\b|\\btensorflow\\b|\\br\\b|\\bword2vec\\b|\\bmathematics\\b|\\bdata\\ cleaning\\b|\\bwrangling\\b|\\brnn\\b|\\bforecast\\ modeling\\b|\\bword\\ embedding\\b|\\btensorflow\\b|\\bkeras\\b|\\bsequence\\ modeling\\b|\\bcnn\\b|\\bfeature\\ engineering\\b|\\bnips\\b)"
    }
]

But I don’t know why the model is not able to recognize a term that is in the pattern. Example code and output:

interpreter.parse("nips")
{'intent': {'name': None, 'confidence': 0.0}, 'entities': [], 'text': 'nips'}

It seems the lookup tables are not working even though they are part of my model. Strange behavior!

Please help me out.

@Nomiluks sorry about the late reply. Lookup tables are used for entities, but it looks like your intent isn’t getting classified. You’ll need to add out-of-vocabulary handling to the intent classifier.

What kind of training examples do you have for the technical skills entities in your sentence examples?
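One thing that stands out in the pattern posted above is the empty alternative `\\b\\b`, which would be produced by a blank line in the lookup file. Here is a minimal sketch using only Python's `re` module. The two shortened patterns and the helper `first_match` are made up for illustration; they keep just three alternatives, in the same relative order as in the trained pattern. Whether this is what actually trips up the featurizer is an assumption, since that depends on how the component consumes the matches.

```python
import re

# Shortened stand-ins for the trained pattern (hypothetical, for illustration).
# The empty alternative \b\b comes before \bnips\b, as in the posted pattern.
WITH_EMPTY = r"(?i)(\bpython\b|\b\b|\bnips\b)"
WITHOUT_EMPTY = r"(?i)(\bpython\b|\bnips\b)"


def first_match(pattern, text):
    """Return the text of the first regex match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# Alternatives are tried left to right, so the empty alternative wins
# at the first word boundary and the match is an empty string.
print(repr(first_match(WITH_EMPTY, "nips")))     # ''

# Without the empty alternative, the term itself is matched.
print(repr(first_match(WITHOUT_EMPTY, "nips")))  # 'nips'
```

If the same thing happens with the full pattern, removing blank lines from technical_skills.txt and retraining would be a cheap thing to rule out.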

Thanks for replying. I am actually using lookup tables for entities, not for intents.

I have attached a sample file that I have created.

sample_data.json (2.7 KB)