Training an NER model using a TensorFlow pipeline

I am trying to train an NER model using RASA, which uses a TensorFlow pipeline. My use case is extracting key entities from US driving licenses. For example, when a user uploads a DL image, I get the OCR text from the image. I then push this raw text to my NER model, which should return only the key entities from it, such as DLNumber, DOB, DOE, Address, etc.

Can someone suggest how to make my NER model more robust, so that it returns all the entities for any US driving license? I am facing an issue where it returns the same entity name more than once, attached to the wrong values. For example, dl_issue is returned twice in the response below.

{
  "intent": { "confidence": 0.9285378456115723, "name": "DL" },
  "project": "default",
  "entities": [
    {
      "start": 48,
      "confidence": 0.9092500118874686,
      "entity": "dl_number",
      "extractor": "ner_crf",
      "end": 57,
      "value": "999999999"
    },
    {
      "start": 63,
      "confidence": 0.9973167577546155,
      "entity": "dl_dob",
      "extractor": "ner_crf",
      "end": 73,
      "value": "04-27-1970"
    },
    {
      "start": 146,
      "confidence": 0.8391978961972846,
      "entity": "dl_postal_code",
      "extractor": "ner_crf",
      "end": 151,
      "value": "72203"
    },
    {
      "start": 168,
      "confidence": 0.4135534135801666,
      "entity": "dl_issue",
      "extractor": "ner_crf",
      "end": 178,
      "value": "04-27-2010"
    },
    {
      "start": 179,
      "confidence": 0.5787973704711238,
      "entity": "dl_issue",
      "extractor": "ner_crf",
      "end": 189,
      "value": "04-27-2014"
    }
  ],
  "model": "model_20190129-114027",
  "intent_ranking": [
    { "confidence": 0.9285378456115723, "name": "DL" },
    { "confidence": 0.0, "name": "AOI" },
    { "confidence": 0.0, "name": "W9" },
    { "confidence": 0.0, "name": "Passport" }
  ],
  "text": "ARKANSAS Natural State DRIVER'S LICENSE DL DLN: 999999999 DOB: 04-27-1970 CLASS: D dushin glamcole Susan Sample SAMPLE 123 Easy S: Little Rock AR 72203 issued Expirese 04-27-2010 04-27-2014 Height Eyes: 5-8 BR Endors Restri B ORGAN DONOR"
}
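Until the model itself is fixed, the duplicate case can at least be detected downstream. The following is only a rough sketch: it groups the entities in a response by name and reports any label that occurs more than once. The helper name `find_duplicate_entities` is hypothetical (it is not part of RASA), and the code assumes nothing beyond the response shape shown above.

```python
from collections import defaultdict


def find_duplicate_entities(response):
    """Return {entity_name: [entities]} for every name that appears more than once.

    Hypothetical helper, not a RASA API; it only reads the "entities" list
    of a parsed response dict.
    """
    grouped = defaultdict(list)
    for ent in response.get("entities", []):
        grouped[ent["entity"]].append(ent)
    return {name: ents for name, ents in grouped.items() if len(ents) > 1}


# Trimmed copy of the response above (only the fields the helper needs).
response = {
    "entities": [
        {"entity": "dl_dob", "value": "04-27-1970", "confidence": 0.9973167577546155},
        {"entity": "dl_issue", "value": "04-27-2010", "confidence": 0.4135534135801666},
        {"entity": "dl_issue", "value": "04-27-2014", "confidence": 0.5787973704711238},
    ]
}

# Print each duplicated label with its candidate values and confidences,
# so the ambiguous documents can be reviewed or routed to a fallback rule.
for name, ents in find_duplicate_entities(response).items():
    print(name, [(e["value"], e["confidence"]) for e in ents])
```

This does not make the model more robust by itself; it only flags responses where a label such as dl_issue needs a second look.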

Hello, I'm working on the same kind of project as you. What did you use for OCR extraction?