Entity recognition problem

Hi All -

I am using rasa-nlu for intent classification and entity extraction. I have trained a model with more than 50 examples; however, not all entities are being picked up. I am testing with text similar to the example below, but the model only picks up the email ID, not the customer number. Training example:

{
  "text": "Hi Team, Email address : lee.fairweather@XXXXXX.com Customer # XXXXX34954 Account # XXXX27523 Account Name: lee fairweather Issue: Customer is unable to login to My Portal Screenshot link: Regards, Jaffer S",
  "intent": "liferay",
  "entities": [
    { "start": 123, "end": 133, "value": "XXXXX34954", "entity": "CUSTOMER_NUMBER" },
    { "start": 81, "end": 111, "value": "lee.fairweather@XXXXXX.com", "entity": "FAILED_EMAIL" }
  ]
}
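One thing that may be worth ruling out with an example like this is whether each entity's start and end offsets really slice the stated value out of the text, since misaligned offsets are a common reason an entity is not learned. The snippet below is only a minimal sketch in plain Python: the helper name `find_misaligned_entities` and the sample data are hypothetical and not part of rasa-nlu or of the data in this post.

```python
def find_misaligned_entities(example):
    """Return (entity, expected value, actual slice) for every entity whose
    text[start:end] does not equal its declared value."""
    text = example["text"]
    problems = []
    for ent in example.get("entities", []):
        actual = text[ent["start"]:ent["end"]]
        if actual != ent["value"]:
            problems.append((ent["entity"], ent["value"], actual))
    return problems


# Hypothetical training example (made-up customer number, not from the post).
sample = {
    "text": "Customer # 1234567890 cannot log in",
    "intent": "liferay",
    "entities": [
        {"start": 11, "end": 21, "value": "1234567890", "entity": "CUSTOMER_NUMBER"},
    ],
}

# An empty list means every offset lines up with its value.
print(find_misaligned_entities(sample))
```

Running the same check over every example in the training file would show whether any CUSTOMER_NUMBER annotation is off by a few characters.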

Text I am using for testing:

Hi Team, Email Adress : jaylaca@XXXXXX.com Customer # XXXX861753 Account # XXXX23985 Issue: Customer is unable to login to My Portal Regards, Jaffer S

Below is my config:

trainer = Trainer(RasaNLUModelConfig({
    "language": "en",
    "pipeline": "tensorflow_embedding",
    "path": model_dir,
    "data": data
}))

Please help.

Have you tried using a regex to pick up on the customer number? We’ve been discussing it a bit here in this post: Similar Entity Extraction. I think this may be helpful to you.

I have tried with regex as well, below is the regex I am using:

"regex_features": [
  { "name": "CUSTOMER_NUMBER", "pattern": "[0-9]{10}" }
]
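A quick way to sanity-check this pattern outside of Rasa is to run it against sample text with Python's `re` module. The sketch below assumes the real customer number is ten consecutive digits; the unmasked number is made up, and the helper name `find_customer_numbers` is hypothetical. The masked strings shown in this thread contain X characters, which `[0-9]{10}` does not match.

```python
import re

# Same pattern as in the regex_features entry above.
CUSTOMER_NUMBER = re.compile(r"[0-9]{10}")


def find_customer_numbers(text):
    """Return every run of ten consecutive digits found in the text."""
    return CUSTOMER_NUMBER.findall(text)


# Made-up unmasked text: a ten-digit customer number, a nine-digit account number.
unmasked = "Customer # 1234861753 Account # 123423985"
# Masked text in the style used in this thread.
masked = "Customer # XXXX861753 Account # XXXX23985"

print(find_customer_numbers(unmasked))
print(find_customer_numbers(masked))
```

Note that without word boundaries the pattern would also match the first ten digits of a longer number, so something like `\b[0-9]{10}\b` may be a safer choice if longer digit runs occur in the data.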

It is still not picking up the customer number. I have also changed my pipeline to the following:

pipeline:
  - name: "nlp_spacy"
  - name: "tokenizer_whitespace"
  - name: "intent_entity_featurizer_regex"
  - name: "intent_featurizer_spacy"
  - name: "ner_crf"
  - name: "ner_synonyms"
  - name: "intent_classifier_sklearn"
path: "./auto/models/nlu"
data: "./auto/data/data.json"
language: "en"

Have you tried using ner_duckling_http?
