Has anyone successfully implemented strict regex patterns for entity extraction?

Hi everyone, I see a lot of posts from people struggling to use regular expressions for entity extraction. I am trying to use a regex to recognize customer ID numbers. However, the model only recognizes IDs that are very similar to the ones in my NLU training data. This is my pipeline:

  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: CRFEntityExtractor
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
    entity_recognition: False
  - name: RegexEntityExtractor
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true

And here is a sample of my training data (over 50 entries in total):

- intent: id_search
  examples: |
    - search with id [LA5678123](id_number)
    - [AN2915998](id_number)
    - Retrieve records based on the id [823RQ042345](id_number)
- regex: id_number
  examples: |
    - ^[A-Z\d]{9,12}$

If anyone has any input on how to get the model to extract the ID number whenever it matches the regex, that would be extremely helpful.

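Try: \b\w{9,12}\b

If the pattern is applied to the full message text, the ^ and $ anchors in your version only allow a match when the ID is the entire message. Word boundaries (\b) let the ID match anywhere in a sentence.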
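
You can compare the patterns outside Rasa with Python's re module. This is only a sketch: it assumes the extractor searches the whole message text with the pattern, find_ids is a hypothetical helper, and STRICT is a variant I made up for illustration (your original character class with word boundaries), not something confirmed for your setup. The last message is an invented example.

```python
import re

ANCHORED = r"^[A-Z\d]{9,12}$"    # pattern from the question
BOUNDED = r"\b\w{9,12}\b"        # pattern suggested in the reply
STRICT = r"\b[A-Z\d]{9,12}\b"    # assumption: original character class, word boundaries


def find_ids(pattern: str, text: str) -> list[str]:
    """Return every non-overlapping match of pattern in text (hypothetical helper)."""
    return [m.group(0) for m in re.finditer(pattern, text)]


messages = [
    "search with id LA5678123",
    "AN2915998",
    "Retrieve records based on the id 823RQ042345",
    "please search information for AN2915998",  # invented example
]

for msg in messages:
    print(msg)
    for name, pattern in [("anchored", ANCHORED), ("bounded", BOUNDED), ("strict", STRICT)]:
        print(f"  {name:9}", find_ids(pattern, msg))
```

The anchored pattern only finds the ID when it is the whole message. The \w version finds it inside a sentence, but it also picks up any ordinary word of 9 to 12 characters (such as "information"), since \w covers lowercase letters and underscores too. If that matters, keeping the [A-Z\d] class between word boundaries is stricter. As far as I know, RegexEntityExtractor has a case_sensitive option that defaults to false, so you may need to turn it on for the uppercase-only class to make a difference.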