RegexEntityExtractor and DIETClassifier extracting same entity and not following regex rule?

Hello, I’m trying to extract a phone number pattern using RegexEntityExtractor, with the regex configuration added to the pipeline like this:

- name: RegexEntityExtractor
  case_sensitive: False
  use_lookup_tables: False
  use_regexes: True
  use_word_boundaries: True
- name: DIETClassifier
  epochs: 55

This is my NLU training data:

    - regex: regex_phone_number
      examples: |
        - \(?[1-9]{2}\)? ?(?:[2-8]|[9]{0,1}[5-9]{1})[0-9]{3}\-?[0-9]{4}

    - intent: phone_number
      examples: |
        - meu número é [8973542665](regex_phone_number)
        - [61992852776](regex_phone_number)

    - intent: invalid_phone_number
      examples: |
        - 123
        - 00000
        - 111111
        - asdhja
        - aaaaaa
        - telefone123
        - 111111111111111111
        - ldhuahsduashd

What I’m trying to do is extract phone numbers according to a regex pattern, which is defined in the NLU data along with examples. If a number matches this pattern, it should follow one path. Otherwise, it should receive the “invalid_phone_number” intent. But when I train and run my project, numbers that do not match this pattern are still extracted by both extractors:

rasa.core.processor - Received user message '00000' with intent '{'id': 5831918261946756680, 'name': 'invalid_phone_number', 'confidence': 0.27582165598869324}' and entities '[{'entity': 'regex_phone_number', 'start': 0, 'end': 5, 'confidence_entity': 0.45642849802970886, 'value': '00000', 'extractor': 'DIETClassifier'}]'
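Note that the `extractor` field in this log entry is `DIETClassifier`, so this particular entity was not produced by the RegexEntityExtractor. A minimal sketch for checking which component produced each entity, assuming entity dicts shaped like the one in the log; `entities_from` is a hypothetical helper, not part of Rasa:

```python
from typing import Any, Dict, List

# Entity list copied from the log line above.
ENTITIES: List[Dict[str, Any]] = [
    {
        "entity": "regex_phone_number",
        "start": 0,
        "end": 5,
        "confidence_entity": 0.45642849802970886,
        "value": "00000",
        "extractor": "DIETClassifier",
    }
]


def entities_from(entities: List[Dict[str, Any]], extractor: str) -> List[Dict[str, Any]]:
    """Keep only the entities whose 'extractor' field equals `extractor`."""
    return [e for e in entities if e.get("extractor") == extractor]


if __name__ == "__main__":
    for name in ("RegexEntityExtractor", "DIETClassifier"):
        values = [e["value"] for e in entities_from(ENTITIES, name)]
        print(name, "->", values)
```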

How can I extract only numbers that follow this pattern, so that it won’t accept numbers like “0000”? I already tested this regex and it looks fine.
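One way to check the pattern outside Rasa is to run it against the training examples with Python's `re` module. This is only a sketch: it uses plain `re.finditer` on the pattern copied from the NLU data, and it is an assumption that this approximates how the RegexEntityExtractor applies the pattern (word-boundary handling, for instance, is not reproduced here). `find_phone_numbers` is a hypothetical helper name.

```python
import re
from typing import List

# Pattern copied verbatim from the regex_phone_number entry in the NLU data.
PHONE_PATTERN = re.compile(
    r"\(?[1-9]{2}\)? ?(?:[2-8]|[9]{0,1}[5-9]{1})[0-9]{3}\-?[0-9]{4}"
)


def find_phone_numbers(text: str) -> List[str]:
    """Return every substring of `text` that matches PHONE_PATTERN."""
    return [match.group(0) for match in PHONE_PATTERN.finditer(text)]


if __name__ == "__main__":
    # Samples taken from the phone_number and invalid_phone_number intents.
    samples = [
        "meu número é 8973542665",
        "61992852776",
        "123",
        "00000",
        "111111",
        "telefone123",
        "111111111111111111",
    ]
    for sample in samples:
        print(repr(sample), "->", find_phone_numbers(sample))
```

If the pattern rejects an input here but the bot still reports an entity for it, the `extractor` field of that entity shows which pipeline component produced it.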

@shaysi Have you checked these?

  1. NLU Training Data
  2. Components
  3. https://rasa.com/docs/rasa/components#regexentityextractor

I would encourage you to read the documentation on regexes and regular expressions.

I hope this video solves your problem: https://youtu.be/tMR1PNe0JB4

Yes, I’ve been working through this material. I can extract the correct examples, but at the same time the extractor is accepting numbers that do not match the regex pattern I provided and tested.

@shaysi have you watched and tried the video tutorial?

Check your regex at https://regexr.com

Can you share a screenshot?

Yes, I just watched it. I’m doing the same as he does, and I found my error: it was in my regex pattern. Thanks for your help :slight_smile:

@shaysi can you close this thread and post the solution for others? Thanks, and it’s good that your query is solved! If you have any other issues, do let us know!