Detecting multiple regexes as separate entities

qamir · March 8, 2023, 10:31pm

Hi,

I have a few regexes in my Rasa application. It is able to pick up one of them as an entity, but not all of them. Here are the regexes and examples:

product-id  --- '\d{7}$'      1234567
category-id --- '[CA]\d{5}$'  CA12345
product-id  --- '[PD]\d{7}$'  PD1234567

Product-id is always picked up; however, if any combination of the others appears in an utterance, they're not detected at all.
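
For reference, here is a small check of the three patterns with plain Python `re`, outside Rasa. It is only a sketch: how `RegexEntityExtractor` applies the patterns internally is not verified here, and the `ADJUSTED` patterns are a hypothetical alternative, not a confirmed fix. It shows two things about the patterns as posted: `[CA]` and `[PD]` are character classes that match a single letter (so only `A12345` or `D1234567` is matched, not the full ID), and the trailing `$` only allows a match at the very end of the text.

```python
import re

# Patterns copied from the post above (keys are just labels for this sketch).
POSTED = {
    "digits": r"\d{7}$",
    "category": r"[CA]\d{5}$",
    "prefixed": r"[PD]\d{7}$",
}

# Hypothetical alternatives (an assumption, not confirmed for Rasa):
# literal prefixes instead of character classes, and word boundaries
# instead of the end-of-string anchor.
ADJUSTED = {
    "digits": r"\b\d{7}\b",
    "category": r"\bCA\d{5}\b",
    "prefixed": r"\bPD\d{7}\b",
}


def find_all(patterns, text):
    """Return {label: [matched substrings]} for every pattern in `patterns`."""
    return {label: re.findall(pattern, text) for label, pattern in patterns.items()}


if __name__ == "__main__":
    for text in ("CA12345", "PD1234567", "compare CA12345 with PD1234567 please"):
        print(text, "->", find_all(POSTED, text))

    text = "compare CA12345 with PD1234567 and 1234567"
    print(text, "->", find_all(ADJUSTED, text))
```

With the posted patterns, an utterance that continues after the ID produces no match at all, which would be consistent with the IDs going undetected in longer sentences.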

How can I get Rasa to pick up all of them?

Thanks

NikolaMr · March 9, 2023, 9:32am

Hi @qamir ,

Could you share your config and NLU examples that include these entities?

I’d need some info on what you’re using to be able to help you out.

Regards, Nikola

qamir · March 9, 2023, 11:34am

Here you go:

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en

pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: RegexEntityExtractor
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1
    
  - name: "SpacyNLP"
    # language model to load
    model: "en_core_web_md"

    # when retrieving word vectors, this will decide if the casing
    # of the word is relevant. E.g. `hello` and `Hello` will
    # retrieve the same vector, if set to `False`. For some
    # applications and models it makes sense to differentiate
    # between these two words, therefore setting this to `True`.
    case_sensitive: False
  - name: "SpacyEntityExtractor"
  
    # dimensions to extract
    dimensions: ["PERSON", "LOC"]  
    
  - name: "DucklingEntityExtractor"
    url: "http://127.0.0.1



    dimensions: ["time", "amount-of-money", "distance", "phone-number", "url", "credit-card-number", "email"]
    locale: "en_GB"
    timezone: "Europe/London"
    timeout: 3
