Detecting multiple regexes as separate entities


I have a few regexes in my Rasa application. It's able to pick up one of them as an entity, but not all of them. Here are the regexes and examples:

product-id  --- '\d{7}$'       1234567
category-id --- '[CA]\d{5}$'   CA12345
product-id  --- '[PD]\d{7}$'   PD1234567

Product-id is always picked up. However, if any combination of the others appears in an utterance, they're not detected at all.
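
One way to narrow this down is to run the same patterns outside Rasa with Python's `re` module. The sketch below is only that: it assumes Rasa's regex matching behaves like plain Python `re` matching over the whole utterance, which is not confirmed anywhere in this thread. The sample utterances and the `SUGGESTED` patterns are hypothetical illustrations, not a verified fix. It shows two things about the patterns as written: `$` only allows a match at the very end of the text, and `[CA]` / `[PD]` are character classes that match a single letter rather than the literal prefixes `CA` / `PD`.

```python
import re

# Patterns copied from the post above.
ORIGINAL = {
    "product-id (digits only)": r"\d{7}$",
    "category-id": r"[CA]\d{5}$",
    "product-id (PD prefix)": r"[PD]\d{7}$",
}

# Hypothetical alternatives (an assumption, not a confirmed fix):
# literal prefixes and word boundaries instead of the end-of-text anchor.
SUGGESTED = {
    "product-id (digits only)": r"\b\d{7}\b",
    "category-id": r"\bCA\d{5}\b",
    "product-id (PD prefix)": r"\bPD\d{7}\b",
}


def find_all(pattern, text):
    """Return every non-overlapping match of `pattern` in `text`."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# Hypothetical utterance containing two of the IDs.
utterance = "compare CA12345 with PD1234567"

for name, patterns in (("original", ORIGINAL), ("suggested", SUGGESTED)):
    print(name)
    for label, pattern in patterns.items():
        print(f"  {label}: {find_all(pattern, utterance)}")

# [CA] matches one character, so only part of the ID is captured.
print(find_all(ORIGINAL["category-id"], "CA12345"))  # ['A12345']
```

With the original patterns, the category ID in the middle of the utterance is never matched because of the `$` anchor, and the digits-only pattern also fires on the tail of `PD1234567`. The word-boundary variants avoid both effects in plain Python, but whether they resolve the behaviour inside Rasa still depends on the pipeline and the NLU examples.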

How can I get Rasa to pick up all of them?


Hi @qamir,

Could you share your config and NLU examples that include these entities?

I’d need some info on what you’re using to be able to help you out.

Regards, Nikola

Here you go:

# Configuration for Rasa NLU.
language: en

pipeline:

# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See for more information.
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: RegexEntityExtractor
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1
  - name: "SpacyNLP"
    # language model to load
    model: "en_core_web_md"

    # when retrieving word vectors, this will decide if the casing
    # of the word is relevant. E.g. `hello` and `Hello` will
    # retrieve the same vector, if set to `False`. For some
    # applications and models it makes sense to differentiate
    # between these two words, therefore setting this to `True`.
    case_sensitive: False
  - name: "SpacyEntityExtractor"
    # dimensions to extract
    dimensions: ["PERSON", "LOC"]  
  - name: "DucklingEntityExtractor"
    url: "

    dimensions: ["time", "amount-of-money", "distance", "amount-of-money", "phone-number", "url", "credit-card-number", "email"]
    locale: "en_GB"
    timezone: "Europe/London"
    timeout: 3