Regexp entity email extraction

I want to create a forgot-password bot as a learning exercise.

With this regexp

  "rasa_nlu_data": {
    "regex_features": [
      {
        "name" : "email",
        "pattern" : "[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}"
      },
      {
        "name": "zipcode",
        "pattern": "[0-9]{5}"
      },

I receive this error while training:

ValueError: Unknown data format for file NLU/nlu.json

I noticed in particular that the use of

\.

is a problem for the JSON format.
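A minimal sketch of why the backslash trips the parser, using only the Python standard library and no Rasa code: in JSON, \. is not a legal escape sequence, so the backslash has to be doubled (\\.) in the file. The helper names is_valid_json and find_emails are made up for illustration. The IGNORECASE flag is my own addition, because the pattern only lists A-Z; whether Rasa applies such a flag when it uses regex_features is not something this sketch confirms.

```python
import json
import re

# Raw text as it would sit in the training file: a single backslash
# before the dot, which is not a legal JSON escape sequence.
broken = r'{"name": "email", "pattern": "[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}"}'


def is_valid_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Doubling the backslash makes the file valid JSON; after decoding,
# the pattern again contains the single backslash the regex needs.
fixed = broken.replace("\\.", "\\\\.")
pattern = json.loads(fixed)["pattern"]

# Letting json.dumps write the entry avoids hand-editing backslashes.
serialized = json.dumps({"name": "email", "pattern": pattern})


def find_emails(text):
    """Find email-like substrings (hypothetical helper, plain re only)."""
    # IGNORECASE is an assumption: the character classes only list A-Z.
    return re.findall(pattern, text, flags=re.IGNORECASE)
```

Calling is_valid_json(broken) and is_valid_json(fixed) shows the difference between the two spellings, and find_emails("reset the password for marco@example.com") can be used to check the pattern against a sample sentence.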

Questions

  1. Any help on how to solve this and detect email addresses in sentences?
  2. Also, what is the difference between using the .md and the .json format for NLU training data?

Thanks

Hey @linediconsine,

1) You can use the Duckling entity extractor to get the email address. The details on how to do so are here:

https://rasa.com/docs/nlu/0.13.8/components/#ner-duckling-http

2) You can find the answer to your second question here:

https://rasa.com/docs/nlu/0.13.8/dataformat/#data-format

Hi Jitesh, I really appreciate your help.

About the components:

  1. Can I position Duckling anywhere in the pipeline, or is there an order I should follow?
  2. How can I test whether Duckling, or the pipeline in general, is working? (Maybe this is a stupid question.)

Right now my pipeline is:


language: "en"

pipeline:
- name: "ner_duckling_http"
  # stack from https://haskell-lang.org/get-started/osx
  # https://rasa.com/docs/nlu/components/#ner-duckling-http
  # https://github.com/facebook/duckling#quickstart
  # url of the running duckling server
  url: "http://localhost:8000"
  # dimensions to extract
  dimensions: ["email", "time", "number", "amount-of-money", "distance"]
  # allows you to configure the locale, by default the language is
  # used
  locale: "en_US"
  # if not set the default timezone of Duckling is going to be used
  # needed to calculate dates from relative expressions like "tomorrow"
  timezone: "Europe/Berlin"
- name: "nlp_spacy"                   # loads the spacy language model
- name: "tokenizer_spacy"             # splits the sentence into tokens
- name: "intent_featurizer_spacy"     # transform the sentence into a vector representation
- name: "intent_classifier_sklearn"   # uses the vector representation to classify using SVM
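One way to check the Duckling part on its own, before involving the rest of the pipeline, is to talk to the server directly. The sketch below assumes a Duckling server is running at the url from the config above and that it accepts form-encoded text, locale and dims fields on /parse, as in the Duckling quickstart. The function names are hypothetical, and the response fields used here (body, dim) are what I would expect from Duckling rather than something confirmed in this thread.

```python
import json
import urllib.parse
import urllib.request

# Assumption: same address as the "url" entry in the pipeline config.
DUCKLING_URL = "http://localhost:8000"


def build_payload(text, locale="en_US", dims=("email",)):
    """Form-encode the fields sent to the Duckling /parse endpoint."""
    fields = {"text": text, "locale": locale, "dims": json.dumps(list(dims))}
    return urllib.parse.urlencode(fields).encode("utf-8")


def extract_values(response_body, dim="email"):
    """Collect the matched substrings for one dimension from a response."""
    items = json.loads(response_body)
    return [item["body"] for item in items if item.get("dim") == dim]


def query_duckling(text, url=DUCKLING_URL):
    """Send one sentence to the server; only works while Duckling is running."""
    request = urllib.request.Request(url + "/parse", data=build_payload(text))
    with urllib.request.urlopen(request, timeout=5) as response:
        return extract_values(response.read().decode("utf-8"))
```

With the server up, print(query_duckling("my email is marco@example.com")) shows whether the email dimension is being picked up. If that works, the trained model can be checked the same way: parse a test sentence and look at the entities it returns.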

Thank you!

Marco

Hi Marco, I am also working on a password-reset chatbot with Rasa. Are you interested in a knowledge exchange? You can contact me at: fioredar@students.zhaw.ch. Greetings, Dario

Yes, this sounds amazing! I sent you my contact details.

Speak soon!

You can connect with me on LinkedIn.