Slot containing .net

How do I handle a slot value that starts with a period, as in .net?

UserWarning: Misaligned entity annotation in message ‘Have you ever programmed in .net’ with intent ‘word2url’. Make sure the start and end values of entities ([(28, 32, ‘.net’)]) in the training data match the token boundaries ([(0, 4, ‘Have’), (5, 8, ‘you’), (9, 13, ‘ever’), (14, 24, ‘programmed’), (25, 27, ‘in’), (29, 32, ‘net’)]). Common causes:

  1. entities include trailing whitespaces or punctuation
  2. the tokenizer gives an unexpected result, due to languages such as Chinese that don’t use whitespace for word separation
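To make the mismatch in the warning concrete, here is a minimal Python sketch that reproduces the reported offsets. It assumes a simplified tokenizer that splits on whitespace and drops punctuation at the edges of each chunk; the `tokenize` and `is_aligned` helpers and the punctuation set are hypothetical and are not Rasa's actual implementation.

```python
import re


def tokenize(text):
    """Return (start, end, token) triples.

    Simplified stand-in for a whitespace tokenizer: split on whitespace,
    then strip punctuation from the edges of each chunk (assumed behaviour).
    """
    tokens = []
    for match in re.finditer(r"\S+", text):
        chunk = match.group()
        core = chunk.strip(".,!?;:")  # hypothetical punctuation set
        if not core:
            continue  # chunk was punctuation only
        start = match.start() + chunk.index(core)
        tokens.append((start, start + len(core), core))
    return tokens


def is_aligned(entity_start, entity_end, tokens):
    """An entity is aligned if it starts and ends on token boundaries."""
    starts = {start for start, _, _ in tokens}
    ends = {end for _, end, _ in tokens}
    return entity_start in starts and entity_end in ends


text = "Have you ever programmed in .net"
tokens = tokenize(text)

# The annotation [.net](language) covers characters 28..32,
# but the last token starts at 29 because the leading "." was dropped.
print(tokens[-1])
print(is_aligned(28, 32, tokens))
```

Under these assumptions the entity starts one character before the nearest token boundary, which is the misalignment the warning describes.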


Hi @joggerjoel, would you be able to share the contents of your config.yml file? I would like to double-check the tokenizer(s) and entity extractor you’re using.

Also, where does this user warning appear? Which rasa command(s) did you use?

Finally, have you tried running rasa shell --debug to check if your entity gets extracted as expected during a conversation?

- intent: word2url
  examples: |
    - My programming languages are [C#](language)
    - I enjoy [C++](language)
    - Do you program in [PHP](language)
    - I am new to [Python](language)
    - Do you know [Laravel](language)
    - Have you ever programmed in [.net](language)
    - Can you help me with [HTML](language)
    - How do I learn about [kotlin](language)
    - Are you any good at [C](language)
    - When did you learn [golang](language)
    - Have you worked on [CSS](language)
    - Do you play [minecraft](language)

Thanks @joggerjoel, but this appears to be an extract from nlu.yml. Could you please share your config.yml file?

# Configuration for Rasa NLU.
language: en

pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See for more information.
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  # - name: RegexEntityExtractor
  - name: ResponseSelector
    epochs: 100
    retrieval_intent: chitchat
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1
  - name: "DucklingHTTPExtractor"
    # url of the running duckling server
    url: "http://localhost:8000"
    # dimensions to extract
    dimensions: ["number", "email"]
    # allows you to configure the locale, by default the language is
    # used
    # locale: "de_DE"
    # if not set the default timezone of Duckling is going to be used
    # needed to calculate dates from relative expressions like "tomorrow"
    # timezone: "Europe/Berlin"

# Configuration for Rasa Core.
policies:
# # No configuration for policies was provided. The following default policies were used to train your model.
# # If you'd like to customize them, uncomment and adjust the policies.
# # See for more information.
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: RulePolicy
    core_fallback_threshold: 0.4
    core_fallback_action_name: "action_default_fallback"
    enable_fallback_prediction: True