Valid Custom Pipeline?

ShayVD · February 12, 2020, 12:49pm

Hi, I was testing out some custom pipelines, namely trying to combine the ConveRT featurizer with the supervised embeddings pipeline, and was wondering whether the format of this pipeline is correct.

# Configuration for Rasa NLU.

# https://rasa.com/docs/rasa/nlu/components/
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: ConveRTTokenizer
  - name: RegexFeaturizer
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: ConveRTFeaturizer
  - name: EmbeddingIntentClassifier
    epochs: 300
    embed_dim: 20

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: FormPolicy
  - name: MemoizationPolicy
  - name: MappingPolicy
  - name: FallbackPolicy
    nlu_threshold: 0.2
    core_threshold: 0.2
  - name: EmbeddingPolicy               # Recurrent Embedding Dialogue Policy, uses RNN for dialogue management
    epochs: 150
    max_history: 50  
    batch_size: [32,64]
    featurizer:
      - name: MaxHistoryTrackerFeaturizer
        state_featurizer:
        - name: LabelTokenizerSingleStateFeaturizer
    augmentation_factor: 0

dakshvar22 · February 18, 2020, 12:06pm

Hi, what version of Rasa are you using? If you are on Rasa 1.7.0 or later, you don’t need the WhitespaceTokenizer in there.
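
A minimal sketch of the NLU part with that line dropped, assuming Rasa 1.7.0 and that the rest of the posted config stays unchanged, so ConveRTTokenizer is the only tokenizer left in the pipeline:

```yaml
language: en
pipeline:
  - name: ConveRTTokenizer          # only tokenizer; WhitespaceTokenizer removed
  - name: RegexFeaturizer
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: CountVectorsFeaturizer    # word-level counts (default analyzer)
  - name: CountVectorsFeaturizer    # character n-grams within word boundaries
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: ConveRTFeaturizer
  - name: EmbeddingIntentClassifier
    epochs: 300
    embed_dim: 20
```

The policies section is not affected by this change and can stay as posted.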

ShayVD · February 19, 2020, 9:50am

Yes, I am using 1.7.0. I’ve removed the WhitespaceTokenizer and the CountVectorsFeaturizers without any impact on accuracy. The accuracy is at 0.999, but the loss never goes below 0.7. Is this a worry, or does it not matter since the accuracy is so high?

dakshvar22 · February 21, 2020, 12:04pm

It would be helpful to create a separate test data split and evaluate the model’s accuracy on that split. Use rasa data split nlu to create it.
