Rasa Pipeline Doubt

rahul_namdev · June 23, 2020, 11:04am

Hi! I’m new to Rasa. I have the following Rasa NLU pipeline, which is also the default pipeline provided by Rasa.

Configuration for Rasa NLU.

Components

language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

My question is: why is CountVectorsFeaturizer listed twice? I know what it does, but I don’t understand why we have two CountVectorsFeaturizer entries. Also, does the output of one pipeline component serve as the input to the next? Thanks in advance.

chkoss · June 23, 2020, 2:13pm

Hi and welcome to the Rasa community!

Yes, a pipeline consists of a sequence of components which are executed one after another. So the order of the components matters.
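To make the ordering point concrete, here is a minimal sketch in plain Python. It is not Rasa code: `whitespace_tokenizer`, `word_count_featurizer`, and `run_pipeline` are hypothetical stand-ins that only illustrate how each component reads what earlier components wrote to a shared message.

```python
# Toy pipeline: each component reads from and adds to a shared message dict.
# All names here are hypothetical stand-ins, not Rasa classes.

def whitespace_tokenizer(message):
    # Splits the raw text on whitespace and stores the tokens.
    message["tokens"] = message["text"].split()
    return message


def word_count_featurizer(message):
    # Depends on "tokens", so it has to run after the tokenizer.
    counts = {}
    for token in message["tokens"]:
        key = token.lower()
        counts[key] = counts.get(key, 0) + 1
    message["features"] = counts
    return message


def run_pipeline(components, text):
    # Runs the components one after another, in the order given.
    message = {"text": text}
    for component in components:
        message = component(message)
    return message


result = run_pipeline(
    [whitespace_tokenizer, word_count_featurizer], "book a table a"
)
```

If the two components are swapped, the featurizer fails with a `KeyError` because no tokens exist yet, which is the toy version of why a tokenizer is listed before the featurizers in the config.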

The default pipeline indeed uses two instances of CountVectorsFeaturizer. The first one featurizes text based on whole words ("word" is the default value for analyzer). The second one, configured with analyzer "char_wb", featurizes text based on character n-grams that stay within word boundaries. We empirically found the second featurizer to be more powerful, but we decided to keep the first featurizer as well to make featurization more robust.
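As a rough illustration of the difference between the two featurizers, the sketch below counts whole words and word-bounded character n-grams in plain Python. It is a simplified stand-in, not Rasa's implementation: `word_features`, `char_wb_features`, and `shared` are hypothetical helpers, and the n-gram range of 1 to 4 is taken from the config in the question. The misspelling is an assumed example chosen to show one plausible benefit of character n-grams.

```python
from collections import Counter


def word_features(text):
    # Bag of whole words, roughly what a word-level analyzer produces.
    return Counter(text.lower().split())


def char_wb_features(text, min_ngram=1, max_ngram=4):
    # Character n-grams taken inside word boundaries. Each word is padded
    # with one space on both sides, so no n-gram spans two words.
    counts = Counter()
    for word in text.lower().split():
        padded = f" {word} "
        for n in range(min_ngram, max_ngram + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    return counts


def shared(a, b):
    # Features that appear in both inputs.
    return sorted(set(a) & set(b))


correct = "restaurant"
typo = "restarant"  # assumed misspelling, for illustration only

word_overlap = shared(word_features(correct), word_features(typo))
char_overlap = shared(char_wb_features(correct), char_wb_features(typo))
```

Here `word_overlap` is empty because the two spellings are different words, while `char_overlap` still contains n-grams such as `" res"`, `"rest"`, and `"ant "`. Using both featurizers gives the classifier exact word matches as well as this kind of partial match.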

To learn more about pipelines in general, have a look at our docs on Choosing a Pipeline.

rahul_namdev · June 24, 2020, 5:47am

Got it. Thanks.
