Hi all,
I have defined a pipeline in Rasa 2.0.0rc4. All components seem to work except the second, word-level CountVectorsFeaturizer (alias "cvf_w"):
    pipeline:
      - name: packages.LanguageDetection.LanguageDetection
      - name: HFTransformersNLP
        # Name of the language model to use
        model_name: "bert"
        # Pre-Trained weights to be loaded
        #model_weights: "nlpaueb/bert-base-greek-uncased-v1"
        model_weights: "bert-base-multilingual-uncased"
        cache_dir: packages/langdata
        alias: "embeddings"
      - name: LanguageModelTokenizer
        # Flag to check whether to split intents
        intent_tokenization_flag: False
        # Symbol on which intent should be split
        intent_split_symbol: "_"
      - name: LanguageModelFeaturizer
        alias: "lmf"
      - name: RegexFeaturizer
        # Text will be processed with case sensitive as default
        case_sensitive: True
        alias: "rf"
      - name: CountVectorsFeaturizer
        analyzer: "char_wb"
        min_ngram: 1
        max_ngram: 4
        use_lemma: False
        # Set the out-of-vocabulary token
        OOV_token: "_oov_"
        # Whether to use a shared vocab
        use_shared_vocab: False
        alias: "cvf_c"
      - name: RegexEntityExtractor
      - name: CountVectorsFeaturizer  # <-- the component that fails
        alias: "cvf_w"
      - name: DIETClassifier
        epochs: 50
        random_seed: 20212020
      - name: EntitySynonymMapper
      - name: ResponseSelector
        epochs: 50
        random_seed: 20212020
        featurizers: ["cvf_w", "lmf"]
      - name: FallbackClassifier
        threshold: 0.4
        ambiguity_threshold: 0.1
The error I get is:
    Traceback (most recent call last):
      File "/home/pepper/.local/bin/rasa", line 8, in <module>
        sys.exit(main())
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/__main__.py", line 116, in main
        cmdline_arguments.func(cmdline_arguments)
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/cli/train.py", line 81, in train
        return rasa.train(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/train.py", line 43, in train
        return rasa.utils.common.run_in_loop(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/utils/common.py", line 300, in run_in_loop
        result = loop.run_until_complete(f)
      File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/train.py", line 102, in train_async
        return await _train_async_internal(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/train.py", line 198, in _train_async_internal
        await _do_training(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/train.py", line 256, in _do_training
        await _train_core_with_validated_data(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/train.py", line 403, in _train_core_with_validated_data
        await rasa.core.train(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/train.py", line 67, in train
        agent.train(training_data, **additional_arguments)
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/agent.py", line 723, in train
        self.policy_ensemble.train(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/policies/ensemble.py", line 188, in train
        policy.train(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/policies/ted_policy.py", line 331, in train
        tracker_state_features, label_ids = self.featurize_for_training(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/policies/policy.py", line 164, in featurize_for_training
        state_features, label_ids = self.featurizer.featurize_trackers(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/featurizers/tracker_featurizers.py", line 140, in featurize_trackers
        tracker_state_features = self._featurize_states(trackers_as_states, interpreter)
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/featurizers/tracker_featurizers.py", line 68, in _featurize_states
        return [
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/featurizers/tracker_featurizers.py", line 69, in <listcomp>
        [
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/featurizers/tracker_featurizers.py", line 70, in <listcomp>
        self.state_featurizer.encode_state(state, interpreter)
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/featurizers/single_state_featurizer.py", line 201, in encode_state
        self._extract_state_features(sub_state, interpreter, sparse=True)
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/featurizers/single_state_featurizer.py", line 169, in _extract_state_features
        parsed_message = interpreter.featurize_message(message)
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/core/interpreter.py", line 158, in featurize_message
        result = self.interpreter.featurize_message(message)
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/nlu/model.py", line 418, in featurize_message
        component.process(message, **self.context)
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/nlu/featurizers/sparse_featurizer/count_vectors_featurizer.py", line 561, in process
        sequence_features, sentence_features = self._create_features(
      File "/home/pepper/.local/lib/python3.8/site-packages/rasa/nlu/featurizers/sparse_featurizer/count_vectors_featurizer.py", line 438, in _create_features
        seq_vec = self.vectorizers[attribute].transform(tokens)
      File "/home/pepper/.local/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 1247, in transform
        self._check_vocabulary()
      File "/home/pepper/.local/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 467, in _check_vocabulary
        raise NotFittedError("Vocabulary not fitted or provided")
    sklearn.exceptions.NotFittedError: Vocabulary not fitted or provided
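
For context, the last two frames of the traceback are plain scikit-learn: `transform` is called on a vectorizer that has no vocabulary. The sketch below reproduces the same exception outside Rasa, assuming scikit-learn is installed. The helper `try_transform` and the sample strings are made up for illustration and are not part of Rasa. It only shows what the message means; it does not explain why Rasa left the vectorizer unfitted in this pipeline.

```python
from sklearn.exceptions import NotFittedError
from sklearn.feature_extraction.text import CountVectorizer


def try_transform(vectorizer, tokens):
    """Return the feature matrix shape, or the error message if unfitted.

    Hypothetical helper for this illustration only.
    """
    try:
        return vectorizer.transform(tokens).shape
    except NotFittedError as err:
        return str(err)


# A word-level vectorizer that was never fitted has no vocabulary_,
# so transform() raises NotFittedError, as in the traceback.
unfitted = CountVectorizer(analyzer="word")
unfitted_result = try_transform(unfitted, ["hello", "world"])

# Once fit() has seen some text, transform() works on the same input.
fitted = CountVectorizer(analyzer="word").fit(["hello world", "hello there"])
fitted_result = try_transform(fitted, ["hello", "world"])

print("unfitted:", unfitted_result)
print("fitted:  ", fitted_result)
```

If the same thing happens inside Rasa, the word-level vectorizer for whichever attribute is being processed (`self.vectorizers[attribute]` in the traceback) would have been skipped or left empty during NLU training, but that is a guess from the traceback, not something I have confirmed.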