Hi,
I’m using Rasa 1.4.0 and I’m getting a few CountVectorizer warnings during training. Can you help me understand what causes them and how they affect the trained model?
2019-11-09 10:38:59 WARNING rasa.nlu.featurizers.count_vectors_featurizer - Unable to train CountVectorizer for message attribute text. Leaving an untrained CountVectorizer for it
2019-11-09 10:38:59 WARNING rasa.nlu.featurizers.count_vectors_featurizer - Unable to train CountVectorizer for message attribute intent. Leaving an untrained CountVectorizer for it
2019-11-09 10:38:59 DEBUG rasa.nlu.featurizers.count_vectors_featurizer - No text provided for response attribute in any messages of training data. Skipping training a CountVectorizer for it.
This is my config.yml:
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: CountVectorsFeaturizer
    stop_words: {'english'}
    analyzer: word
    token_pattern: r'(?u)\b\w\w+\b'
    lowercase: true
    max_ngram: 5
    min_ngram: 1
    OOV_token: '__oov__'
    OOV_words: ['Singapore', 'Australia', '2019', '2020']
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    lowercase: true
    max_ngram: 5
    min_ngram: 3
  - name: EmbeddingIntentClassifier
    random_seed: 12345
    intent_split_symbol: +
    intent_tokenization_flag: true
  - name: DucklingHTTPExtractor
    url: http://localhost:8000
    dimensions:
      - time
      - number
      - amount-of-money
      - distance
    locale: en_GB
    timezone: US/Eastern
    timeout: 20
policies:
  - name: KerasPolicy
    rnn_size: 32
    epochs: 150
    batch_size: 32
    validation_split: 0.1
    max_history: 10
    random_seed: 12345
  - name: FormPolicy
  - name: AugmentedMemoizationPolicy
    max_history: 6
  - name: MappingPolicy
  - name: TwoStageFallbackPolicy
    core_threshold: 0.3
    nlu_threshold: 0.9
    ambiguity_threshold: 0.1
    fallback_core_action_name: action_default_fallback
    fallback_nlu_action_name: action_default_ask_affirmation
    deny_suggestion_intent_name: out_of_scope
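
One thing that might be worth checking in the config above is the token_pattern value of the first CountVectorsFeaturizer. The sketch below is only an illustration under stated assumptions, not Rasa's actual code. It assumes the YAML loader passes the value through literally, including the Python-style r'...' wrapper, and that a pattern which matches nothing leaves the vectorizer with an empty vocabulary, which could be what the "Unable to train CountVectorizer" warnings reflect. The helper names `strip_raw_wrapper` and `find_tokens` and the sample sentence are hypothetical.

```python
import re

# The value exactly as it is written in config.yml. YAML has no r'...' syntax,
# so the wrapper characters are assumed to stay part of the string.
RAW_FROM_YAML = r"r'(?u)\b\w\w+\b'"


def strip_raw_wrapper(value: str) -> str:
    """Remove a Python-style r'...' or r"..." wrapper if present (hypothetical helper)."""
    match = re.fullmatch(r"""[rR](['"])(.*)\1""", value)
    return match.group(2) if match else value


def find_tokens(pattern: str, text: str) -> list:
    """Return all regex matches; a pattern that fails to compile counts as matching nothing."""
    try:
        return re.findall(pattern, text)
    except re.error:
        # Newer Python versions reject an inline flag such as (?u)
        # that does not sit at the very start of the pattern.
        return []


sample = "book a flight to Singapore"  # hypothetical training example

tokens_raw = find_tokens(RAW_FROM_YAML, sample)
tokens_clean = find_tokens(strip_raw_wrapper(RAW_FROM_YAML), sample)

print("with wrapper:   ", tokens_raw)
print("without wrapper:", tokens_clean)
```

If the wrapped value really produces no tokens on the training data, writing the pattern in config.yml without the r'...' wrapper, or removing the token_pattern line altogether, may be worth trying before retraining.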