Error while trying to compare pipelines

Hi,

I’m trying to compare two pipelines by running this command:

rasa test nlu --config config.yml config2.yml --nlu data/nlu.md --runs 3 --percentages 0 25 50 70 90

but I get this error:

2019-11-28 16:09:52 INFO     rasa.nlu.components  - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-en'.
2019-11-28 16:09:52 INFO     rasa.nlu.test  - Running model for predictions:
  0%|                                                                                            | 0/9 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "c:\users\tizianolabruna\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\tizianolabruna\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\TizianoLabruna\Anaconda3\Scripts\rasa.exe\__main__.py", line 9, in <module>
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\__main__.py", line 76, in main
    cmdline_arguments.func(cmdline_arguments)
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\cli\test.py", line 136, in test_nlu
    exclusion_percentages=args.percentages,
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\test.py", line 179, in compare_nlu_models
    runs,
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\nlu\test.py", line 1325, in compare_nlu
    test_path, model_path, output_directory=output_path, errors=True
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\nlu\test.py", line 1034, in run_evaluation
    interpreter, test_data
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\nlu\test.py", line 889, in get_eval_data
    result = interpreter.parse(example.text, only_output_properties=False)
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\nlu\model.py", line 380, in parse
    component.process(message, **self.context)
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\nlu\classifiers\sklearn_intent_classifier.py", line 148, in process
    intent_ids, probabilities = self.predict(X)
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\nlu\classifiers\sklearn_intent_classifier.py", line 191, in predict
    pred_result = self.predict_prob(X)
  File "C:\Users\TizianoLabruna\AppData\Roaming\Python\Python37\site-packages\rasa\nlu\classifiers\sklearn_intent_classifier.py", line 180, in predict_prob
    return self.clf.predict_proba(X)
  File "c:\users\tizianolabruna\anaconda3\lib\site-packages\sklearn\svm\base.py", line 622, in _predict_proba
    X = self._validate_for_predict(X)
  File "c:\users\tizianolabruna\anaconda3\lib\site-packages\sklearn\svm\base.py", line 478, in _validate_for_predict
    (n_features, self.shape_fit_[1]))
ValueError: X.shape[1] = 256 should be equal to 128, the number of features at training time

Does anybody know how to help me?

Thank you, Tiziano
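
For readers unfamiliar with this message: the final ValueError in the traceback says that the SVM inside SklearnIntentClassifier was fitted on feature vectors with 128 columns but was handed vectors with 256 columns at prediction time. The sketch below imitates that kind of check in plain Python. `TinyClassifier` and its methods are hypothetical stand-ins made up for illustration; this is not scikit-learn's or Rasa's actual code, and it says nothing about why the feature size changed in this particular setup.

```python
class TinyClassifier:
    """Hypothetical stand-in that remembers the feature count seen at fit time."""

    def __init__(self):
        self.n_features_fit = None

    def fit(self, X):
        # Record the width (number of columns) of the training matrix.
        self.n_features_fit = len(X[0])
        return self

    def predict(self, X):
        # Refuse inputs whose width differs from what was seen during fit.
        n_features = len(X[0])
        if n_features != self.n_features_fit:
            raise ValueError(
                "X.shape[1] = %d should be equal to %d, "
                "the number of features at training time"
                % (n_features, self.n_features_fit)
            )
        # Dummy prediction: one label per input row.
        return [0] * len(X)


# Fit on one 128-dimensional example, then predict on a 256-dimensional one.
clf = TinyClassifier().fit([[0.0] * 128])
try:
    clf.predict([[0.0] * 256])
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)

print(mismatch_message)
```

In other words, the featurizers that ran at test time produced vectors twice as wide as the ones the classifier was trained on.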

Hi @tiziano, I’ve seen this before with the spaCy component but can’t quite remember what caused it. What does your config file look like?

Hi @amn41, thank you for replying. My config files are the following:

config.yml

language: "en"

pipeline: 
- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"

policies:
- name: MemoizationPolicy
- name: MappingPolicy
- name: EmbeddingPolicy
  max_history: 10
  batch_strategy: balanced
  epochs: 5
  random_seed: 1234
  evaluate_on_num_examples: 0

config2.yml

language: "en"

pipeline: pretrained_embeddings_spacy

policies:
- name: MemoizationPolicy
- name: MappingPolicy
- name: EmbeddingPolicy
  max_history: 10
  batch_strategy: balanced
  epochs: 5
  random_seed: 1234
  evaluate_on_num_examples: 0

Do the models train correctly if you train them separately?
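
One way to check this is to train each config on its own before running the comparison. The helper below is a minimal sketch, assuming the Rasa 1.x `rasa train nlu` command with its `--config`, `--nlu` and `--out` options. The function names and the `models/separate_N` output directories are hypothetical, chosen only for this example. By default it only builds and prints the commands (`dry_run=True`), so it is safe to run without Rasa installed.

```python
import subprocess


def build_train_command(config_path, nlu_path, out_dir):
    """Build the argument list for training one NLU model from one config."""
    return [
        "rasa", "train", "nlu",
        "--config", config_path,
        "--nlu", nlu_path,
        "--out", out_dir,
    ]


def train_separately(config_paths, nlu_path="data/nlu.md", dry_run=True):
    """Build (and optionally run) one training command per config file."""
    commands = []
    for index, config_path in enumerate(config_paths, start=1):
        # Hypothetical output directory, one per config, to keep models apart.
        out_dir = "models/separate_%d" % index
        command = build_train_command(config_path, nlu_path, out_dir)
        commands.append(command)
        if not dry_run:
            # check=True raises CalledProcessError if training fails.
            subprocess.run(command, check=True)
    return commands


commands = train_separately(["config.yml", "config2.yml"])
for command in commands:
    print(" ".join(command))
```

Calling `train_separately(["config.yml", "config2.yml"], dry_run=False)` would actually run the two trainings one after the other. If one of them fails on its own, the problem lies in that pipeline rather than in the comparison mode.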