Getting stuck with "ValueError: Input contains NaN, infinity or a value too large for dtype('float64')."

Hey everyone, I'm working my way through the masterclass and I'm stuck on this error when trying to train the NLU component.

(venv) MacBook-Pro-8:Rasa Freezersting$ rasa train nlu
Training NLU model...
/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/nlu/config.py:50: FutureWarning: You are using a pipeline template. All pipelines templates are deprecated and will be removed in version 2.0. Please add the components you want to use directly to your configuration file.
  return RasaNLUModelConfig(config)
2020-06-24 13:37:09 INFO     rasa.nlu.utils.spacy_utils  - Trying to load spacy model with name 'en'
2020-06-24 13:37:10 INFO     rasa.nlu.components  - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-en'.
2020-06-24 13:37:10 INFO     rasa.nlu.training_data.training_data  - Training data stats:
2020-06-24 13:37:10 INFO     rasa.nlu.training_data.training_data  - Number of intent examples: 37 (7 distinct intents)
2020-06-24 13:37:10 INFO     rasa.nlu.training_data.training_data  -   Found intents: 'affirm', 'greet', 'deny', 'goodbye', 'inform', 'search_provider', 'bot_challenge'
2020-06-24 13:37:10 INFO     rasa.nlu.training_data.training_data  - Number of response examples: 0 (0 distinct responses)
2020-06-24 13:37:10 INFO     rasa.nlu.training_data.training_data  - Number of entity examples: 12 (3 distinct entities)
2020-06-24 13:37:10 INFO     rasa.nlu.training_data.training_data  -   Found entity types: '(facility_type', 'facility_type', 'location'
/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Entity entity '(facility_type' has only 1 training examples! The minimum is 2, because of this the training may fail.
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Starting to train component SpacyNLP
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Finished training component.
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Starting to train component SpacyTokenizer
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Finished training component.
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Starting to train component SpacyFeaturizer
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Finished training component.
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Starting to train component RegexFeaturizer
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Finished training component.
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Starting to train component CRFEntityExtractor
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Finished training component.
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Starting to train component EntitySynonymMapper
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Finished training component.
2020-06-24 13:37:10 INFO     rasa.nlu.model  - Starting to train component SklearnIntentClassifier
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.0s finished
Traceback (most recent call last):
  File "/Users/Freezersting/Rasa/venv/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/__main__.py", line 92, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/cli/train.py", line 140, in train_nlu
    persist_nlu_training_data=args.persist_nlu_data,
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/train.py", line 414, in train_nlu
    persist_nlu_training_data,
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/train.py", line 453, in _train_nlu_async
    persist_nlu_training_data=persist_nlu_training_data,
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/train.py", line 482, in _train_nlu_with_validated_data
    persist_nlu_training_data=persist_nlu_training_data,
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/nlu/train.py", line 90, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/nlu/model.py", line 191, in train
    updates = component.train(working_data, self.config, **context)
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/rasa/nlu/classifiers/sklearn_intent_classifier.py", line 125, in train
    self.clf.fit(X, y)
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/sklearn/model_selection/_search.py", line 739, in fit
    self.best_estimator_.fit(X, y, **fit_params)
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/sklearn/svm/_base.py", line 148, in fit
    accept_large_sparse=False)
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/sklearn/utils/validation.py", line 755, in check_X_y
    estimator=estimator)
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/sklearn/utils/validation.py", line 578, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/Users/Freezersting/Rasa/venv/lib/python3.7/site-packages/sklearn/utils/validation.py", line 60, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
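
The last frames of the traceback show scikit-learn rejecting the feature matrix `X` because at least one value is not finite. A minimal sketch of the same kind of check in plain Python may help narrow down which training examples end up with bad feature vectors. The function name and the sample matrix are hypothetical, and how the features are pulled out of the Rasa pipeline is not shown here.

```python
import math


def rows_with_non_finite(matrix):
    """Return the indices of rows that contain NaN or +/-infinity."""
    bad_rows = []
    for index, row in enumerate(matrix):
        if any(not math.isfinite(value) for value in row):
            bad_rows.append(index)
    return bad_rows


# Hypothetical feature matrix: one row per training example.
features = [
    [0.12, -0.40, 0.98],
    [float("nan"), 0.10, 0.33],
    [0.05, float("inf"), -0.71],
]

print(rows_with_non_finite(features))  # [1, 2]
```

If a row is flagged, the training example at the same position is the one worth inspecting first.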

Hi @ernestlimjw, which rasa version are you using while following along with the masterclass?

Hi, I have the same problem when using the “pretrained_embeddings_spacy” pipeline to train the NLU model. I’m using the latest version of Rasa (1.10.8) in a virtual environment with Python 3.7.

Hi @cicciob95, this is the new spaCy pipeline. Try it out, I think it will help.

pipeline:
  - name: SpacyNLP
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

Hello everyone, I keep getting that error too when trying to compare three different configs. The config that is not working is:

language: en
pipeline:
- name: SpacyNLP
- name: SpacyTokenizer
- name: SpacyFeaturizer
- name: RegexFeaturizer
- name: CRFEntityExtractor
- name: EntitySynonymMapper
- name: SklearnIntentClassifier

rasa==1.10.8

sklearn==0.22.2.post1

I can’t change the config because I’m trying to estimate whether changing the pipeline and the Rasa version would be worth it.

Could it be a problem with the training data format?

I tried this before with another training set and it went OK.

If anyone can help, I’ll be thankful.
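
On the training data question: the first log in this thread reports an entity type named '(facility_type' with only one example, which looks like a stray parenthesis in an annotation. Whether that is related to the ValueError is not established here, but a quick scan for odd entity labels is cheap. The sketch below assumes the data is in the Rasa 1.x Markdown format; the regular expression is a simplified stand-in rather than Rasa's own parser, and the sample text and function names are made up for illustration.

```python
import re
from collections import Counter

# Simplified pattern for "[text](entity)" or "[text](entity:value)".
# This is an approximation, not the parser Rasa itself uses.
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^:)]*)(?::[^)]*)?\)")


def entity_label_counts(markdown_text):
    """Count how often each entity label is annotated in the text."""
    return Counter(label for _text, label in ANNOTATION.findall(markdown_text))


def suspicious_labels(counts, minimum=2):
    """Return labels that are too rare or contain non-word characters."""
    return sorted(
        label
        for label, n in counts.items()
        if n < minimum or not re.fullmatch(r"\w+", label)
    )


# Hypothetical training data with one malformed annotation.
sample = """
## intent:search_provider
- I need a [hospital](facility_type)
- find me a [clinic]((facility_type)
- is there a [pharmacy](facility_type) in [Austin](location)
- show a [hospital](facility_type) near [Denver](location)
"""

print(suspicious_labels(entity_label_counts(sample)))  # ['(facility_type']
```

The default `minimum=2` mirrors the threshold quoted in the warning from the first log.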