Rasa-x GUI reports 'Model build failed', but after 5 mins model is available on the list

Hello,

We are using the latest Helm chart (from the docs) on EKS, with these Docker images:

  • rasa-x tag: 0.32.2 (also tried today's latest)
  • rasa tag: 1.10.14-full

The issue is that when we click Train, we get ‘Model train failed’ after about 30 seconds. However, exactly 5 minutes after training started, the model appears in the list when we refresh rasa-x-url/models. Is there a timeout we can configure to make Rasa X wait for the rasa worker to finish training the model? Or is this a Rasa X bug?

Thank you in advance!

Hi @filiphaftek, would you mind sharing the container logs from the moment you start the training process?

Hi @ricwo,

Yes, here are the logs from Rasa-x container:

  ERROR:rasax.community.api.blueprints.models:400, message='Bad Request', url=URL('http://rasa-rasa-x-rasa-worker:5005/model/train?token=rasaToken')
  Encountered an exception while training. Please check the logs of the rasa worker container for more information. 

In the rasa worker, we only see warnings from when the container started:

    /opt/venv/lib/python3.7/site-packages/rasa/nlu/config.py:50: FutureWarning: You are using a pipeline template. All pipelines templates are deprecated and will be removed in version 2.0. Please add the components you want to use directly to your configuration file.
      return RasaNLUModelConfig(config)
    /opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Please configure the number of 'epochs' in your configuration file. We will change the default value of 'epochs' in the future to 1.
    /opt/venv/lib/python3.7/site-packages/rasa/nlu/components.py:489: FutureWarning: 'EmbeddingIntentClassifier' is deprecated and will be removed in version 2.0. Use 'DIETClassifier' instead.
      return cls(component_config)
    /opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Intent 'what_can_i_find_on_quoka' has only 1 training examples! Minimum is 2, training may fail.
    /opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Intent 'how_to_find_something_on_quoka' has only 1 training examples! Minimum is 2, training may fail.
    /opt/venv/lib/python3.7/site-packages/rasa/utils/common.py:363: UserWarning: Misaligned entity annotation in message 'Guten Tag, mein Name ist DanielDaniel Weber' with intent 'introduce'. Make sure the start and end values of entities in the training data match the token boundaries (e.g. entities don't include trailing whitespaces or punctuation).
      More info at https://rasa.com/docs/rasa/nlu/training-data-format/
    /opt/venv/lib/python3.7/site-packages/rasa/utils/tensorflow/model_data.py:386: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
      final_data[k].append(np.concatenate(np.array(v)))
    /opt/venv/lib/python3.7/site-packages/rasa/utils/tensorflow/model_data.py:386: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
      final_data[k].append(np.concatenate(np.array(v)))
    /opt/venv/lib/python3.7/site-packages/rasa/utils/tensorflow/model_data.py:386: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
      final_data[k].append(np.concatenate(np.array(v)))

However, no new log entries appear there during training.

If you need any more information, please let me know. Thank you.

Thanks! Does this happen every time you train or only sometimes?

Could you post the response you get back from the training request after 30 seconds from your browser’s dev console’s “network” tab? It should have status 400 but there should be a JSON payload attached to it, which would be very helpful to see. Thanks!
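If the payload is hard to read when copied straight from the network tab, a small helper like the one below may make it easier to paste here. This is only a sketch: `summarize_response` is a hypothetical helper name, not part of Rasa or Rasa X, and the example payload is a placeholder, since the real field names in the 400 response are not known from this thread.

```python
import json


def summarize_response(status: int, raw_body: str) -> str:
    """Return a readable summary of an HTTP response copied from the network tab."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        # Not JSON (or empty): keep the raw text so nothing is lost.
        return f"status={status}\nbody (not JSON):\n{raw_body}"
    # Pretty-print whatever JSON came back, without assuming any field names.
    pretty = json.dumps(payload, indent=2, sort_keys=True, ensure_ascii=False)
    return f"status={status}\nbody (JSON):\n{pretty}"


if __name__ == "__main__":
    # Placeholder payload for illustration only; replace it with the body
    # copied from the failed training request.
    example = '{"message": "placeholder"}'
    print(summarize_response(400, example))
```

The helper deliberately does not call any endpoint itself; it only formats text you have already copied, so it needs no token or cluster access.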