Rasa X - Training Failed / Upload Failed

Hi @here

Suddenly I started getting the following error when trying to train my data:

I have also tried to upload a model, without success.

I can train it locally using the latest GitHub version, but not through Rasa X.

My logs folder (/etc/rasa/logs) is empty and I can’t identify what is happening.

I stopped and started the docker containers.

Would you know what is happening? And, if it comes to that, how can I uninstall and reinstall Rasa X?

Thanks in advance

Hi @ffernandomaximo, when you try to upload the model and it fails, what does docker logs show for the Rasa worker and Rasa X containers?

Hi @b-quachtran

I got it: there was an issue with my endpoints.yml file. It happened when I tried to configure the tracker store. I've commented out the code causing the problem and restarted the Docker containers. All back to normal.

Thanks

Hello @b-quachtran, I am facing the same problem. I have run docker logs on the Rasa worker; here is the output. Any advice will really go a long way.

Starting Rasa X in production mode… :rocket:
2021-01-25 11:20:20.983316: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-01-25 11:20:20.989563: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-01-25 11:20:31.077207: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-01-25 11:20:31.077301: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2021-01-25 11:20:31.077356: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (9577c230e29c): /proc/driver/nvidia/version does not exist
Starting Rasa X in production mode… :rocket:
Traceback (most recent call last):
  File "/opt/venv/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/opt/venv/lib/python3.7/site-packages/rasa/__main__.py", line 116, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/opt/venv/lib/python3.7/site-packages/rasa/cli/x.py", line 350, in rasa_x
    run_in_production(args)
  File "/opt/venv/lib/python3.7/site-packages/rasa/cli/x.py", line 409, in run_in_production
    _rasa_service(args, endpoints, None, credentials_path)
  File "/opt/venv/lib/python3.7/site-packages/rasa/cli/x.py", line 100, in _rasa_service
    ssl_password=args.ssl_password,
  File "/opt/venv/lib/python3.7/site-packages/rasa/core/run.py", line 187, in serve_application
    conversation_id=conversation_id,
  File "/opt/venv/lib/python3.7/site-packages/rasa/core/run.py", line 116, in configure_app
    channels.channel.register(input_channels, app, route=route)
  File "/opt/venv/lib/python3.7/site-packages/rasa/core/channels/channel.py", line 92, in register
    app.blueprint(channel.blueprint(handler), url_prefix=p)
  File "/opt/venv/lib/python3.7/site-packages/rasa/core/channels/telegram.py", line 187, in blueprint
    out_channel = self.get_output_channel()
  File "/opt/venv/lib/python3.7/site-packages/rasa/core/channels/telegram.py", line 272, in get_output_channel
    channel.set_webhook(url=self.webhook_url)
  File "/opt/venv/lib/python3.7/site-packages/telebot/__init__.py", line 252, in set_webhook
    return apihelper.set_webhook(self.token, url, certificate, max_connections, allowed_updates, ip_address, timeout)
  File "/opt/venv/lib/python3.7/site-packages/telebot/apihelper.py", line 225, in set_webhook
    return _make_request(token, method_url, params=payload, files=files)
  File "/opt/venv/lib/python3.7/site-packages/telebot/apihelper.py", line 113, in _make_request
    json_result = _check_result(method_name, result)
  File "/opt/venv/lib/python3.7/site-packages/telebot/apihelper.py", line 140, in _check_result
    raise ApiTelegramException(method_name, result, result_json)
telebot.apihelper.ApiTelegramException: A request to the Telegram API was unsuccessful. Error code: 429. Description: Too Many Requests: retry after 1
2021-01-25 11:20:39.765116: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-01-25 11:20:39.765507: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-01-25 11:20:43.878468: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-01-25 11:20:43.878663: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2021-01-25 11:20:43.878901: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (9577c230e29c): /proc/driver/nvidia/version does not exist
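
A note on reading the log above: the TensorFlow lines only say that no GPU was found and are marked as safe to ignore. The actual crash is the Telegram channel's set_webhook call being rejected with error 429 ("Too Many Requests: retry after 1"). One possible cause, which is an assumption and not confirmed in this thread, is that several Rasa containers register the same Telegram webhook at startup at nearly the same time. Below is a minimal Python sketch of the general idea of waiting for the delay named in the error and trying again. The helpers parse_retry_after and call_with_retry are hypothetical names made up for illustration; they are not part of Rasa or of the telebot library, and the sketch matches on the error text shown in the log rather than on any library-specific exception attribute.

```python
import re
import time


def parse_retry_after(message, default=1.0):
    """Read the delay in seconds from text like 'Too Many Requests: retry after 1'."""
    match = re.search(r"retry after (\d+)", message)
    return float(match.group(1)) if match else default


def call_with_retry(func, max_attempts=3, sleep=time.sleep):
    """Call func(); if it raises a 429-style error, wait the suggested delay and retry.

    Hypothetical helper: rate limiting is detected from the error text seen in the log.
    Any other error, or a 429 on the last attempt, is re-raised unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            rate_limited = "Error code: 429" in str(exc)
            if not rate_limited or attempt == max_attempts:
                raise
            sleep(parse_retry_after(str(exc)))
```

In a custom connector this could wrap the webhook registration, for example call_with_retry(lambda: bot.set_webhook(url=webhook_url)), where bot and webhook_url stand in for whatever your own code uses. It does not change the built-in Telegram channel, so if the cause is several containers starting at once, restarting them one at a time may be the simpler thing to try first.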

Hi,

I am facing the same issue; the only difference is that I installed Rasa X using Kubernetes.

Please help!!