Hello.
We are trying out Rasa, and while we got everything up and running, we find that the Rasa NLU server dies a lot and is generally not stable. Core is a little better. As an example, the last error that occurred after a training job is below. Training was requested through the API, and 8 minutes later the following error showed up in the log:

2019-01-31 19:43:04-0500 [-] 2019-01-31 19:43:04 WARNING rasa_nlu.data_router - [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.defer.CancelledError'>:
2019-01-31 19:43:04-0500 [-] ]
2019-01-31 19:43:04-0500 [_GenericHTTPChannelProtocol,2773,174.89.234.171] Unhandled Error Processing Request.
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/Twisted-18.9.0-py3.5-linux-x86_64.egg/twisted/internet/defer.py", line 501, in errback
    self._startRunCallbacks(fail)
  File "/usr/local/lib/python3.5/dist-packages/Twisted-18.9.0-py3.5-linux-x86_64.egg/twisted/internet/defer.py", line 568, in _startRunCallbacks
    self._runCallbacks()
  File "/usr/local/lib/python3.5/dist-packages/Twisted-18.9.0-py3.5-linux-x86_64.egg/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python3.5/dist-packages/Twisted-18.9.0-py3.5-linux-x86_64.egg/twisted/internet/defer.py", line 1475, in gotResult
    _inlineCallbacks(r, g, status)
--- <exception caught here> ---
  File "/usr/local/lib/python3.5/dist-packages/Twisted-18.9.0-py3.5-linux-x86_64.egg/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/usr/local/lib/python3.5/dist-packages/Twisted-18.9.0-py3.5-linux-x86_64.egg/twisted/python/failure.py", line 491, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/ubuntu/.local/lib/python3.5/site-packages/rasa_nlu/server.py", line 362, in train
    RasaNLUModelConfig(model_config), model_name)
  File "/usr/local/lib/python3.5/dist-packages/Twisted-18.9.0-py3.5-linux-x86_64.egg/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/ubuntu/.local/lib/python3.5/site-packages/rasa_nlu/data_router.py", line 356, in training_errback
    failure.value.failed_target_project)
builtins.AttributeError: 'CancelledError' object has no attribute 'failed_target_project'
OS: Ubuntu 16.04, Python: 3.5, Rasa NLU: 0.13.8, Pipeline: TensorFlow
After this error, the server reported 1 training job running, and the maximum number of training jobs allowed was also 1. As a result, no further training could be started until the NLU process was killed and restarted.
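My guess at the mechanism, as a minimal sketch in plain Python: the error handler reads `failed_target_project` from the exception, a `CancelledError` has no such attribute, so the handler itself raises and the running-job counter is never released. Everything below is hypothetical and simplified for illustration (the `Router` class, `TrainingFailed`, and the counter names are my assumptions, and `CancelledError` here is only a stand-in for the Twisted class). It is not the actual rasa_nlu code.

```python
class CancelledError(Exception):
    """Stand-in for twisted.internet.defer.CancelledError (no extra attributes)."""


class TrainingFailed(Exception):
    """Hypothetical training exception that carries the affected project name."""

    def __init__(self, failed_target_project):
        super().__init__(failed_target_project)
        self.failed_target_project = failed_target_project


class Router:
    """Hypothetical, simplified job tracker; NOT the real rasa_nlu DataRouter."""

    def __init__(self, max_training_processes=1):
        self.max_training_processes = max_training_processes
        self.current_training_processes = 0

    def start_training(self):
        if self.current_training_processes >= self.max_training_processes:
            raise RuntimeError("max training processes reached")
        self.current_training_processes += 1

    def fragile_errback(self, error):
        # Assumes every error has the attribute. For CancelledError this line
        # raises AttributeError, so the decrement below never runs.
        project = error.failed_target_project
        self.current_training_processes -= 1
        return project

    def guarded_errback(self, error):
        # Tolerates errors without the attribute and always frees the slot.
        try:
            return getattr(error, "failed_target_project", None)
        finally:
            self.current_training_processes -= 1


# Fragile handler: the slot stays occupied after a cancellation.
fragile = Router()
fragile.start_training()
try:
    fragile.fragile_errback(CancelledError())
except AttributeError:
    pass

# Guarded handler: the slot is released even for a cancellation.
guarded = Router()
guarded.start_training()
guarded.guarded_errback(CancelledError())
```

In this sketch the fragile router is left with one job counted as running and rejects any new training request, which matches what we observed, while the guarded one goes back to zero. Whether the real handler behaves this way is an assumption on my part.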
Is there a more detailed logging level available, and what does the error above mean? Why did it kill the whole process?
Thank you