Rasa X Docker Compose installation issue, "Failed to update model. The previous model will stay loaded instead."

Hello Community!

So far I have deployed Rasa X on EC2 and uploaded a model that I trained on my local machine. After the upload, I checked the logs for rasa-worker. Here they are:

ubuntu@ip-xxx-xx-xx-xxx:/etc/rasa$ sudo  docker-compose logs rasa-worker
Attaching to rasa_rasa-worker_1
rasa-worker_1      | 2020-11-23 13:38:40 ERROR    rasa.core.agent  - Failed to update model. The previous model will stay loaded instead.
rasa-worker_1      | Traceback (most recent call last):
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 987, in _from_pretrained
rasa-worker_1      |     local_files_only=local_files_only,
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/transformers/file_utils.py", line 260, in cached_path
rasa-worker_1      |     local_files_only=local_files_only,
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/transformers/file_utils.py", line 362, in get_from_cache
rasa-worker_1      |     os.makedirs(cache_dir, exist_ok=True)
rasa-worker_1      |   File "/usr/local/lib/python3.7/os.py", line 213, in makedirs
rasa-worker_1      |     makedirs(head, exist_ok=exist_ok)
rasa-worker_1      |   File "/usr/local/lib/python3.7/os.py", line 213, in makedirs
rasa-worker_1      |     makedirs(head, exist_ok=exist_ok)
rasa-worker_1      |   File "/usr/local/lib/python3.7/os.py", line 223, in makedirs
rasa-worker_1      |     mkdir(name, mode)
rasa-worker_1      | PermissionError: [Errno 13] Permission denied: '/.cache'
rasa-worker_1      | 
rasa-worker_1      | During handling of the above exception, another exception occurred:
rasa-worker_1      | 
rasa-worker_1      | Traceback (most recent call last):
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 158, in _update_model_from_server
rasa-worker_1      |     _load_and_set_updated_model(agent, model_directory, new_fingerprint)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 131, in _load_and_set_updated_model
rasa-worker_1      |     interpreter = _load_interpreter(agent, nlu_path)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 90, in _load_interpreter
rasa-worker_1      |     return rasa.core.interpreter.create_interpreter(nlu_path)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/interpreter.py", line 33, in create_interpreter
rasa-worker_1      |     return RasaNLUInterpreter(model_directory=obj)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/interpreter.py", line 127, in __init__
rasa-worker_1      |     self._load_interpreter()
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/interpreter.py", line 164, in _load_interpreter
rasa-worker_1      |     self.interpreter = Interpreter.load(self.model_directory)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/nlu/model.py", line 320, in load
rasa-worker_1      |     return Interpreter.create(model_metadata, component_builder, skip_validation)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/nlu/model.py", line 347, in create
rasa-worker_1      |     component_meta, model_metadata.model_dir, model_metadata, **context
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/nlu/components.py", line 790, in load_component
rasa-worker_1      |     component_meta, model_dir, model_metadata, cached_component, **context
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/nlu/registry.py", line 178, in load_component_by_meta
rasa-worker_1      |     component_meta, model_dir, metadata, cached_component, **kwargs
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/nlu/components.py", line 476, in load
rasa-worker_1      |     return cls(meta)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/nlu/utils/hugging_face/hf_transformers.py", line 66, in __init__
rasa-worker_1      |     self._load_model_instance(skip_model_load)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/rasa/nlu/utils/hugging_face/hf_transformers.py", line 116, in _load_model_instance
rasa-worker_1      |     self.model_weights, cache_dir=self.cache_dir
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 911, in from_pretrained
rasa-worker_1      |     return cls._from_pretrained(*inputs, **kwargs)
rasa-worker_1      |   File "/opt/venv/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 1004, in _from_pretrained
rasa-worker_1      |     raise EnvironmentError(msg)
rasa-worker_1      | OSError: Couldn't reach server at '{}' to download vocabulary files.

I am using HFTransformersNLP (Hugging Face Transformers) as my language model. Please help, I am stuck.
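The path in the PermissionError may be a hint. The sketch below rests on an assumption about the cause, not on anything confirmed in this thread: a per-user cache location such as ~/.cache expands to /.cache when the process's HOME is /, and a non-root container user cannot create that directory. The helper name expand_with_home is made up for illustration.

```python
import os
import posixpath


def expand_with_home(path, home):
    """Expand a leading '~' in `path` as if HOME were set to `home`.

    Hypothetical helper for illustration only. It temporarily overrides
    HOME and restores the previous value afterwards.
    """
    previous = os.environ.get("HOME")
    os.environ["HOME"] = home
    try:
        return posixpath.expanduser(path)
    finally:
        if previous is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = previous


# With HOME set to "/", the per-user cache directory resolves to the
# filesystem root, which matches the path in the traceback.
print(expand_with_home("~/.cache", "/"))             # /.cache
print(expand_with_home("~/.cache", "/home/ubuntu"))  # /home/ubuntu/.cache
```

If this is what is happening, pointing the cache at a directory the container user can write to should avoid the error.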

Hi! A couple of questions:

What version of Rasa are you on?

What does your config file look like?

Version:

RASA_X_VERSION=0.33.2
RASA_VERSION=2.0.2
RASA_X_DEMO_VERSION=0.33.0

Config.yml:

version: "2.0"
language: en_core_web_md

pipeline:
  - name: HFTransformersNLP
    model_name: "bert"
    model_weights: "rasa/LaBSE"
    cache_dir: /tmp
  - name: "LanguageModelTokenizer"
    "intent_tokenization_flag": False
    "intent_split_symbol": "_"
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "rasa/LaBSE"
    cache_dir: /tmp
    alias: LMF
  - name: RegexFeaturizer
  - name: CountVectorsFeaturizer
    alias: CVF
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
    "use_shared_vocab": True
  - name: DIETClassifier
    batch_strategy: balanced
    intent_split_symbol: +
    intent_tokenization_flag: True
    epochs: 300
    batch_size: 50
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: ResponseSelector
    featurizers: {CVF, LMF}
    epochs: 300
    retrieval_intent: faq
  - name: ResponseSelector
    featurizers: {CVF, LMF}
    epochs: 300
    retrieval_intent: chitchat
  - name: FallbackClassifier
    threshold: 0.4
    ambiguity_threshold: 0.1


policies:
   - name: MemoizationPolicy
     max_history: 5
   - name: TEDPolicy
     max_history: 5
     epochs: 300
   - name: RulePolicy

I was able to solve this by running the following in ${RASA_HOME}:

mkdir cache
chmod -R 777 cache

Then I added this to docker-compose.yml, under x-rasa-services: &default-rasa-service:

volumes:
    - ./cache:/app/cache_dir
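To confirm that a cache directory is actually usable by the process that loads the model, the os.makedirs call that fails in the traceback can be repeated by hand. This is a minimal sketch; the function name cache_dir_is_writable is hypothetical, and it would need to run inside the container as the same user as the Rasa process.

```python
import os
import tempfile


def cache_dir_is_writable(cache_dir):
    """Return True if `cache_dir` can be created and written to.

    Mirrors the os.makedirs(cache_dir, exist_ok=True) call from the
    traceback, then tries to create a temporary file in the directory.
    """
    try:
        os.makedirs(cache_dir, exist_ok=True)
        with tempfile.NamedTemporaryFile(dir=cache_dir):
            pass  # the file is deleted again when the block exits
        return True
    except OSError:  # PermissionError is a subclass of OSError
        return False
```

For example, cache_dir_is_writable("/app/cache_dir") checks the mount target used above, and cache_dir_is_writable("/tmp") checks the cache_dir value from config.yml.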

Then I uploaded my model using the Rasa X UI, but it is not showing up. Here are the logs for rasa-production:

Attaching to rasa_rasa-production_1
Downloading: 100%|██████████| 5.22M/5.22M [00:00<00:00, 5.22MB/s]
Downloading: 100%|██████████| 112/112 [00:00<00:00, 80.2kB/s]
Downloading: 100%|██████████| 277/277 [00:00<00:00, 212kB/s]
Downloading: 100%|██████████| 654/654 [00:00<00:00, 430kB/s]
Downloading: 100%|██████████| 1.88G/1.88G [00:55<00:00, 34.1MB/s]2020-11-24 10:56:21 ERROR    rasa.core.agent  - Failed to update model. The previous model will stay loaded instead.
rasa-production_1  | Traceback (most recent call last):
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 154, in _update_model_from_server
rasa-production_1  |     model_server, agent.fingerprint, model_directory
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 226, in _pull_model_and_fingerprint
rasa-production_1  |     rasa.utils.io.unarchive(await resp.read(), model_directory)
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 973, in read
rasa-production_1  |     self._body = await self.content.read()
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 358, in read
rasa-production_1  |     block = await self.readany()
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 380, in readany
rasa-production_1  |     await self._wait('readany')
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 296, in _wait
rasa-production_1  |     await waiter
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/helpers.py", line 596, in __exit__
rasa-production_1  |     raise asyncio.TimeoutError from None
rasa-production_1  | concurrent.futures._base.TimeoutError
rasa-production_1  | 2020-11-24 11:01:32 ERROR    rasa.core.agent  - Failed to update model. The previous model will stay loaded instead.
rasa-production_1  | Traceback (most recent call last):
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 154, in _update_model_from_server
rasa-production_1  |     model_server, agent.fingerprint, model_directory
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 226, in _pull_model_and_fingerprint
rasa-production_1  |     rasa.utils.io.unarchive(await resp.read(), model_directory)
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 973, in read
rasa-production_1  |     self._body = await self.content.read()
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 358, in read
rasa-production_1  |     block = await self.readany()
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 380, in readany
rasa-production_1  |     await self._wait('readany')
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 296, in _wait
rasa-production_1  |     await waiter
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/helpers.py", line 596, in __exit__
rasa-production_1  |     raise asyncio.TimeoutError from None
rasa-production_1  | concurrent.futures._base.TimeoutError
rasa-production_1  | /opt/venv/lib/python3.7/site-packages/rasa/shared/utils/io.py:93: UserWarning: No policy ensemble or domain set. Skipping action prediction and execution.
rasa-production_1  |   More info at https://rasa.com/docs/rasa/policies
rasa-production_1  | 2020-11-24 11:06:42 ERROR    rasa.core.agent  - Failed to update model. The previous model will stay loaded instead.
rasa-production_1  | Traceback (most recent call last):
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 154, in _update_model_from_server
rasa-production_1  |     model_server, agent.fingerprint, model_directory
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/rasa/core/agent.py", line 226, in _pull_model_and_fingerprint
rasa-production_1  |     rasa.utils.io.unarchive(await resp.read(), model_directory)
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 973, in read
rasa-production_1  |     self._body = await self.content.read()
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 358, in read
rasa-production_1  |     block = await self.readany()
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 380, in readany
rasa-production_1  |     await self._wait('readany')
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/streams.py", line 296, in _wait
rasa-production_1  |     await waiter
rasa-production_1  |   File "/opt/venv/lib/python3.7/site-packages/aiohttp/helpers.py", line 596, in __exit__
rasa-production_1  |     raise asyncio.TimeoutError from None
rasa-production_1  | concurrent.futures._base.TimeoutError
rasa-production_1  | 2020-11-24 11:24:08 ERROR    pika.adapters.utils.io_services_utils  - Socket failed to connect: <socket.socket fd=22, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.18.0.8', 50926)>; error=111 (Connection refused)
rasa-production_1  | 2020-11-24 11:24:08 ERROR    pika.adapters.utils.connection_workflow  - TCP Connection attempt failed: ConnectionRefusedError(111, 'Connection refused'); dest=(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('172.18.0.2', 5672))
rasa-production_1  | 2020-11-24 11:24:08 ERROR    pika.adapters.utils.connection_workflow  - AMQPConnector - reporting failure: AMQPConnectorSocketConnectError: ConnectionRefusedError(111, 'Connection refused')
root@ip-172-31-46-58:/etc/rasa# ls models/
model_LMF_bert_v1.tar.gz
root@ip-172-31-46-58:/etc/rasa# cd models/
root@ip-172-31-46-58:/etc/rasa/models#

and:

rasa-x_1           | ERROR:pika.adapters.blocking_connection:Unexpected connection close detected: ConnectionClosedByBroker: (320) "CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'"
rasa-x_1           | ERROR:rasax.community.services.event_service:Caught an exception while consuming events. Will retry in 5 s.
rasa-x_1           | Traceback (most recent call last):
rasa-x_1           |   File "/usr/local/lib/python3.7/site-packages/rasax/community/services/event_service.py", line 1713, in continuously_consume
rasa-x_1           |     consumer.consume()
rasa-x_1           |   File "/usr/local/lib/python3.7/site-packages/rasax/community/services/event_consumers/pika_consumer.py", line 177, in consume
rasa-x_1           |     self.channel.start_consuming()
rasa-x_1           |   File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 1866, in start_consuming
rasa-x_1           |     self._process_data_events(time_limit=None)
rasa-x_1           |   File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 2027, in _process_data_events
rasa-x_1           |     self.connection.process_data_events(time_limit=time_limit)
rasa-x_1           |   File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 825, in process_data_events
rasa-x_1           |     self._flush_output(common_terminator)
rasa-x_1           |   File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 522, in _flush_output
rasa-x_1           |     raise self._closed_result.value.error
rasa-x_1           | pika.exceptions.ConnectionClosedByBroker: (320, "CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'")

A little update: the error came back.

rasa-worker_1 | PermissionError: [Errno 13] Permission denied: '/.cache'

All I did was restart docker-compose.

Can you try removing HFTransformersNLP from your pipeline? This probably won’t solve your problem, but may make it easier to debug. HFTransformersNLP has been deprecated, and you should be fine using LanguageModelFeaturizer in its place.

I understand, ma'am, but the LanguageModelTokenizer component requires HFTransformersNLP to be placed before it in the pipeline. Should I remove LanguageModelTokenizer as well?
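The ordering constraint described here can be checked mechanically. The following is a rough sketch: the REQUIREMENTS mapping only encodes the single dependency mentioned in this post, and both it and the function missing_prerequisites are illustrative names, not part of Rasa's API.

```python
def missing_prerequisites(pipeline, requirements):
    """Return (component, prerequisite) pairs where the prerequisite
    does not appear earlier in the pipeline than the component."""
    seen = set()
    problems = []
    for component in pipeline:
        name = component["name"]
        for prerequisite in requirements.get(name, ()):
            if prerequisite not in seen:
                problems.append((name, prerequisite))
        seen.add(name)
    return problems


# Assumption: only the dependency stated in this thread is encoded here.
REQUIREMENTS = {"LanguageModelTokenizer": ["HFTransformersNLP"]}

pipeline = [
    {"name": "HFTransformersNLP"},
    {"name": "LanguageModelTokenizer"},
    {"name": "LanguageModelFeaturizer"},
]

print(missing_prerequisites(pipeline, REQUIREMENTS))
# []
print(missing_prerequisites(pipeline[1:], REQUIREMENTS))
# [('LanguageModelTokenizer', 'HFTransformersNLP')]
```

Dropping HFTransformersNLP alone, as in the second call, leaves the tokenizer without its prerequisite, which is the concern raised above.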

Sorry, you’re right, I’ll edit my response to include the full name and avoid confusion!

Also, I realised that HFTransformersNLP was deprecated in 2.1.0, and you are on 2.0.2, so this should not affect you. However, you may be running into some weird bug because LanguageModelFeaturizer does not expect a model argument in 2.0.2 (this comes with the deprecation).

Are you able to train this model locally using rasa train?

Yes, I did train model locally with the same config.yml file.

However, these are the versions on my local machine, where the model was trained:

rasa==2.0.6
rasa-sdk==2.0.0
rasa-x==0.33.2
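For reference, the version numbers quoted in this thread can be compared directly. This is only a sketch with hypothetical helper names; whether a patch-level difference between the training machine and the server matters for this error is not established here.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.0.6' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def same_minor(a, b):
    """True if both versions share the same major.minor prefix."""
    return parse_version(a)[:2] == parse_version(b)[:2]


server_rasa = "2.0.2"  # RASA_VERSION on the EC2 deployment
local_rasa = "2.0.6"   # rasa version used for training locally

print(same_minor(local_rasa, server_rasa))                       # True
print(parse_version(local_rasa) > parse_version(server_rasa))    # True
print(parse_version(server_rasa) < parse_version("2.1.0"))       # True
```

So the model was trained on a newer patch release than the server runs, and both are older than 2.1.0, the release in which HFTransformersNLP was deprecated.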

Any help?

Another question for you: when you temporarily solved the problem here, did it work, or were you still getting some issues? I ask because I am trying to pinpoint whether you are having trouble connecting to S3 or whether this is down to issues with the cache folder.

Yes, it did, but it came back after I reran docker-compose. Moreover, the pika adapters issue persists.
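The pika errors in the rasa-production log are plain TCP connection refusals on port 5672, so one way to narrow them down is to test reachability of that port separately from Rasa. A minimal sketch using only the standard library follows; can_connect is a hypothetical helper, and the address to probe would be whatever the broker container resolves to in your setup.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, DNS failures
        return False
```

Run from inside the rasa-production container, can_connect("172.18.0.2", 5672) probes the destination shown in the log above. A False result while the broker container is still starting or restarting would be consistent with the "Connection refused" messages.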

Could you update your Rasa and Rasa X versions? It’s possible this will be fixed with a change we made to pika in 2.1.

Sure, I’ll do that.