Rasa deployment - Language Model Featurizer download gets stuck

paolo_1st · June 28, 2021, 11:50am

Hi,

Rasa X is deployed on the server, but according to the logs of the crashed pods, they hang while downloading a 1.88 GB transformers file and keep restarting the download. I tried adding the rasa[spacy], spaCy and transformers packages to the Dockerfile, without success. Please help.

[root@localhost ~]# kubectl get pods --namespace rasa
NAME                                   READY   STATUS             RESTARTS   AGE
rasa-nginx-55d4f96567-d6t4t            1/1     Running            0          64m
rasa-redis-master-0                    1/1     Running            0          64m
rasa-postgresql-0                      1/1     Running            0          64m
rasa-rasa-x-648fd4dd96-svgbm           1/1     Running            0          64m
rasa-rabbit-0                          1/1     Running            0          64m
rasa-db-migration-service-0            1/1     Running            1          64m
rasa-app-889bfd8bf-8tmtg               1/1     Running            0          64m
rasa-event-service-7d57fc658b-5mm2f    1/1     Running            0          64m
rasa-rasa-production-96775d488-9r76v   0/1     CrashLoopBackOff   14         64m
rasa-rasa-worker-5cdfbb8c85-vh9lw      0/1     CrashLoopBackOff   14         64m
[root@localhost ~]# kubectl logs --namespace=rasa rasa-rasa-worker-5cdfbb8c85-vh9lw
Downloading: 100%|██████████| 5.22M/5.22M [00:01<00:00, 4.40MB/s]
Downloading: 100%|██████████| 112/112 [00:00<00:00, 84.7kB/s]
Downloading: 100%|██████████| 277/277 [00:00<00:00, 178kB/s]
Downloading: 100%|██████████| 654/654 [00:00<00:00, 448kB/s]
Downloading:  88%|████████▊ | 1.66G/1.88G [01:55<00:15, 14.0MB/s]
[root@localhost ~]# kubectl logs --namespace=rasa rasa-rasa-production-96775d488-9r76v
Downloading: 100%|██████████| 5.22M/5.22M [00:01<00:00, 4.40MB/s]
Downloading: 100%|██████████| 112/112 [00:00<00:00, 72.3kB/s]
Downloading: 100%|██████████| 277/277 [00:00<00:00, 141kB/s]
Downloading: 100%|██████████| 654/654 [00:00<00:00, 439kB/s]
Downloading:  77%|███████▋  | 1.44G/1.88G [01:50<00:29, 15.1MB/s]
[root@localhost ~]#

fkoerner · July 2, 2021, 1:48pm

Could you share your config? It looks like the LanguageModelFeaturizer download is getting stuck. You should be able to place the file in a cache_dir, which can be specified as a parameter to LanguageModelFeaturizer. Note that you will need to name it appropriately for it to be recognized as the correct cache file.

paolo_1st · July 5, 2021, 5:59am

Hi Felicia,

First of all, thanks for your support.

Yes, it's the Language Model Featurizer. As you can see below, we had to disable it to get the helm setup to work. What can we do to stop the download from getting stuck?

language: en_core_web_md
pipeline:
  - name: SpacyNLP
    model: "en_core_web_md"
  - name: SpacyTokenizer
  # - name: LanguageModelFeaturizer
  #   model_name: "bert"
  #   model_weights: "rasa/LaBSE"
  #   cache_dir: ./.cache
  - name: SpacyFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: RegexFeaturizer
    case_sensitive: false
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 141
    model_confidence: linear_norm
    loss_type: cross_entropy
    constrain_similarities: true
    number_of_transformer_layers: 2
    number_of_attention_heads: 4
    batch_size:
      - 64
      - 128
    evaluate_on_number_of_examples: 250
    evaluate_every_number_of_epochs: 5
    regularization_constant: 0.002
    random_seed: 1
    tensorboard_log_directory: ./.tensorboard
    tensorboard_log_level: epoch
  - name: RegexEntityExtractor
    case_sensitive: false
    use_lookup_tables: true
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 26
    model_confidence: linear_norm
    loss_type: cross_entropy
    constrain_similarities: true
    regularization_constant: 0.002
    random_seed: 1
    batch_size:
      - 64
      - 128
    evaluate_on_number_of_examples: 5
    evaluate_every_number_of_epochs: 1
    tensorboard_log_directory: ./.tensorboard
    tensorboard_log_level: epoch
  - name: FallbackClassifier
    threshold: 0.2
    ambiguity_threshold: 0.05

policies:
  - name: AugmentedMemoizationPolicy
    max_history: 8
  - name: TEDPolicy
    max_history: 8
    epochs: 41
    model_confidence: linear_norm
    loss_type: cross_entropy
    constrain_similarities: true
    regularization_constant: 0.002
    random_seed: 1
    batch_size:
      - 64
      - 128
    evaluate_on_number_of_examples: 200
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: ./.tensorboard
    tensorboard_log_level: epoch
  - name: RulePolicy
    core_fallback_threshold: 0.4
    core_fallback_action_name: "action_default_fallback"
    enable_fallback_prediction: true
    restrict_rules: true
    check_for_contradictions: true

fkoerner · July 5, 2021, 6:52am

Hi Paul, you can download the necessary files ahead of time and then transfer them to the server. You should need:

vocab.txt
special_tokens_map.json
tokenizer_config.json
config.json
tf_model.h5 (this one is the largest, and probably where your download gets stuck)

The easiest way to do this, and to ensure the files are named the way Hugging Face transformers expects, is to run a script like the one below on a computer with a good internet connection. The files you need will be in the CACHE_DIR, and you can then transfer them to the server.

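# Pre-download the tokenizer and model files so they land in CACHE_DIR
# under the hashed names that Hugging Face transformers looks for.
from rasa.nlu.utils.hugging_face.registry import model_tokenizer_dict, model_class_dict

MODEL_WEIGHTS = "rasa/LaBSE"  # same value as model_weights in the pipeline config
CACHE_DIR = "./cache"         # folder the downloaded files are written to
MODEL_NAME = "bert"           # same value as model_name in the pipeline config

# Tokenizer files (vocabulary and tokenizer configuration).
model_tokenizer_dict[MODEL_NAME].from_pretrained(MODEL_WEIGHTS, cache_dir=CACHE_DIR)
# Model configuration and weights (the large tf_model.h5 file).
model_class_dict[MODEL_NAME].from_pretrained(MODEL_WEIGHTS, cache_dir=CACHE_DIR)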

paolo_1st · July 5, 2021, 8:34am

@fkoerner OK, thanks, I will try that, but I have the following questions regarding the quick-install deployment:

1)- Does the above cache folder already exist on the Rasa server, or do I need to create it (and if so, in the same folder as the Rasa code)?
2)- On each new deployment, I need to delete the rasa namespace, which recreates all pods, including the PostgreSQL pod (rasa-postgresql-0) that stores the conversations. How can I save conversations to an external database on the host of the Rasa server (not in a pod)? If that is not possible, how can I exclude the PostgreSQL pod from reinstallation (so I don't lose the existing conversations) and make this pod accessible from another server?

paolo_1st · July 5, 2021, 12:16pm

Regarding the language model featurizer, I followed your instructions without success:

Before your instructions, I had already copied the content of the cache folder from the local Rasa machine to the server (no success).
Today, I created the Python script described above on the Rasa server, and it downloaded the same files as on the local Rasa machine earlier:

[root@localhost fyp-chatbot]# cd cache
[root@localhost cache]# ls -l
total 1844948
-rw-r--r--. 1 root root        277 Jul 5 09:36 553c41c20e6ffffe3b7c73480347e391fb7f4d59493a65c12d04b445170a3bf9.6cfa3b45eeb6a7ae9e91c7a6b07a42b8fa10bab3c158b3242a07e871849d777c
-rw-r--r--. 1 root root        135 Jul 5 09:36 553c41c20e6ffffe3b7c73480347e391fb7f4d59493a65c12d04b445170a3bf9.6cfa3b45eeb6a7ae9e91c7a6b07a42b8fa10bab3c158b3242a07e871849d777c.json
-rwxr-xr-x. 1 root root          0 Jul 5 09:36 553c41c20e6ffffe3b7c73480347e391fb7f4d59493a65c12d04b445170a3bf9.6cfa3b45eeb6a7ae9e91c7a6b07a42b8fa10bab3c158b3242a07e871849d777c.lock
-rw-r--r--. 1 root root    5220781 Jul 5 09:36 7920887359c7c90d8bdbcdfd6c8efa43798ece594159b6af9d69c37a7631c645.8f2ffe7514c779e620b40da312123fd8536e25273a5873d73b975930ff3f3def
-rw-r--r--. 1 root root        123 Jul 5 09:36 7920887359c7c90d8bdbcdfd6c8efa43798ece594159b6af9d69c37a7631c645.8f2ffe7514c779e620b40da312123fd8536e25273a5873d73b975930ff3f3def.json
-rwxr-xr-x. 1 root root          0 Jul 5 09:36 7920887359c7c90d8bdbcdfd6c8efa43798ece594159b6af9d69c37a7631c645.8f2ffe7514c779e620b40da312123fd8536e25273a5873d73b975930ff3f3def.lock
-rw-r--r--. 1 root root        112 Jul 5 09:36 a4c3109a4dda2b184cfc58e2f8770902c8892850b9a1d1ecfa3b975865ef2849.dd8bd9bfd3664b530ea4e645105f557769387b3da9f79bdb55ed556bdd80611d
-rw-r--r--. 1 root root        137 Jul 5 09:36 a4c3109a4dda2b184cfc58e2f8770902c8892850b9a1d1ecfa3b975865ef2849.dd8bd9bfd3664b530ea4e645105f557769387b3da9f79bdb55ed556bdd80611d.json
-rwxr-xr-x. 1 root root          0 Jul 5 09:36 a4c3109a4dda2b184cfc58e2f8770902c8892850b9a1d1ecfa3b975865ef2849.dd8bd9bfd3664b530ea4e645105f557769387b3da9f79bdb55ed556bdd80611d.lock
-rw-r--r--. 1 root root 1883969304 Jul 5 09:38 b60f119db02852a2a5fb7bee76b15b5937e623ef9b243df1c6de8c7cb976dd94.f0e007221889d8582aa9b987b273083365e918107c3546a4f7db5969110c7f27.h5
-rw-r--r--. 1 root root        149 Jul 5 09:38 b60f119db02852a2a5fb7bee76b15b5937e623ef9b243df1c6de8c7cb976dd94.f0e007221889d8582aa9b987b273083365e918107c3546a4f7db5969110c7f27.h5.json
-rwxr-xr-x. 1 root root          0 Jul 5 09:36 b60f119db02852a2a5fb7bee76b15b5937e623ef9b243df1c6de8c7cb976dd94.f0e007221889d8582aa9b987b273083365e918107c3546a4f7db5969110c7f27.h5.lock
-rw-r--r--. 1 root root        654 Jul 5 09:36 df5b6caa65edee28920aa5624d51d8fc9951f0caa99b1de6ae73561b78cf1085.98ce2c5338a6edc7cead8cfe5c694c511723714cf92cb6b45782d869e645c6f7
-rw-r--r--. 1 root root        125 Jul 5 09:36 df5b6caa65edee28920aa5624d51d8fc9951f0caa99b1de6ae73561b78cf1085.98ce2c5338a6edc7cead8cfe5c694c511723714cf92cb6b45782d869e645c6f7.json
-rwxr-xr-x. 1 root root          0 Jul 5 09:36 df5b6caa65edee28920aa5624d51d8fc9951f0caa99b1de6ae73561b78cf1085.98ce2c5338a6edc7cead8cfe5c694c511723714cf92cb6b45782d869e645c6f7.lock

paolo_1st · July 5, 2021, 12:30pm

Regarding the Rasa deployment: to avoid losing the existing conversations on each deployment (when the postgres pod is recreated), suppose I do the following:

1)- Disable the installation of the PostgreSQL pod in values.yml: global: postgresql: install: "false"

2)- I have an existing database on the local Rasa server containing the events table (Rasa X) and other custom tables. How can I force the deployment to use this database on the local Rasa machine (a different machine)? I would appreciate a sample config.

paolo_1st · July 13, 2021, 8:47am

please can anyone help me?

fkoerner · July 16, 2021, 12:57pm

@paolo_1st please excuse the delay in my response – I was out.

To clarify: it seems like you are asking two different questions, one about the language model featurizer files and one about postgresql?

What issues did you have with the downloaded language model featurizer files? Did you get an error message or similar? Did you create a cache folder as specified by the parameter CACHE_DIR and move the files in there?

Could you explain the postgresql issue further? Are you asking how to use the database of a local server?

fkoerner · July 16, 2021, 1:17pm

Also, in the future it is best to create a new topic for an unrelated question! This helps keep things organised and makes it easier for us to route the question to the correct person.

paolo_1st · July 16, 2021, 5:15pm

@fkoerner Don't worry. I hadn't noticed that these were two different topics. Next time, I will separate them.

1)- Regarding the language model, I already tried two approaches without success:

Using Git, I included the cache directory in the pull/push from the local Rasa machine to the server. Although the cache directory was present on the Rasa server, the download still started and kept hanging.
I faced the same problem when I downloaded the files on the local Rasa machine using the script you provided and moved them to the cache dir on the Rasa server.

2)- Regarding postgresql: using the helm chart and the values file below, I don't want to create the postgresql pod (install: false, existinghost…), and I want to always use the same password (for the postgres user, as set below), but a random password is generated every time. Note that I installed a postgresql server on the host that runs the Rasa server, to act as the Rasa database (so I don't need to import/export old conversations on each deployment).

rasa: tag: "2.7.1-full" additionalChannelCredentials: rest: rasax: tag: "0.41.1" initialUser: password: "IP@eco@2021" global: postgresql: postgresqlPassword: "a3cf4g8mh88n14ewdf4gfdd" networkPolicy: enabled: false allowExternal: true debugMode: false

desmarchris · July 22, 2021, 2:02pm

hey @paolo_1st,

For 2: can you try with install: false rather than enabled: false? See values.yaml in the RasaHQ/rasa-x-helm repository on GitHub.

Also, I can’t tell from the formatting above but I think you need to move your postgres definition. There are two spots you need to configure: at the top level where you set the existing host and install to false, and inside global where you define the credentials.
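The two spots described above can be sketched as a small check. This is only an illustration: the values file is mirrored as a Python dict, the key names `postgresql.install`, `postgresql.existingHost` and `global.postgresql.postgresqlPassword` are assumptions taken from this thread and should be confirmed against the chart's values.yaml, and the host name and password are hypothetical placeholders.

```python
def missing_external_postgres_settings(values):
    """Return a list of problems for an external-database setup.

    `values` mirrors the helm values file as nested dicts. The key names
    are assumptions based on this thread, not a verified chart schema.
    """
    problems = []
    top_level = values.get("postgresql") or {}
    credentials = (values.get("global") or {}).get("postgresql") or {}

    # A quoted "false" is a non-empty string, not the boolean false,
    # so it is flagged here as well.
    if top_level.get("install") is not False:
        problems.append("postgresql.install should be the boolean false")
    if not top_level.get("existingHost"):
        problems.append("postgresql.existingHost is not set")
    if not credentials.get("postgresqlPassword"):
        problems.append("global.postgresql.postgresqlPassword is not set")
    return problems


if __name__ == "__main__":
    example = {
        # Spot 1: top level, disables the bundled pod and names the host.
        "postgresql": {"install": False, "existingHost": "db.example.internal"},
        # Spot 2: inside global, holds the credentials.
        "global": {"postgresql": {"postgresqlPassword": "change-me"}},
    }
    print(missing_external_postgres_settings(example))
```

One detail worth checking in the values file quoted earlier in the thread: `install: "false"` with quotes is a string, and a non-empty string is treated as true in Go templates, which Helm uses, so an unquoted `false` is the safer form.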

fkoerner · July 22, 2021, 3:18pm

@paolo_1st for 1: Did you change the names of the files? Did you specify the name of the directory in your config?

paolo_1st · July 23, 2021, 5:50am

@fkoerner thanks for your assistance.

1)- Whether I download the language model featurizer files on the local Rasa machine or use the script you provided, I get the same result (files with random-looking names, as you can see below). I don't know which of these corresponds to which file in your list above, so I can't rename them as you requested:

waf@waf-AHV:~/Desktop/fyp-chatbot/.cache$ ls -l
total 1844952
-rwxrwxrwx 1 waf waf    5220781 Jun 30 15:32 21aa38329c730774d9f45df9ec5443a9bd4abd2191e1d510c27647c151c5437f.f2539f82b1008971c6ea6574f078d95c6eead57223fc74fdc420013fa9de391a
-rwxrwxrwx 1 waf waf        131 Jun 30 15:32 21aa38329c730774d9f45df9ec5443a9bd4abd2191e1d510c27647c151c5437f.f2539f82b1008971c6ea6574f078d95c6eead57223fc74fdc420013fa9de391a.json
-rwxrwxrwx 1 waf waf          0 Jun 30 15:30 21aa38329c730774d9f45df9ec5443a9bd4abd2191e1d510c27647c151c5437f.f2539f82b1008971c6ea6574f078d95c6eead57223fc74fdc420013fa9de391a.lock
-rwxrwxrwx 1 waf waf        277 Jun 30 15:32 527f618330e845c9d31826e7d9ce983aa816fafcf4f29f8c52f8ae1fdd097219.1c61d5d3dc67d88e0c74c64cda9b17bc30bdbd1c373cceeb740b9953729709aa
-rwxrwxrwx 1 waf waf        143 Jun 30 15:32 527f618330e845c9d31826e7d9ce983aa816fafcf4f29f8c52f8ae1fdd097219.1c61d5d3dc67d88e0c74c64cda9b17bc30bdbd1c373cceeb740b9953729709aa.json
-rwxrwxrwx 1 waf waf          0 Jun 30 15:32 527f618330e845c9d31826e7d9ce983aa816fafcf4f29f8c52f8ae1fdd097219.1c61d5d3dc67d88e0c74c64cda9b17bc30bdbd1c373cceeb740b9953729709aa.lock
-rwxrwxrwx 1 waf waf        654 Jun 30 15:32 90984a8da5021905af8679644b61bc5428ef16e9a307469152c163ec873db240.f1ba7080a92fc164a144311742f36dfb6a724bc9da532264b30d87040e15cc9d
-rwxrwxrwx 1 waf waf        133 Jun 30 15:32 90984a8da5021905af8679644b61bc5428ef16e9a307469152c163ec873db240.f1ba7080a92fc164a144311742f36dfb6a724bc9da532264b30d87040e15cc9d.json
-rwxrwxrwx 1 waf waf          0 Jun 30 15:32 90984a8da5021905af8679644b61bc5428ef16e9a307469152c163ec873db240.f1ba7080a92fc164a144311742f36dfb6a724bc9da532264b30d87040e15cc9d.lock
-rwxrwxrwx 1 waf waf        112 Jun 30 15:32 99497d78492c90ab7d824d695b9a8d043369fbc2bf1112dcc7cdef9a6c4fa691.275045728fbf41c11d3dae08b8742c054377e18d92cc7b72b6351152a99b64e4
-rwxrwxrwx 1 waf waf        145 Jun 30 15:32 99497d78492c90ab7d824d695b9a8d043369fbc2bf1112dcc7cdef9a6c4fa691.275045728fbf41c11d3dae08b8742c054377e18d92cc7b72b6351152a99b64e4.json
-rwxrwxrwx 1 waf waf          0 Jun 30 15:32 99497d78492c90ab7d824d695b9a8d043369fbc2bf1112dcc7cdef9a6c4fa691.275045728fbf41c11d3dae08b8742c054377e18d92cc7b72b6351152a99b64e4.lock
-rwxrwxrwx 1 waf waf 1883969304 Jun 30 17:15 fd2ff7409cd4abbce31d54b8acebc305939787751dd697b6f38a3bf1f197a614.2589e15ea34b96d9bdcc478748ae77b629487da363566089fe6a8cdb1e6ea284.h5
-rwxrwxrwx 1 waf waf        108 Jun 30 17:15 fd2ff7409cd4abbce31d54b8acebc305939787751dd697b6f38a3bf1f197a614.2589e15ea34b96d9bdcc478748ae77b629487da363566089fe6a8cdb1e6ea284.h5.json
-rwxrwxrwx 1 waf waf          0 Jun 30 15:32 fd2ff7409cd4abbce31d54b8acebc305939787751dd697b6f38a3bf1f197a614.2589e15ea34b96d9bdcc478748ae77b629487da363566089fe6a8cdb1e6ea284.h5.lock

2)- regarding postgresql, the purpose is to prevent Kubernetes from installing postgresql pod, and use my current existing database (postgresql on the same server).

fkoerner · July 23, 2021, 11:48am

You shouldn’t rename those files – they are not random but hashes that transformers parses. Don’t rename them, just move them into the cache folder. Were they all downloaded successfully locally?
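To see which hashed file corresponds to which download without renaming anything, the small `.json` file next to each data file can be read. The sketch below uses only the standard library. It assumes that each sidecar holds a JSON object with a `url` key, which matches the cache layout shown in the listings in this thread but may differ in other transformers versions; `describe_cache` is a hypothetical helper name.

```python
import json
from pathlib import Path


def describe_cache(cache_dir):
    """Map each hashed cache file to the URL recorded in its .json sidecar.

    Assumption: the sidecar is a JSON object with a "url" key.
    """
    mapping = {}
    root = Path(cache_dir)
    if not root.is_dir():
        return mapping
    for meta_path in sorted(root.glob("*.json")):
        data_path = meta_path.with_suffix("")  # drop the trailing ".json"
        if not data_path.is_file():
            continue  # sidecar without its data file
        try:
            meta = json.loads(meta_path.read_text())
        except (OSError, ValueError):
            continue  # unreadable or malformed sidecar
        if isinstance(meta, dict):
            mapping[data_path.name] = meta.get("url", "<no url recorded>")
    return mapping


if __name__ == "__main__":
    # "./cache" is the folder used by the download script in this thread.
    for name, url in describe_cache("./cache").items():
        print(f"{name}\n    {url}")
```

The recorded URL normally ends in the original file name (for example `tf_model.h5`), which is enough to confirm that all five expected files are present.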

paolo_1st · July 23, 2021, 11:58am

Yes, I created a cache folder under the root folder of the Rasa server (same structure as the local Rasa app) and moved the downloaded files there. It seems these files are being ignored during the deployment. How can I force the quick-install deployment to use the cache folder?

fkoerner · July 23, 2021, 12:02pm

And you’ve added the cache folder to your config? Can I see how you’ve done so?

paolo_1st · July 23, 2021, 1:19pm

Kindly find attached a sample of the LMF config. Do I also need to copy the cache folder in the Dockerfile, using the command COPY ./cache /app/cache?

fkoerner · July 30, 2021, 10:17am

Hi, you need:

the files with their original names (that is, the hash-like cache names) in a folder on the server
the cache_dir argument pointing to this folder

If you have this, is it possible that the cache folder isn't being picked up? Could you set logging to info? You should be able to see messages from transformers about whether the cached files were found (either loading file {} from cache at... or loading file {} from ...).

Have you tried this locally? Does it only fail during deployment?
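One way to check whether the folder can be picked up at all is to run a small script inside the container that loads the model, from the same working directory. The sketch below uses only the standard library; `inspect_cache_dir` is a hypothetical helper, and it assumes that a relative cache_dir is resolved against the working directory of the process. It reports where the path resolves to and whether any `.h5` weight file is there.

```python
from pathlib import Path


def inspect_cache_dir(cache_dir, working_dir=None):
    """Report where cache_dir resolves to and which .h5 files it contains.

    Assumption: a relative cache_dir is interpreted relative to the
    working directory of the process that loads the model.
    """
    base = Path(working_dir) if working_dir is not None else Path.cwd()
    resolved = (base / cache_dir).resolve()
    exists = resolved.is_dir()
    weights = sorted(p.name for p in resolved.glob("*.h5")) if exists else []
    return {
        "resolved_path": str(resolved),
        "exists": exists,
        "weight_files": weights,
    }


if __name__ == "__main__":
    # Both spellings appear in this thread: the download script writes to
    # "./cache", while the pipeline config points at "./.cache".
    for candidate in ("./cache", "./.cache"):
        print(candidate, inspect_cache_dir(candidate))
```

If the path named in the config does not exist inside the container, or holds no weight file, the download would be expected to start again regardless of what was copied to the host.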

fkoerner · July 30, 2021, 10:21am

How did you copy the cache folder to the docker container?
