The deployment of Rasa X on the server is ready, but according to the logs of the crashed pods, it hangs on the download of a 1.88 GB transformers file (the download keeps restarting). I tried adding the download of the rasa[spacy], spaCy and transformers packages to the Dockerfile, without success. Please help.
Could you share your config? This looks like a LanguageModelFeaturizer download getting stuck. You should be able to place the file in a cache_dir, which can be specified as a parameter to LanguageModelFeaturizer. Note that you'll need to name it appropriately for it to be recognized as the correct cache file.
Yes, it's the LanguageModelFeaturizer. As you can see below, we had to disable it in order to let the helm setup work. What can we do to get rid of the stuck download?
Hi Paul, you can download the necessary files ahead of time and then transfer them to the server. You should need:
vocab.txt
special_tokens_map.json
tokenizer_config.json
config.json
tf_model.h5 (this one is the largest, and probably where your download gets stuck)
The easiest way to do this, and to ensure the files are named correctly according to Hugging Face transformers, is to run a script like the one below on a computer with a good internet connection. The files you need will be in the CACHE_DIR, and you can then transfer them to the server.
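A minimal sketch of such a download script, assuming the transformers Auto classes are available in your installed version. MODEL_NAME and the lm_cache folder name are placeholders rather than values from this thread; replace MODEL_NAME with the weights your LanguageModelFeaturizer is configured to use.

```python
from pathlib import Path

# Hypothetical values: adjust both to your own setup.
CACHE_DIR = Path("lm_cache")
MODEL_NAME = "bert-base-uncased"  # assumption: use the weights named in your config


def download_to_cache(model_name: str = MODEL_NAME, cache_dir: Path = CACHE_DIR) -> list:
    """Download tokenizer files and TensorFlow weights into cache_dir.

    Returns the sorted names of the files that ended up in the cache folder.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoTokenizer, TFAutoModel

    cache_dir.mkdir(parents=True, exist_ok=True)
    # Fetches the vocabulary and tokenizer configuration files.
    AutoTokenizer.from_pretrained(model_name, cache_dir=str(cache_dir))
    # Fetches config.json and the large tf_model.h5 weights file.
    TFAutoModel.from_pretrained(model_name, cache_dir=str(cache_dir))
    return sorted(p.name for p in cache_dir.iterdir())
```

Call download_to_cache() on the machine with the good connection, then copy the whole folder to the server unchanged. It is probably safest to run this with the same transformers version the server uses, since the cache layout has changed between releases.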
@fkoerner OK, thanks, I will try that, but I have the following questions regarding the quick install deployment:
Does the above cache folder exist on the Rasa server, or do I need to create it (and if so, in the same folder as the Rasa code)?
On each new deployment, I need to delete the rasa namespace, which recreates all pods, including the PostgreSQL pod (rasa-postgresql-0) that stores the conversations. How can I save conversations to an external database on the host of the Rasa server (not a pod)? If that is not possible, how can I exclude the PostgreSQL pod from reinstallation (so I don't lose the existing conversations), and make this pod accessible from another server?
Regarding the Rasa deployment: in order not to lose the existing conversations on each deployment (PostgreSQL pod recreation), what if I did the following:
1)- Disable the installation of the PostgreSQL pod in values.yml:
global:
  postgresql:
    install: "false"
2)- I have an existing database on the local Rasa server containing the events table (Rasa X) and other custom tables. How can I force the deployment to use this DB on the local Rasa server (a different machine)? I would appreciate it if you could provide a config sample.
@paolo_1st please excuse the delay in my response – I was out.
To clarify: it seems like you are asking two different questions, one about the language model featurizer files and one about postgresql?
What issues did you have with the downloaded language model featurizer files? Did you get an error message or similar? Did you create a cache folder as specified by the parameter CACHE_DIR and move the files in there?
Could you explain the postgresql issue further? Are you asking how to use the database of a local server?
Also, in the future it is best to create a new topic for an unrelated question! This helps keep things organised and makes it easier for us to route the question to the correct person.
@fkoerner don't worry. I didn't notice that these were two different topics. Next time, I will separate them.
1)- Regarding the language model, I have already tried two approaches without success:
Using Git, I included the cache directory in the pull/push from the local Rasa machine to the server. Although the cache directory was there on the Rasa server, it started the download and kept hanging.
I faced the same problem when I downloaded the files on the local Rasa machine using the script you provided and moved them to the cache dir on the Rasa server.
2)- Regarding PostgreSQL: using the helm chart and the values file below, I don't want to create a PostgreSQL pod (install: false, existinghost…), and I want to always use the same password (for the postgres user, as I did below), but a random password is always generated. Note that I installed a PostgreSQL server on the host that runs the Rasa server, to serve as Rasa's database (so I don't need to import/export old conversations on each deployment).
Also, I can’t tell from the formatting above but I think you need to move your postgres definition. There are two spots you need to configure: at the top level where you set the existing host and install to false, and inside global where you define the credentials.
1)- Whether I download the LanguageModelFeaturizer files on the local Rasa machine or use the script you provided, I get the same result (random files, as you can see below). I don't know which file corresponds to which of the files listed above (to rename them as you requested):
2)- Regarding PostgreSQL, the purpose is to prevent Kubernetes from installing the PostgreSQL pod, and to use my existing database instead (PostgreSQL on the same server).
You shouldn’t rename those files – they are not random but hashes that transformers parses. Don’t rename them, just move them into the cache folder. Were they all downloaded successfully locally?
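To see which hashed file corresponds to which download without renaming anything, a sketch like the following may help. It assumes the older transformers cache layout, in which each cached file has a sidecar file with the same name plus ".json" that records the source URL; newer releases organise the cache differently, so treat this as an illustration rather than a guaranteed format. The function name describe_cache is made up for this example.

```python
import json
from pathlib import Path


def describe_cache(cache_dir):
    """Return {cached file name: source URL} based on '<name>.json' sidecar files.

    Assumes the older transformers cache layout; files are only read, never renamed.
    """
    mapping = {}
    for meta_path in sorted(Path(cache_dir).glob("*.json")):
        data_path = meta_path.with_suffix("")  # drop the trailing '.json'
        if not data_path.exists():
            continue  # metadata without a data file, e.g. an interrupted download
        try:
            meta = json.loads(meta_path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # not a metadata file
        if isinstance(meta, dict) and "url" in meta:
            mapping[data_path.name] = meta["url"]
    return mapping
```

Running describe_cache on both the local folder and the server folder, and comparing the results, is one way to check that every file arrived intact and under its original name.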
Yes, I created a cache folder under the root folder of the Rasa server (same structure as the local Rasa app), and moved the downloaded files there. It seems these files are being ignored during the deployment. How can I force the deployment to use the cache folder during the quick install deployment?
You need two things:
the files with their original names (so the hash-like cache names) in a folder on the server
the cache_dir argument pointing to this folder
If you have this, is it possible that the cache folder isn't being picked up? Could you set logging to info? You should be able to see messages from transformers about whether the cached files were found (either loading file {} from cache at... or loading file {} from ...).
Have you tried this locally? Does it only fail during deployment?
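For a local check outside of Rasa, the logging suggestion above can be sketched with the standard logging module. This assumes the transformers modules log under the "transformers" logger name and that the Auto classes exist in your installed version; the helper names are made up for this example, and how Rasa itself configures library log levels during deployment may differ.

```python
import logging


def enable_transformers_info_logging():
    """Show INFO messages from transformers, including cache hit/miss lines."""
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("transformers")
    logger.setLevel(logging.INFO)
    return logger


def load_from_cache(model_name, cache_dir):
    """Load tokenizer and weights, looking in cache_dir before downloading."""
    # Imported lazily so the logging helper works without transformers installed.
    from transformers import AutoTokenizer, TFAutoModel

    tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=cache_dir)
    model = TFAutoModel.from_pretrained(model_name, cache_dir=cache_dir)
    return tokenizer, model
```

Call enable_transformers_info_logging() first, then load_from_cache with your model name and cache folder. If the log lines mention the cache folder, the files were found; if a download starts instead, the folder or the file names are not what transformers expects.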