Interactive learning not working (disk space issue?)

Hello,

I built and trained a simple Rasa bot on my PC, with just intent recognition, no entities, no custom action server, and a simple utter action for each intent. Everything worked perfectly in the shell, so I tried to set up Rasa X on an Ubuntu server with 150 GB of disk space. I installed it manually with Docker, and everything was fine except that I had to change the Rasa X port from 80 to 8080. I was able to connect without any problem. I synced a Bitbucket repo with my model, and it worked. I even tried the model trained in the Rasa X interface in the shell and it works.

However, when I went to the “Talk to your bot” tab, I hit a bug: whatever I wrote, it would identify the content of my message as the intent (if I typed erognbdk it would identify erognbdk), and it would not answer anything (just three dots).

Later, I wasn’t able to connect to Rasa X (Request failed with status code 500), and I found out that Rasa X had used up all the server disk space. I increased the disk from 150 to 200 GB and was able to connect again, but now Rasa X is once more taking up all the space. I should mention that my project is quite small, except maybe that I’m using the spaCy fr_core_news_md model and not sm.

In the db logs I have this:

mktemp: failed to create file via template ‘/tmp/tmp.XXXXXXXXXX’: No space left on device
mktemp: failed to create file via template ‘/tmp/tmp.XXXXXXXXXX’: No space left on device
/opt/bitnami/scripts/libpostgresql.sh: line 49: POSTGRESQL_DATA_DIR: unbound variable
/opt/bitnami/scripts/libpostgresql.sh: line 103: cannot create temp file for here-document: No space left on device
/opt/bitnami/scripts/libpostgresql.sh: line 164: cannot create temp file for here-document: No space left on device
/opt/bitnami/scripts/libpostgresql.sh: line 177: cannot create temp file for here-document: No space left on device
/opt/bitnami/scripts/libpostgresql.sh: line 186: cannot create temp file for here-document: No space left on device
/opt/bitnami/scripts/libpostgresql.sh: line 195: cannot create temp file for here-document: No space left on device
/opt/bitnami/scripts/libpostgresql.sh: line 200: cannot create temp file for here-document: No space left on device

and similar “No space left on device” errors in several containers.
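
For reference, these are standard commands (nothing Rasa-specific) of the kind I used to confirm the disk was full; /var/lib/docker is the default Docker data directory and may differ on your setup:

df -h                          # overall disk usage per filesystem
docker system df               # space used by Docker images, containers and volumes
sudo du -sh /var/lib/docker/*  # which Docker directories are the largest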

I was using Rasa X 0.29.0, then tried to update it to 0.29.3; the update didn’t seem to finish, but it now says 0.29.3. Both bugs appeared before and after the update. The Rasa version is 1.10.0 and the SDK says “latest”.

I have two questions:

  1. Are both problems linked (i.e. if I buy more disk space, will interactive learning work)?
  2. Is it normal for Rasa X to take up so much space? The requirements say 100 GB.

Thank you for your help!

Short answer: no, Rasa X should not take that much space. Be careful if you uploaded a model yourself because training didn’t work; there seems to be some never-ending process that fills up all the space.

(Not so) long answer: the problem is somewhat solved. The space was getting filled up because I had uploaded a model instead of training one. I had done this to work around the missing spaCy model in Rasa X, but it turns out the spaCy model is also needed to use a model, not just to train one. Therefore, after recreating the Docker containers, I went inside the rasa-worker and rasa-production containers to manually add the spaCy model:

# open a root shell in the worker container
docker exec -u root -it rasa_rasa-worker_1 /bin/bash
/app# python -m spacy download fr_core_news_md   # download the French model
/app# python -m spacy link fr_core_news_md fr    # register it under the "fr" shortcut
/app# exit
# same thing in the production container
docker exec -u root -it rasa_rasa-production_1 /bin/bash
/app# python -m spacy download fr_core_news_md
/app# python -m spacy link fr_core_news_md fr

Now spaCy works perfectly; however, I have to do the above every time I recreate the Docker containers.

@pofenstein I am running into the same issue. Can you explain how you stopped the invisible process? You mentioned “recreating” the Docker containers; what exactly did you do? And have you encountered this problem since? I am planning to only use locally trained models and upload them, so I guess I should just stay off the training button for the time being…

Hello @Taufred

I read your other post and I can confirm that you have the same issue I had.

I didn’t figure out exactly what was taking up all the space. However, I believe it was caused by uploading a locally trained model to the platform whose pipeline (config.yml) used components that are not supported in the rasa-worker container.

I think you shouldn’t use a locally trained model if the training button doesn’t work: rasa-production, which does the training, and rasa-worker, which processes messages, are based on the same Docker image, so if one doesn’t work the other won’t either.

Did you have any setup to do locally for your components? (Apart from pip install rasa[...], which is already covered on rasa-worker and rasa-production by pip install rasa[all] or something like that.)

If so, you can execute the same commands on rasa-production and rasa-worker and then try the training button to see if it works. You can access the containers with docker exec -u root -it <container_name> /bin/bash. You can also try a simpler pipeline to see if the training works.

Later you should add those commands to the Docker image, so that they run automatically when you create the containers (docker-compose up); otherwise you have to rerun the commands every time you recreate the containers (docker-compose down and then docker-compose up). I have not yet done this, but I imagine it would look something like the sketch below.
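
Untested sketch, assuming the default Rasa X docker-compose install; the rasa/rasa:1.10.0-full base tag, the 1001 user id and the my-rasa image name come from my setup and are only examples, adjust them to yours.

# Dockerfile: extend the image used by rasa-production and rasa-worker
FROM rasa/rasa:1.10.0-full

# the spaCy download needs write access to site-packages, so switch to root
USER root
RUN python -m spacy download fr_core_news_md && \
    python -m spacy link fr_core_news_md fr

# switch back to the non-root user the base image normally runs as
USER 1001

Then build the image and point both services at it in docker-compose.override.yml:

# build the custom image next to the Dockerfile above
docker build -t my-rasa:1.10.0-fr .

# docker-compose.override.yml
version: "3.4"   # match the version declared in your docker-compose.yml
services:
  rasa-production:
    image: my-rasa:1.10.0-fr
  rasa-worker:
    image: my-rasa:1.10.0-fr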

You are correct, I installed some packages locally. How would I add the commands to the Docker image? I assume via some command in the docker-compose.override.yml file? Sadly, I cannot even create the containers anymore, because I am getting errors about missing modules.

EDIT: I used the wrong command to launch the containers… docker-compose up instead of docker-compose up -d. I executed the commands in the containers like you did, so the question remains how to add those to the Docker image.
Anyway, the training works now, but interactive learning does not predict anything at all! However, the disk is not filling up anymore, so it must be another issue. Thanks so much for your help so far @pofenstein!

I haven’t tried modifying the Docker images yet beyond the sketch above; if you get it working, I’d be glad if you could share it!

Meanwhile, you can also try to train your bot on the platform with a config that doesn’t need extra packages, to see if it predicts the intents correctly. Here is an example:

language: en #or your language code
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    entity_recognition: false
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
policies:
  - name: MappingPolicy
  - name: TEDPolicy
    epochs: 100
  - name: MemoizationPolicy
    max_history: 1
  - name: FallbackPolicy
    nlu_threshold: 0.4
    core_threshold: 0.3

If there is no prediction, something must be wrong with rasa-worker; check the logs with docker logs <container_name> (docker logs rasa_rasa-worker_1 in my setup).

You are right, the problem probably lies within my custom component. It needs to download a resource with a Python call, nltk.download('vader_lexicon'). I can run that command in a Python console within the container, but the component module still does not find it. I cannot let the module download the resource itself, since it does not have access to the file system… Did you have a similar issue when manually adding your model?
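
One thing I still want to try (untested sketch; the /app/nltk_data path is just an example I picked): download the lexicon to an explicit directory as root, then tell NLTK where to look from inside the custom component.

# inside the container, as root
docker exec -u root -it rasa_rasa-worker_1 /bin/bash
/app# python -c "import nltk; nltk.download('vader_lexicon', download_dir='/app/nltk_data')"

# then, at the top of the custom component, add that directory to NLTK's search path
import nltk
nltk.data.path.append('/app/nltk_data')
from nltk.sentiment.vader import SentimentIntensityAnalyzer  # finds vader_lexicon via nltk.data.path
analyzer = SentimentIntensityAnalyzer()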