Crashing rasa-x-event-service

Hello all,

I’m trying to deploy Rasa-X to our OpenShift cluster (v4.6).

I’m deploying using the Helm Template made available by Rasa. In the past we deployed version 1.6.3 of this Helm template with no problems.

Now I want to deploy the latest version (chart 1.10.0 / appVersion 0.39.1).

The first struggle was with the NGINX pod. I implemented a workaround for that: Nginx CrashLoopBackOff with 1.10.0 · Issue #191 · RasaHQ/rasa-x-helm · GitHub

Now I’m left with a crashing event-service pod. I’m deploying to a new, empty namespace.

Output of the pod logs:

Unable to get database revision heads.
DB revision(s) do not match migration scripts revision(s):
DB revision: None
Migration scripts revision: ['6f9d9810a4e1']
Database revision does not match migrations' latest, trying again in 4 seconds.
Unable to get database revision heads.
DB revision(s) do not match migration scripts revision(s):
DB revision: None
Migration scripts revision: ['6f9d9810a4e1']
Database revision does not match migrations' latest, trying again in 4 seconds.
Unable to get database revision heads.
DB revision(s) do not match migration scripts revision(s):
DB revision: None
Migration scripts revision: ['6f9d9810a4e1']

I’ve attached the Helm override file to this post: **overrides.txt (3.6 KB)**

Because this is a ‘locked down’ OpenShift cluster, we import the images into an internal registry. In the override file we point to the correct images and repository, set some resource quotas, and set the passwords.

Hmm, I’m noticing that the subcharts postgresql, rabbitmq and redis are not resulting in extra pods…

When I look at our old (1.6) deployment, I’ve got: app, duckling, event-service, nginx, postgresql, rabbit, rasa-production, rasa-worker, rasa-x and redis.

In this new deployment (1.10) I’ve only got: app, db-migration, duckling, event-service, nginx, rasa-production, rasa-worker and rasa-x…
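To make the difference between the two lists above explicit, here is a minimal Python sketch. The variable names are only for illustration, and the component names are copied from this post, not queried from the cluster.

```python
# Components seen in each deployment, as listed in this post.
old_1_6 = {
    "app", "duckling", "event-service", "nginx", "postgresql",
    "rabbit", "rasa-production", "rasa-worker", "rasa-x", "redis",
}
new_1_10 = {
    "app", "db-migration", "duckling", "event-service", "nginx",
    "rasa-production", "rasa-worker", "rasa-x",
}

# Set difference shows what disappeared and what is new.
missing = sorted(old_1_6 - new_1_10)
added = sorted(new_1_10 - old_1_6)

print("missing in 1.10:", missing)  # ['postgresql', 'rabbit', 'redis']
print("new in 1.10:", added)        # ['db-migration']
```

So the three missing components are exactly the ones that come from the subcharts.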

OK, due to securityContext issues (OpenShift), redis, rabbit and postgresql weren’t deployed. After disabling the securityContext I was able to deploy them successfully.

Now some new issues:

Rabbit pod:

Readiness probe failed: Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given: node_health_check
Usage
rabbitmqctl [--node <node>] [--longnames] [--quiet] node_health_check [--timeout <timeout>]
Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given: status
Usage
rabbitmqctl [--node <node>] [--longnames] [--quiet] status [--unit <unit>] [--timeout <timeout>]

Event service pod:

Liveness probe failed: Get "http://10.131.7.243:5673/health": dial tcp 10.131.7.243:5673: connect: connection refused
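"Connection refused" here means that nothing accepted a TCP connection on that port when the probe ran. A minimal sketch of how one could check this by hand from another pod, assuming Python is available there; `port_is_open` is a hypothetical helper name, not part of Rasa X or the Helm chart:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError
        # or a timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the probe above, the call would be `port_is_open("10.131.7.243", 5673)`. A `False` result only says the port is not reachable; it does not say why the event service is not listening.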