Rasa X CE Kubernetes Issue

I am trying to set up Rasa X on a self-hosted Kubernetes cluster and am having issues with the talk page. I can type and send a message, but no intent is ever listed under the message. If I refresh the page, I can see the response from the bot above my message and the intent is then listed. When I open the conversations page, the conversations do not load. The models and training pages work without any issues.

Here is my configmap.

kubectl describe configmap configuration-files
Name:         configuration-files
Namespace:    default
Labels:       <none>
Annotations:  <none>

Data
====
environments:
----
rasa:
    production:
      url: http://rasa-production:5005
      token: ${RASA_TOKEN}
    worker:
      url: http://rasa-worker:5005
      token: ${RASA_TOKEN}

rasa-credentials:
----
rasa:
    url: ${RASA_X_HOST}/api

rasa-endpoints:
----
models:
    url: ${RASA_MODEL_SERVER}
    token: ${RASA_X_TOKEN}
    wait_time_between_pulls: ${RASA_MODEL_PULL_INTERVAL}
tracker_store:
    type: sql
    dialect: "postgresql"
    url: ${DB_HOST}
    port: ${DB_PORT}
    username: ${DB_USER}
    password: ${DB_PASSWORD}
    db: ${DB_DATABASE}
    login_db: ${DB_LOGIN_DB}
event_broker:
    type: "pika"
    url: ${RABBITMQ_HOST}
    username: ${RABBITMQ_USERNAME}
    password: ${RABBITMQ_PASSWORD}
    queue: ${RABBITMQ_QUEUE}
action_endpoint:
    url: ${RASA_USER_APP}/webhook
    token:  ""

Events:  <none>
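
Every value in the config above is a `${VAR}` placeholder, so a single unset environment variable in a pod can leave a token, host, or password empty. The following is a minimal sketch, not part of Rasa X, for listing the placeholders in a config file and reporting which ones are unset or empty. The helper names `find_placeholders` and `missing_variables` are hypothetical, and the sample environment is made up for illustration.

```python
import os
import re

# Matches ${NAME} placeholders such as ${DB_PASSWORD}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_placeholders(text):
    """Return the placeholder names used in text, in first-seen order."""
    seen = []
    for name in PLACEHOLDER.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


def missing_variables(text, env=None):
    """Return placeholder names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in find_placeholders(text) if not env.get(name)]


if __name__ == "__main__":
    # Hypothetical excerpt and environment, for illustration only.
    sample = """
tracker_store:
    type: sql
    url: ${DB_HOST}
    username: ${DB_USER}
    password: ${DB_PASSWORD}
"""
    print(missing_variables(sample, env={"DB_HOST": "db", "DB_USER": "admin"}))
```

Running it inside a pod (for example via `kubectl exec`) with `env` left as `None` checks the variables that container actually sees.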

All pods are running.

kubectl get pods
NAME                                                         READY   STATUS    RESTARTS   AGE
api-69bd97474d-mfd6c                                         1/1     Running   1          20h
app-5bd5dcfb4b-7m9rv                                         1/1     Running   0          35m
db-7ffb94ccf9-tz48c                                          1/1     Running   0          35m
duckling-685df689dd-989jb                                    1/1     Running   0          28m
nginx-858c89d576-xrdjj                                       1/1     Running   0          27m
rabbit-698f496497-8nq99                                      1/1     Running   0          27m
rasa-production-98cd95644-txwdq                              1/1     Running   0          27m
rasa-worker-7c85984764-wn9rx                                 1/1     Running   0          27m
rasa-x-77594cb8f5-dzzhn                                      1/1     Running   0          27m
wayfaring-arachnid-nfs-client-provisioner-7bdf6bcf8f-ccf8l   1/1     Running   14         6d23h

There does appear to be an issue in the db pod logs.

kubectl logs db-7ffb94ccf9-tz48c

Welcome to the Bitnami postgresql container
Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-postgresql
Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-postgresql/issues
Send us your feedback at containers@bitnami.com

INFO  ==> ** Starting PostgreSQL setup **
INFO  ==> Validating settings in POSTGRESQL_* env vars..
INFO  ==> Initializing PostgreSQL database...
INFO  ==> postgresql.conf file not detected. Generating it...
INFO  ==> pg_hba.conf file not detected. Generating it...
INFO  ==> Starting PostgreSQL in background...
/tmp:5432 - accepting connections
INFO  ==> Creating user admin
INFO  ==> Grating access to "admin" to the database "rasa"
INFO  ==> Configuring replication parameters
INFO  ==> Loading custom scripts...
INFO  ==> Enabling remote connections
INFO  ==> Stopping PostgreSQL...
INFO  ==> ** PostgreSQL setup finished! **

INFO  ==> ** Starting PostgreSQL **
2019-07-22 03:19:27.253 GMT [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2019-07-22 03:19:27.253 GMT [1] LOG:  listening on IPv6 address "::", port 5432
2019-07-22 03:19:27.276 GMT [1] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-07-22 03:19:27.414 GMT [186] LOG:  database system was shut down at 2019-07-22 03:19:26 GMT
2019-07-22 03:19:27.452 GMT [1] LOG:  database system is ready to accept connections
2019-07-22 03:20:01.545 GMT [193] LOG:  invalid length of startup packet
2019-07-22 03:21:01.545 GMT [195] LOG:  invalid length of startup packet
2019-07-22 03:22:01.546 GMT [197] LOG:  invalid length of startup packet
2019-07-22 03:23:01.547 GMT [199] LOG:  invalid length of startup packet
2019-07-22 03:24:01.548 GMT [201] LOG:  invalid length of startup packet
2019-07-22 03:24:51.917 GMT [203] WARNING:  there is no transaction in progress
2019-07-22 03:25:01.548 GMT [206] LOG:  invalid length of startup packet
2019-07-22 03:25:04.136 GMT [204] WARNING:  there is no transaction in progress
2019-07-22 03:26:01.551 GMT [215] LOG:  invalid length of startup packet
2019-07-22 03:27:01.554 GMT [221] LOG:  invalid length of startup packet
2019-07-22 03:28:01.557 GMT [226] LOG:  invalid length of startup packet
2019-07-22 03:29:01.564 GMT [232] LOG:  invalid length of startup packet
2019-07-22 03:30:01.571 GMT [237] LOG:  invalid length of startup packet
2019-07-22 03:31:01.578 GMT [242] LOG:  invalid length of startup packet
2019-07-22 03:32:01.584 GMT [248] LOG:  invalid length of startup packet
2019-07-22 03:33:01.589 GMT [253] LOG:  invalid length of startup packet
2019-07-22 03:34:01.594 GMT [258] LOG:  invalid length of startup packet
2019-07-22 03:35:01.599 GMT [263] LOG:  invalid length of startup packet
2019-07-22 03:36:01.607 GMT [269] LOG:  invalid length of startup packet
2019-07-22 03:37:01.607 GMT [274] LOG:  invalid length of startup packet
2019-07-22 03:38:01.611 GMT [282] LOG:  invalid length of startup packet
2019-07-22 03:39:01.614 GMT [287] LOG:  invalid length of startup packet
2019-07-22 03:40:01.617 GMT [292] LOG:  invalid length of startup packet
2019-07-22 03:41:01.620 GMT [297] LOG:  invalid length of startup packet
2019-07-22 03:42:01.622 GMT [302] LOG:  invalid length of startup packet
2019-07-22 03:43:01.625 GMT [307] LOG:  invalid length of startup packet
2019-07-22 03:44:01.627 GMT [312] LOG:  invalid length of startup packet
2019-07-22 03:45:01.634 GMT [317] LOG:  invalid length of startup packet
2019-07-22 03:46:01.632 GMT [322] LOG:  invalid length of startup packet
2019-07-22 03:47:01.634 GMT [327] LOG:  invalid length of startup packet
2019-07-22 03:48:01.636 GMT [332] LOG:  invalid length of startup packet
2019-07-22 03:49:01.637 GMT [337] LOG:  invalid length of startup packet
2019-07-22 03:50:01.639 GMT [342] LOG:  invalid length of startup packet
2019-07-22 03:51:01.642 GMT [347] LOG:  invalid length of startup packet
2019-07-22 03:52:01.643 GMT [352] LOG:  invalid length of startup packet
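
The "invalid length of startup packet" entries arrive almost exactly one minute apart. A pattern that regular often comes from a periodic client that opens the port without speaking the PostgreSQL protocol, such as a TCP health check, rather than from the application itself; that is an assumption here, not something confirmed in this thread. A small sketch for measuring the spacing from a saved log (the function names are hypothetical):

```python
from datetime import datetime

MESSAGE = "invalid length of startup packet"


def packet_error_times(log_text):
    """Return the timestamps of lines reporting the startup-packet message."""
    times = []
    for line in log_text.splitlines():
        if MESSAGE not in line:
            continue
        # Lines look like: 2019-07-22 03:20:01.545 GMT [193] LOG:  ...
        stamp = " ".join(line.split()[:2])
        times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"))
    return times


def intervals_in_seconds(times):
    """Return the gaps between consecutive timestamps, rounded to seconds."""
    return [round((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    sample = (
        "2019-07-22 03:20:01.545 GMT [193] LOG:  invalid length of startup packet\n"
        "2019-07-22 03:21:01.545 GMT [195] LOG:  invalid length of startup packet\n"
        "2019-07-22 03:22:01.546 GMT [197] LOG:  invalid length of startup packet\n"
    )
    print(intervals_in_seconds(packet_error_times(sample)))
```

If the gaps are constant, comparing them with the probe or cron intervals configured in the cluster is a reasonable next step.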

Hi @nkane2898, welcome to the forum! Which version of Rasa X are you running?

Ricwo,

I am using the 0.19.5 container image.

Name:           app-5bd5dcfb4b-w5lbr
Image:          rasa/rasa-x-demo:latest

Name:           db-7ffb94ccf9-wbqm7
Image:          bitnami/postgresql:11.2.0

Name:           duckling-685df689dd-989jb
Image:          rasa/duckling:latest

Name:           nginx-858c89d576-lzqpb
Image:          rasa/nginx:0.19.5

Name:           rabbit-698f496497-7wl6j
Image:          bitnami/rabbitmq:3.7.15

Name:           rasa-production-98cd95644-fsfvr
Image:          rasa/rasa:1.1.7-full

Name:           rasa-worker-7c85984764-gv65c
Image:          rasa/rasa:1.1.7-full

Name:           rasa-x-77594cb8f5-qvsfv
Image:          rasa/rasa-x:0.19.5
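
Two of the images above use the `latest` tag, which makes it hard to tell which versions are actually running together. As a rough sketch (helper names are hypothetical), the `Name:` / `Image:` pairs from `kubectl describe` output can be parsed to flag images that are not pinned to a version:

```python
def parse_images(describe_text):
    """Map pod name to image from 'Name:' / 'Image:' line pairs."""
    images = {}
    name = None
    for line in describe_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Name":
            name = value
        elif key == "Image" and name is not None:
            images[name] = value
    return images


def unpinned(images):
    """Return pod names whose image has no tag or uses the 'latest' tag."""
    result = []
    for pod, image in images.items():
        # Look for the tag after the last '/', so a registry port is not mistaken for one.
        tag = image.rsplit("/", 1)[-1].partition(":")[2]
        if tag in ("", "latest"):
            result.append(pod)
    return result


if __name__ == "__main__":
    sample = (
        "Name:           app-5bd5dcfb4b-w5lbr\n"
        "Image:          rasa/rasa-x-demo:latest\n"
        "\n"
        "Name:           rasa-x-77594cb8f5-qvsfv\n"
        "Image:          rasa/rasa-x:0.19.5\n"
    )
    print(unpinned(parse_images(sample)))
```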

@nkane2898 how did you generate your deployment files? Would you mind sharing the contents of each of your deployment configs?

I followed the steps in the "Openshift-kubernetes" guide and used the compose file from the "Docker Compose CE" page. Attached is a zip folder of the configmap, deployments, services, and persistent volume claims generated by kompose. I have removed the passwords from the files. rasa-kubernetes.zip (12.7 KB)

Hi @nkane2898, the instructions you found are mainly geared towards OpenShift, and we’re aware of some issues that may come up if they’re applied literally to Kubernetes.

I cannot see anything obviously wrong with your config files. We’re working on redesigning the cluster deployment process using Helm charts at the moment. I will try to ping you here once that’s available.

Thank you for the update. I’ve been running the rasa/rasa-x:0.20.0, rasa/nginx:0.20.0, and rasa/rasa:1.1.7-full images for the past couple of days which appears to have resolved the issue.

I am trying the new Helm chart without changing any of the defaults. It appears the database password is not being set.

/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/usr/local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensor2tensor/utils/expert_utils.py:68: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensor2tensor/utils/adafactor.py:27: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensor2tensor/utils/multistep_optimizer.py:32: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/mesh_tensorflow/ops.py:4237: The name tf.train.CheckpointSaverListener is deprecated. Please use tf.estimator.CheckpointSaverListener instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/mesh_tensorflow/ops.py:4260: The name tf.train.SessionRunHook is deprecated. Please use tf.estimator.SessionRunHook instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensor2tensor/models/research/neural_stack.py:38: The name tf.nn.rnn_cell.RNNCell is deprecated. Please use tf.compat.v1.nn.rnn_cell.RNNCell instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensor2tensor/rl/gym_utils.py:235: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensor2tensor/utils/trainer_lib.py:111: The name tf.OptimizerOptions is deprecated. Please use tf.compat.v1.OptimizerOptions instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow_gan/python/contrib_utils.py:305: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow_gan/python/contrib_utils.py:310: The name tf.estimator.tpu.TPUEstimatorSpec is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimatorSpec instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/rasa/utils/train_utils.py:28: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/rasa/core/policies/keras_policy.py:65: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

INFO:rasax.community.services.event_service:Start consuming pika host 'rabbit'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2262, in _wrap_pool_connect
    return fn()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 303, in unique_connection
    return _ConnectionFairy._checkout(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 760, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 492, in checkout
    rec = pool._do_get()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py", line 139, in _do_get
    self._dec_overflow()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 129, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py", line 136, in _do_get
    return self._create_connection()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 308, in _create_connection
    return _ConnectionRecord(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 437, in __init__
    self.__connect(first_connect_check=True)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 639, in __connect
    connection = pool._invoke_creator(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
    return dialect.connect(*cargs, **cparams)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 453, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/usr/local/lib/python3.6/site-packages/psycopg2/__init__.py", line 126, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL: password authentication failed for user "admin"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/site-packages/rasax/community/server.py", line 90, in <module>
    main()
  File "/usr/local/lib/python3.6/site-packages/rasax/community/server.py", line 30, in main
    sql_migrations.run_migrations(session)
  File "/usr/local/lib/python3.6/site-packages/rasax/community/sql_migrations.py", line 25, in run_migrations
    _run_schema_migrations(session)
  File "/usr/local/lib/python3.6/site-packages/rasax/community/sql_migrations.py", line 50, in _run_schema_migrations
    command.upgrade(alembic_config, "head")
  File "/usr/local/lib/python3.6/site-packages/alembic/command.py", line 276, in upgrade
    script.run_env()
  File "/usr/local/lib/python3.6/site-packages/alembic/script/base.py", line 475, in run_env
    util.load_python_file(self.dir, "env.py")
  File "/usr/local/lib/python3.6/site-packages/alembic/util/pyfiles.py", line 90, in load_python_file
    module = load_module_py(module_id, path)
  File "/usr/local/lib/python3.6/site-packages/alembic/util/compat.py", line 156, in load_module_py
    spec.loader.exec_module(module)
  File "", line 678, in exec_module
  File "", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.6/site-packages/rasax/community/database/schema_migrations/alembic/env.py", line 85, in <module>
    run_migrations_online()
  File "/usr/local/lib/python3.6/site-packages/rasax/community/database/schema_migrations/alembic/env.py", line 67, in run_migrations_online
    with connectable.connect() as connection:
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2193, in connect
    return self._connection_cls(self, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 103, in __init__
    else engine.raw_connection()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2293, in raw_connection
    self.pool.unique_connection, _connection
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2266, in _wrap_pool_connect
    e, dialect, self
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1536, in _handle_dbapi_exception_noconnection
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 128, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2262, in _wrap_pool_connect
    return fn()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 303, in unique_connection
    return _ConnectionFairy._checkout(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 760, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 492, in checkout
    rec = pool._do_get()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py", line 139, in _do_get
    self._dec_overflow()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 129, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py", line 136, in _do_get
    return self._create_connection()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 308, in _create_connection
    return _ConnectionRecord(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 437, in __init__
    self.__connect(first_connect_check=True)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 639, in __connect
    connection = pool._invoke_creator(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
    return dialect.connect(*cargs, **cparams)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 453, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/usr/local/lib/python3.6/site-packages/psycopg2/__init__.py", line 126, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: password authentication failed for user "admin"

(Background on this error at: http://sqlalche.me/e/e3q8)
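
The traceback ends in "password authentication failed", so one thing worth checking is whether the password the Rasa X pod sends matches the one the database was initialised with. Kubernetes stores Secret values base64-encoded, so they have to be decoded before comparing. The sketch below assumes the chart keeps the database password in a Secret; the secret name and key names are not given in this thread, so `<secret-name>` in the usage line is a placeholder and `decode_secret` is a hypothetical helper.

```python
import base64
import json
import sys


def decode_secret(secret_json):
    """Decode the base64 'data' values of a Kubernetes Secret given as JSON."""
    secret = json.loads(secret_json)
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in secret.get("data", {}).items()
    }


if __name__ == "__main__":
    # Usage (secret name is a placeholder):
    #   kubectl get secret <secret-name> -o json | python decode_secret.py
    for key, value in decode_secret(sys.stdin.read()).items():
        # Print only the length so the password itself is not echoed.
        print(f"{key}: {len(value)} characters")
```

An entry reported with 0 characters would be consistent with the password not being set, though a mismatch between the secret and an already-initialised database volume would produce the same error.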

Hi @nkane2898, would you mind providing some more background:

  1. what platform are you running this on?
  2. which version of Rasa X are you trying to install?
  3. are you following the helm-deployment instructions (OpenShift and Kubernetes) step-by-step, or is there anything custom / unusual about your setup process?

Thanks!

Ricwo,

  1. I am running a self-hosted Kubernetes v1.16.0 cluster.
  2. I have tried using 0.21.2, 0.21.3, and 0.21.4.
  3. Here are the commands I have used for Helm. I am running Helm v2.14.3.

export RASA_X_VERSION=0.21.4

wget -qO rasa-x-helm.tgz https://storage.googleapis.com/rasa-x-releases/${RASA_X_VERSION}/rasa-x-${RASA_X_VERSION}.tgz

helm install rasa-x-helm.tgz

It appears the database does not have an admin user. Attached are the logs from the db and rasa-x pods. pod-logs.txt (10.9 KB)

Thanks @nkane2898, we’ve just made some improvements to our Helm chart, which will be released soon with 0.22.0. In the meantime, we’ll have a look at your logs and check what might be going wrong.