Rasa X loses connection with gitlab repo

Hi there,

I’ve set up rasa-x on a GKE Kubernetes cluster (using the rasa-x-helm chart), followed the instructions, and successfully set up integrated version control against the GitLab repo containing my Rasa bot. Once everything is set up it works well, but after some time the UI shows that rasa-x is no longer connected to the GitLab repo. I believe I’ve traced this to the following error in the rasa-x logs:

2020-03-06 09:59:27.513 EST Exception occurred while handling uri: 'http://rasa-x.alotdone.com/api/projects/default/git_repositories/6/status'
2020-03-06 09:59:27.513 EST Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/sanic/app.py", line 942, in handle_request
    response = await response
  File "/usr/local/lib/python3.6/site-packages/rasax/community/api/decorators.py", line 177, in decorated_function
    return await await_and_return_response(args, kwargs, request)
  File "/usr/local/lib/python3.6/site-packages/rasax/community/api/decorators.py", line 107, in await_and_return_response
    response = await response
  File "/usr/local/lib/python3.6/site-packages/rasax/community/api/blueprints/git.py", line 145, in get_repository_status
    repository_status = git_service.get_repository_status()
  File "/usr/local/lib/python3.6/site-packages/rasax/community/services/git_service.py", line 655, in get_repository_status
    is_remote_ahead = self.is_remote_branch_ahead()
  File "/usr/local/lib/python3.6/site-packages/rasax/community/services/git_service.py", line 483, in is_remote_branch_ahead
    self._repository.git.fetch()
  File "/usr/local/lib/python3.6/site-packages/git/cmd.py", line 545, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/git/cmd.py", line 1014, in _call_process
    return self.execute(call, **exec_kwargs)
  File "/usr/local/lib/python3.6/site-packages/git/cmd.py", line 825, in execute
    raise GitCommandError(command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
2020-03-06 09:59:27.513 EST cmdline: git fetch
2020-03-06 09:59:27.513 EST stderr: 'Could not create directory '/root/.ssh'.
2020-03-06 09:59:27.513 EST Warning: Permanently added 'gitlab.com,35.231.145.151' (ECDSA) to the list of known hosts.
2020-03-06 09:59:27.513 EST @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2020-03-06 09:59:27.513 EST @ WARNING: UNPROTECTED PRIVATE KEY FILE! @
2020-03-06 09:59:27.513 EST @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2020-03-06 09:59:27.513 EST Permissions 0660 for '/app/git/ssh_files/6.key' are too open.
2020-03-06 09:59:27.513 EST It is required that your private key files are NOT accessible by others.
2020-03-06 09:59:27.513 EST This private key will be ignored.
2020-03-06 09:59:27.513 EST Load key "/app/git/ssh_files/6.key": bad permissions
2020-03-06 09:59:27.513 EST git@gitlab.com: Permission denied (publickey).
2020-03-06 09:59:27.513 EST fatal: Could not read from remote repository.
2020-03-06 09:59:27.513 EST
2020-03-06 09:59:27.513 EST Please make sure you have the correct access rights
2020-03-06 09:59:27.513 EST and the repository exists.'

This seemed curious since it was working initially when I POST’d the repository.json file as per the instructions. I then looked at the permissions of the “/app/git/ssh_files/6.key” within my rasa-x container and sure enough they were set to 660. I then manually did a “chmod 600 /app/git/ssh_files/6.key” and rasa-x was happy again and could sync with the gitlab repo.
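For anyone who wants to check this without reading the SSH output: as far as I understand it, the SSH client ignores a private key when any group or other permission bit is set on it, which is why 0660 fails and 0600 works. Here is a minimal Python sketch of that check. The function names are my own and are not part of Rasa X; the key path in the comment is simply the one from the log above.

```python
import os
import stat


def mode_too_open(mode: int) -> bool:
    """Return True if any group/other permission bit is set in `mode`."""
    return (mode & 0o077) != 0


def key_file_too_open(path: str) -> bool:
    """Apply the same check to the permission bits of a file on disk."""
    return mode_too_open(stat.S_IMODE(os.stat(path).st_mode))


# Hypothetical usage inside the rasa-x container:
#   key_file_too_open("/app/git/ssh_files/6.key")
```

The mask 0o077 covers the read, write, and execute bits for group and others, so 0660 trips the check through its group bits and 0600 does not.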

So I at least know what the issue is, but I don’t know why it keeps happening. Has anyone else seen this behavior before?

Thanks in advance.


I’m struggling with a similar issue. Did you get to solve it?

I don’t have a solution yet as I’m still unsure of what’s causing the permissions to be set back to 660 on the git/ssh_files/6.key file within my rasa-x container. To be clear and reiterate what I’m seeing…

  • When I log in to Rasa X in the morning, I see it’s disconnected from git.
  • I kubectl exec... into my rasa-x pod and see that the permissions of the git/ssh_files/6.key file are 660.
  • Within the pod, I run chmod 600 git/ssh_files/6.key
  • I reload the Rasa X UI and see that it can sync with git again.
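The manual steps above could be scripted roughly as follows. This is only a sketch of the workaround, not a fix for whatever resets the permissions. The function name is made up, and the assumptions that all keys live in /app/git/ssh_files and end in .key come from the single file seen in my logs.

```python
import os
import stat
from pathlib import Path


def tighten_key_permissions(key_dir: str, suffix: str = ".key") -> list:
    """chmod 600 every key file in `key_dir` whose group/other bits are set.

    Returns the paths that were changed, so the caller can log them.
    """
    fixed = []
    for path in sorted(Path(key_dir).glob(f"*{suffix}")):
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & 0o077:  # any group/other bit set, e.g. 0660
            os.chmod(path, 0o600)
            fixed.append(str(path))
    return fixed


if __name__ == "__main__":
    # Assumed key directory, taken from the path in the error log.
    for changed in tighten_key_permissions("/app/git/ssh_files"):
        print(f"reset to 0600: {changed}")
```

Run inside the pod, this does the same thing as the chmod step for every key file at once and leaves files that are already 0600 untouched.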

One thing that is a potential issue is that I’m using GKE “preemptible” nodes which essentially can be restarted at any time and after max of 24 hours of use (this is a test setup atm and I’m using those nodes to reduce cost). I don’t know for certain that this is the cause (and I’ve not seen any other odd behavior of other components running in my gke cluster) but it’s the only thing outside of rasa/rasa-x itself that I could think of as a potential culprit. If I can narrow down further what the root cause is for me, I’ll definitely share here. In the meantime, I have the (slightly annoying) manual workaround above.

For me the problem came when I updated rasa-x to 0.26.

Now I disconnected rasa X from github, but I’m still getting the error

 Could not create directory '/root/.ssh'.
rasa-x_1           | Warning: Permanently added 'github.com,140.82.118.4' (RSA) to the list of known hosts.
rasa-x_1           | git@github.com: Permission denied (publickey).
rasa-x_1           | fatal: Could not read from remote repository.
rasa-x_1           |
rasa-x_1           | Please make sure you have the correct access rights
rasa-x_1           | and the repository exists

What is causing this?

  • Are you running a custom Rasa X Docker image?
  • Did you add the keys via UI or API endpoint?
  • Was anything configured regarding the user that runs the container?

One thing that is a potential issue is that I’m using GKE “preemptible” nodes which essentially can be restarted at any time and after max of 24 hours of use (this is a test setup atm and I’m using those nodes to reduce cost)

I think this shouldn’t be an issue.

I just upgraded to rasa-x 0.26.0 and was able to use the UI to disconnect and then reconnect to my GitLab repo, so this definitely seems like an improvement. I’ll continue to monitor to see whether the permission issue I saw previously is gone as well. Thanks!