Differences between Git and the NLU data displayed and used by Rasa X

Hi, I've connected my Rasa X instance to a Git repo, but after my last sync I see a lot of differences between the NLU data from Git and what is displayed in the Rasa X UI … the problem is I cannot sync or train a model any longer … I checked the git folder in the Docker container, which exactly mirrors the Git repo.

My question: where does Rasa X store the other files (the data that is displayed in the UI)? I would like to clean them up, if that is possible.

I’m using Rasa 1.0.1 btw

@spf If you have pushed the data to the Git repo and your repo is connected to Rasa X, then the only known issue concerns domain.yml: its intents and slots get reordered alphabetically during synchronization, and some default syntax may be added. On the other hand, if you trained the model based on your Git repo and then changed something, Rasa X will notify you to update the Git server. You need to push that update, and on the Git repo you need to merge that branch, because the changes are not pushed to the main branch.

@spf If your issue persists, I'd recommend disconnecting the Git repo from Rasa X, generating a new deploy key, connecting the repo to Rasa X again, and training.

@spf You can even delete the cache or cookies [I'm not sure whether it will be helpful or not].

Please keep me in the loop as you troubleshoot. Good luck!

@nik202 Thanks for the quick response … but unfortunately this does not solve the problem … I reconnected the Git repo (no problems, the git files are synced exactly as in the repo) … BUT … Rasa X tries and fails to validate a corrupt story file that is not part of the repo … afterwards it seems like Rasa X stops reading any of the files in the repo … this corrupt file (or information) must be located somewhere, maybe in the database?

I have a log trace that shows the issue:

[2022-01-11 09:25:03 +0000] - (sanic.access)[INFO][REDACTED]: GET http://REDACTED/api/git-repositories/3/status 200 240
rasa-x_1 | ERROR:rasax.community.utils.common:Could not complete the coroutine: ‘’.
rasa-x_1 | Traceback (most recent call last):
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasa/shared/utils/validation.py”, line 170, in validate_yaml_schema
rasa-x_1 | c.validate(raise_exception=True)
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/pykwalify/core.py”, line 194, in validate
rasa-x_1 | raise SchemaError(u"Schema validation failed:\n - {error_msg}.".format(
rasa-x_1 | pykwalify.errors.SchemaError: <SchemaError: error code 2: Schema validation failed:
rasa-x_1 | - Value '[ordereddict([(‘intent’, ‘REDACTED’)]), ordereddict([(‘action’, ‘REDACTED’)]), … ’ is not of type ‘str’. Path: ‘/stories/22/story’.: Path: ‘/’>
rasa-x_1 |
rasa-x_1 | During handling of the above exception, another exception occurred:
rasa-x_1 |
rasa-x_1 | Traceback (most recent call last):
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/story_service.py”, line 135, in _reader_read_from_string
rasa-x_1 | return reader.read_from_file(temp_path, skip_validation=skip_validation)
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasa/shared/core/training_data/story_reader/yaml_story_reader.py”, line 107, in read_from_file
rasa-x_1 | raise e
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasa/shared/core/training_data/story_reader/yaml_story_reader.py”, line 99, in read_from_file
rasa-x_1 | return self.read_from_string(
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasa/shared/core/training_data/story_reader/yaml_story_reader.py”, line 123, in read_from_string
rasa-x_1 | rasa.shared.utils.validation.validate_yaml_schema(string, CORE_SCHEMA_FILE)
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasa/shared/utils/validation.py”, line 172, in validate_yaml_schema
rasa-x_1 | raise YamlValidationException(
rasa-x_1 | rasa.shared.utils.validation.YamlValidationException: Failed to validate ‘/tmp/tmppaj69plq’. Please make sure the file is correct and all mandatory parameters are specified. Here are the errors found during validation:
rasa-x_1 | in /tmp/tmppaj69plq:343:
rasa-x_1 | Value '[ordereddict([(‘intent’, ‘REDACTED’)]), ordereddict([(‘action’, ‘REDACTED’)]), … ’ is not of type ‘str’. Path: ‘/stories/22/story’
rasa-x_1 |
rasa-x_1 | During handling of the above exception, another exception occurred:
rasa-x_1 |
rasa-x_1 | Traceback (most recent call last):
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/story_service.py”, line 217, in get_story_steps
rasa-x_1 | return StoryService._reader_read_from_string(
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/story_service.py”, line 137, in _reader_read_from_string
rasa-x_1 | raise StoryParseError(str(e))
rasa-x_1 | rasa.shared.core.training_data.story_reader.story_reader.StoryParseError
rasa-x_1 |
rasa-x_1 | During handling of the above exception, another exception occurred:
rasa-x_1 |
rasa-x_1 | Traceback (most recent call last):
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/utils/common.py”, line 886, in run_in_loop
rasa-x_1 | return loop.run_until_complete(coro)
rasa-x_1 | File “uvloop/loop.pyx”, line 1456, in uvloop.loop.Loop.run_until_complete
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/integrated_version_control/git_service.py”, line 1203, in synchronize_project
rasa-x_1 | return await self._force_inject_latest_remote_changes()
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/integrated_version_control/git_service.py”, line 1261, in _force_inject_latest_remote_changes
rasa-x_1 | await self._inject_data()
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/integrated_version_control/git_service.py”, line 1233, in _inject_data
rasa-x_1 | await rasax.initialise.inject_files_from_disk(
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/initialise.py”, line 454, in inject_files_from_disk
rasa-x_1 | await inject_stories(
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/initialise.py”, line 277, in inject_stories
rasa-x_1 | test_story_blocks = await story_service.save_stories_from_files(
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/story_service.py”, line 559, in save_stories_from_files
rasa-x_1 | additional_blocks = await self.save_stories(
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/story_service.py”, line 452, in save_stories
rasa-x_1 | processed_stories = await self._extract_stories_yaml(
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/story_service.py”, line 375, in _extract_stories_yaml
rasa-x_1 | steps_list = self.get_story_steps(
rasa-x_1 | File “/usr/local/lib/python3.8/dist-packages/rasax/community/services/story_service.py”, line 221, in get_story_steps
rasa-x_1 | raise StoryParseError(
rasa-x_1 | rasa.shared.core.training_data.story_reader.story_reader.StoryParseError

@nik202 Sorry for the unformatted log … I fixed this :smiley:

I’ve tried to find the corrupt story file (path: ‘/stories/22/story’) without success.

I’ve tried the following:

  • recreated all pods: the error persists
  • searched the rasa-x Docker pod for such a file: there isn’t one
  • searched the “story” table in the database: it’s not there either
  • deleted all stories from the “story” table in the database: the table got correctly refilled with the stories from the connected Git repo
  • searched through other DB tables that seemed relevant: nothing found
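
A note on reading that path: ‘/stories/22/story’ appears to name an entry in the parsed `stories` list (index 22, presumably zero-based) rather than a file on disk, and the message says the `story` key of that entry holds a list of steps where a string name was expected. Below is a minimal sketch of a check for that shape. It assumes the YAML has already been loaded into plain Python objects (for example with PyYAML's `yaml.safe_load`, not shown here); the function name `find_malformed_stories` and the sample data are hypothetical and not part of Rasa or Rasa X.

```python
from typing import Any, Dict, List, Tuple


def find_malformed_stories(parsed: Dict[str, Any]) -> List[Tuple[int, str]]:
    """Return (index, reason) pairs for story entries with an unexpected shape."""
    problems: List[Tuple[int, str]] = []
    # An empty `stories:` key parses to None, so fall back to an empty list.
    for index, entry in enumerate(parsed.get("stories") or []):
        if not isinstance(entry, dict):
            problems.append((index, "entry is not a mapping"))
            continue
        name = entry.get("story")
        if not isinstance(name, str):
            problems.append((index, f"'story' is {type(name).__name__}, expected str"))
        if not isinstance(entry.get("steps"), list):
            problems.append((index, "'steps' is missing or not a list"))
    return problems


# Hand-written stand-in for the result of parsing a stories file.
example = {
    "stories": [
        {"story": "happy path", "steps": [{"intent": "greet"}, {"action": "utter_greet"}]},
        # Same kind of mistake as in the log: the steps sit directly under `story`.
        {"story": [{"intent": "greet"}, {"action": "utter_greet"}]},
    ]
}

for index, reason in find_malformed_stories(example):
    print(f"/stories/{index}: {reason}")
```

Running the same check over every YAML file that has a top-level `stories:` key, not only those under data, would point at the offending file and entry.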

I’m really running out of options here

I could set up the whole thing from scratch, but I don’t want to lose my conversations :confused:

Any ideas?

Whoa … found it

It was a test story, which Rasa X tried to interpret as a training story!

My Git repo structure looks like it should:

  • data
    • nlu.yml
    • stories.yml
  • tests
    • test_stories.yml
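
Since the culprit sat outside the data folder, it can help to list every YAML file in the checkout that declares a top-level `stories:` key, because test stories use that key too. The following is only a sketch under stated assumptions: files end in `.yml`, the helper name `files_with_stories_key` is made up for illustration, and a temporary directory stands in for the real repo. It says nothing about how Rasa X itself selects files.

```python
import re
import tempfile
from pathlib import Path
from typing import List

# Matches a `stories:` key that starts at column 0 of any line.
TOP_LEVEL_STORIES = re.compile(r"^stories:", re.MULTILINE)


def files_with_stories_key(root: Path) -> List[Path]:
    """List .yml files under root (relative paths) with a top-level `stories:` key."""
    matches: List[Path] = []
    for path in sorted(root.rglob("*.yml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if TOP_LEVEL_STORIES.search(text):
            matches.append(path.relative_to(root))
    return matches


# Throwaway layout mirroring the repo structure described above.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "data").mkdir()
    (repo / "tests").mkdir()
    (repo / "data" / "nlu.yml").write_text("nlu:\n- intent: greet\n", encoding="utf-8")
    (repo / "data" / "stories.yml").write_text(
        "stories:\n- story: happy path\n  steps: []\n", encoding="utf-8"
    )
    (repo / "tests" / "test_stories.yml").write_text(
        "stories:\n- story: test happy path\n  steps: []\n", encoding="utf-8"
    )
    found = files_with_stories_key(repo)

for path in found:
    print(path)
```

Both the training stories and the test stories show up in the result, which is why a malformed entry in tests can surface as a story parse error during sync.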

I hope someone from Rasa is listening and the bug gets fixed.