Encoding error when running Rasa

I am trying to install Rasa and Rasa X on my coworker's machine.
We installed Rasa on her machine and pulled the project files (nlu, domain, etc.) from our repository. I had pushed the files from my machine, where they work properly.

When we try to run the “rasa train” command, this is the error we get:

Traceback (most recent call last):
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\runpy.py", line 193, in run_module_as_main
    "main", mod_spec)
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\runpy.py", line 85, in run_code
    exec(code, run_globals)
  File "C:\Users\bad_n\AppData\Local\Programs\Python\Python37\Scripts\rasa.exe_main.py", line 9, in
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\site-packages\rasa_main.py", line 70, in main
    cmdline_arguments.func(cmdline_arguments)
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\site-packages\rasa\cli\train.py", line 84, in train
    kwargs=extract_additional_arguments(args),
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\site-packages\rasa\train.py", line 40, in train
    kwargs=kwargs,
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\asyncio\base_events.py", line 584, in run_until_complete
    return future.result()
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\site-packages\rasa\train.py", line 72, in train_async
    domain = Domain.load(domain, skill_imports)
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\site-packages\rasa\core\domain.py", line 78, in load
    other = cls.from_path(path, skill_imports)
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\site-packages\rasa\core\domain.py", line 92, in from_path
    domain = cls.from_file(path)
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\site-packages\rasa\core\domain.py", line 105, in from_file
    return cls.from_yaml(rasa.utils.io.read_file(path))
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\site-packages\rasa\utils\io.py", line 125, in read_file
    return f.read()
  File "c:\users\bad_n\appdata\local\programs\python\python37\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 302: invalid continuation byte

Searching the forum, I saw that this was called an encoding error, but I still cannot get what I should do to solve this problem.

Does anyone have an idea?

Looks like one of the files used for training (the domain file, in this case) is bad or corrupt. Can you open the file in a text editor and check whether it contains non-UTF-8 characters?

Note that such a character may be an unusual whitespace character and hence difficult to detect. Typing the file out again usually solves the issue.
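
If the offending character is hard to spot by eye, a small script can point at it. The following is only a sketch: the helper name `find_invalid_utf8` is made up for illustration, and it relies on nothing but Python's standard `UnicodeDecodeError` attributes. It reports the byte offset (the same "position" number the traceback shows), the line number, and the byte value.

```python
from pathlib import Path


def find_invalid_utf8(data: bytes):
    """Return (offset, line_number, byte_value) for the first byte that
    cannot be decoded as UTF-8, or None if the data is valid UTF-8.

    Hypothetical helper, not part of Rasa.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first undecodable byte.
        line_number = data.count(b"\n", 0, err.start) + 1
        return err.start, line_number, data[err.start]
    return None


def check_file(path):
    """Read a file as raw bytes and run the check above on it."""
    return find_invalid_utf8(Path(path).read_bytes())
```

Calling `check_file("domain.yml")` (assuming that is the name of your domain file) should return something like `(302, <line>, 0xe7)` for the file from the traceback above, which tells you which line to retype.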

I used a version of the domain file that had been committed and it worked! We were able to train the bot and run it using the ‘rasa shell’ and ‘rasa interactive’ commands.

What I noticed is that when we run ‘rasa x’, it alters the domain file, and the same error appears again. Why is running Rasa X editing my domain file?

I will probably open another thread for this issue.

I’m not sure, but it probably has something to do with Windows and Linux using different default text encodings.

I had the same issue. I found the non-UTF-8 symbol by using a regex search and replaced it:

[^\x00-\x7F]
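
For anyone without an editor that supports regex search, the same pattern can be applied from Python. This is a rough sketch (the function name `non_ascii_positions` is my own, not from any library) that lists every non-ASCII character with its line and column:

```python
import re

# Same pattern as above: anything outside the 7-bit ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def non_ascii_positions(text):
    """Return a list of (line_number, column, character) tuples,
    both 1-based, for every non-ASCII character in the text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in NON_ASCII.finditer(line):
            hits.append((line_no, match.start() + 1, match.group()))
    return hits
```

Two caveats. The text has to be decoded first, so for a file that fails to load as UTF-8 you would read it with a forgiving encoding such as `open(path, encoding="latin-1")`. Also, the pattern flags every non-ASCII character, including accented letters that are perfectly valid UTF-8, so treat the hits as places to look at rather than as errors.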

Another solution was proposed here, but it didn’t help me.

I created a new topic for the domain file being edited on running the rasa x command: here

If you open the nlu and domain files with Notepad++ and select UTF-8 in the Encoding menu, you’ll see the invalid characters and will be able to fix them. BTW, do not use single quotes. Use “´” instead of “’”.
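
If you would rather convert the file than fix characters by hand, something along these lines might work. It is a sketch under one big assumption: that the file was saved in Windows-1252 (`cp1252`), which is a guess based on byte 0xe7 being “ç” in that encoding. The function name `reencode_to_utf8` is hypothetical. Keep a backup of the original file before overwriting anything.

```python
def reencode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return the data as UTF-8 bytes.

    If the data already decodes as UTF-8 it is returned unchanged;
    otherwise it is decoded with source_encoding (an assumption about
    how the file was saved) and re-encoded as UTF-8.
    """
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        return data.decode(source_encoding).encode("utf-8")
```

You would apply it to the raw bytes of the file, for example via `pathlib.Path(...).read_bytes()` and `write_bytes()`. If the guess about the source encoding is wrong, the result will be valid UTF-8 but may show the wrong characters, so check the converted file in an editor afterwards.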