Problems with the encoding in the domain file

Good afternoon. I am currently migrating the bots we have developed at my company to Rasa 1, and we wanted to try Rasa X on one of our best bots. However, after giving the project the structure that Rasa X needs, it throws an encoding error, and the culprit is the domain file.

Starting Rasa X in local mode... 🚀
Traceback (most recent call last):
  File "c:\byte\rasa1\lib\site-packages\rasa\cli\x.py", line 322, in run_locally
    local.main(args, project_path, args.data, token=rasa_x_token)
  File "c:\byte\rasa1\lib\site-packages\rasax\community\local.py", line 190, in main
    project_path, data_path, session, args.port
  File "c:\byte\rasa1\lib\site-packages\rasax\community\local.py", line 127, in _initialize_with_local_data
    COMMUNITY_USERNAME,
  File "c:\byte\rasa1\lib\site-packages\rasax\community\initialise.py", line 140, in inject_domain
    username=username,
  File "c:\byte\rasa1\lib\site-packages\rasax\community\services\domain_service.py", line 127, in validate_and_store_domain_yaml
    self.dump_domain_in_local_mode()
  File "c:\byte\rasa1\lib\site-packages\rasax\community\services\domain_service.py", line 140, in dump_domain_in_local_mode
    self._dump_domain(filename)
  File "c:\byte\rasa1\lib\site-packages\rasax\community\services\domain_service.py", line 148, in _dump_domain
    dump_yaml_to_file(filename, cleaned_domain)
  File "c:\byte\rasa1\lib\site-packages\rasax\community\utils.py", line 354, in dump_yaml_to_file
    f.write(dump_yaml(content))
  File "c:\byte\rasa1\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u231b' in position 10322: character maps to <undefined>

Sorry, something went wrong (see error above). Make sure to start Rasa X with valid data and valid domain and config files. Please, also check any warnings that popped up.
If you need help fixing the issue visit our forum: https://forum.rasa.com/.

The error appears because my domain contains emojis and accent marks.

Is there any way to configure the encoding? I would rather not remove the emojis and accents to make it work (I tried it without them and it did work).

Hey @idusertbs, what exactly does your domain file look like? In general emojis are supported in domain.yml, e.g. in the templates section:

[image: example templates section with emojis, and the result it leads to; images not preserved]

In Rasa X 0.19.2, if you set the flag use_entities: false on an intent, Rasa X won’t work.

Example: - enter_data: {use_entities: false}

Hi, this is a part of my domain file:

Thanks @idusertbs. This looks like it’s a Windows-related issue. Which OS and Python version are you running?
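For anyone wondering why this points to Windows: the last frame of the traceback above is encodings\cp1252.py, which suggests the domain was written back to disk with the platform default encoding rather than UTF-8. Below is a minimal sketch of that behaviour in plain Python. The helper names (can_encode, write_text) are made up for illustration and are not part of Rasa or Rasa X.

```python
# Sketch: why '\u231b' fails under cp1252 but not under UTF-8.
# can_encode and write_text are hypothetical helpers, not Rasa X APIs.

HOURGLASS = "\u231b"   # the character named in the UnicodeEncodeError
ACCENTED = "\u00e9"    # e with acute accent, as used in Spanish or French text


def can_encode(text, encoding):
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def write_text(path, text, encoding="utf-8"):
    """Write `text` with an explicit encoding instead of the platform default."""
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


if __name__ == "__main__":
    for name, char in (("hourglass emoji", HOURGLASS), ("accented e", ACCENTED)):
        print(name, "cp1252:", can_encode(char, "cp1252"),
              "utf-8:", can_encode(char, "utf-8"))
```

Note that accented Latin letters are representable in cp1252, so in this particular error it is the emoji that cannot be written. Python 3.7 and later also offer a UTF-8 mode (environment variable PYTHONUTF8=1) that changes the default encoding of open(); whether that is enough to get Rasa X past this error is an assumption I have not verified.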

@idusertbs @ricwo

I’ve had the same problem a couple of times with my stories and domain files. It happened because I edited the files with Notepad++ on Windows but trained the model on a Linux PC. Notepad++ had saved the files in a Windows encoding with Windows (CRLF) line breaks. I had to change them to UTF-8 and Unix (LF), respectively.

So my tip is to check which encoding your domain file is saved in and which line breaks it uses.
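If you would rather check this from a script than from the editor, something along these lines should work. It is only a sketch: the function names are made up, and the cp1252 fallback is an assumption about what the Windows editor used, so keep a backup of the file before rewriting it.

```python
# Sketch: inspect a file's encoding and line breaks, then normalise it
# to UTF-8 with Unix (LF) line breaks. All names here are hypothetical.
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def inspect_bytes(raw):
    """Return (is_valid_utf8, has_bom, has_crlf) for the raw file content."""
    try:
        raw.decode("utf-8")
        is_utf8 = True
    except UnicodeDecodeError:
        is_utf8 = False
    return is_utf8, raw.startswith(UTF8_BOM), b"\r\n" in raw


def normalize_bytes(raw, fallback_encoding="cp1252"):
    """Decode as UTF-8 (dropping a BOM if present), else as the fallback
    encoding, then return UTF-8 bytes with LF line breaks."""
    try:
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: the file was saved in a Windows code page.
        # This can still raise if the bytes are not valid in the fallback.
        text = raw.decode(fallback_encoding)
    return text.replace("\r\n", "\n").encode("utf-8")


def normalize_file(path, fallback_encoding="cp1252"):
    """Rewrite `path` in place as UTF-8 with LF line breaks."""
    p = Path(path)
    p.write_bytes(normalize_bytes(p.read_bytes(), fallback_encoding))
```

For example, inspect_bytes(Path("domain.yml").read_bytes()) tells you whether the file is already valid UTF-8 and whether it contains CRLF line breaks, and normalize_file("domain.yml") converts it. This only covers files that are not UTF-8 to begin with; it does not change how Rasa X writes the domain back to disk.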