'charmap' codec can't decode byte 0x81 after 2.4.0 update

After updating to Rasa 2.4.0, without changing any of my files, I get this error when training or validating:

Skipping registering GPU devices...
Traceback (most recent call last):
  File "E:\Program Files\Python\Python38\lib\runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "E:\Program Files\Python\Python38\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "E:\...\_venv\lib\site-packages\rasa\__main__.py", line 134, in <module>
    main()
  File "E:\...\_venv\lib\site-packages\rasa\__main__.py", line 116, in main
    cmdline_arguments.func(cmdline_arguments)
  File "E:\...\_venv\lib\site-packages\rasa\cli\train.py", line 58, in <lambda>
    train_parser.set_defaults(func=lambda args: train(args, can_exit=True))
  File "E:\...\_venv\lib\site-packages\rasa\cli\train.py", line 90, in train
    training_result = rasa.train(
  File "E:\...\_venv\lib\site-packages\rasa\train.py", line 94, in train
    return rasa.utils.common.run_in_loop(
  File "E:\...\_venv\lib\site-packages\rasa\utils\common.py", line 307, in run_in_loop
    result = loop.run_until_complete(f)
  File "E:\Program Files\Python\Python38\lib\asyncio\base_events.py", line 608, in run_until_complete
    return future.result()
  File "E:\...\_venv\lib\site-packages\rasa\train.py", line 151, in train_async
    file_importer = TrainingDataImporter.load_from_config(
  File "E:\...\_venv\lib\site-packages\rasa\shared\importers\importer.py", line 85, in load_from_config
    return TrainingDataImporter.load_from_dict(
  File "E:\...\_venv\lib\site-packages\rasa\shared\importers\importer.py", line 150, in load_from_dict
    RasaFileImporter(
  File "E:\...\_venv\lib\site-packages\rasa\shared\importers\rasa.py", line 33, in __init__
    self._story_files = rasa.shared.data.get_data_files(
  File "E:\...\_venv\lib\site-packages\rasa\shared\data.py", line 152, in get_data_files
    new_data_files = _find_data_files_in_directory(path, filter_predicate)
  File "E:\...\_venv\lib\site-packages\rasa\shared\data.py", line 172, in _find_data_files_in_directory
    if filter_property(full_path):
  File "E:\...\_venv\lib\site-packages\rasa\shared\data.py", line 220, in is_story_file
    return YAMLStoryReader.is_stories_file(
  File "E:\...\_venv\lib\site-packages\rasa\shared\core\training_data\story_reader\yaml_story_reader.py", line 163, in is_stories_file
    return rasa.shared.data.is_likely_yaml_file(file_path) and cls.is_key_in_yaml(
  File "E:\...\_venv\lib\site-packages\rasa\shared\core\training_data\story_reader\yaml_story_reader.py", line 183, in is_key_in_yaml
    return any(
  File "E:\...\_venv\lib\site-packages\rasa\shared\core\training_data\story_reader\yaml_story_reader.py", line 183, in <genexpr>
    return any(
  File "E:\Program Files\Python\Python38\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 662: character maps to <undefined>

I have multiple languages in my NLU files, including Arabic and Armenian, which use non-Latin characters, as well as French with accented characters. All the files are UTF-8 encoded.
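
For anyone reading the traceback above: the last frame goes through `encodings\cp1252.py`, which suggests the file was decoded with the Windows default code page rather than UTF-8. Here is a minimal sketch of that failure mode. It assumes (this is not confirmed from the Rasa source) that the file was opened without an explicit `encoding` argument; the sample character, the file name `nlu.yml`, and the helper `read_text_utf8` are only illustrative.

```python
import os
import tempfile

# The Arabic letter "ف" (U+0641) is encoded in UTF-8 as the two bytes D9 81.
sample = "ف"
utf8_bytes = sample.encode("utf-8")

# Decoding those bytes as cp1252 fails because byte 0x81 is unassigned there.
try:
    utf8_bytes.decode("cp1252")
    cp1252_error = None
except UnicodeDecodeError as err:
    cp1252_error = err


def read_text_utf8(path):
    """Hypothetical helper: read a file as UTF-8, ignoring the platform default."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Round trip through a temporary file to show that an explicit encoding works.
with tempfile.TemporaryDirectory() as tmp_dir:
    nlu_path = os.path.join(tmp_dir, "nlu.yml")
    with open(nlu_path, "w", encoding="utf-8") as handle:
        handle.write("- intent: greet\n  examples: |\n    - " + sample + "\n")
    round_trip = read_text_utf8(nlu_path)

print(utf8_bytes)
print(cp1252_error)
```

If the cause is the same, the files themselves are fine and only the way they are opened matters, which would explain why the error appeared without any change to the data.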

Output of rasa --version:

Rasa Version     : 2.4.0
Rasa SDK Version : 2.4.0
Rasa X Version   : 0.37.1
Python Version   : 3.8.0
Operating System : Windows-10-10.0.19041-SP0
Python Path      : E:\...\_venv\Scripts\python.exe

It seems I’m not the only one with this problem after updating. Other users have posted about it too and have not received any replies yet.

I’ll check on GitHub, @ChrisRahme! Thanks for linking the issue.

Rasa 2.4.2 was just released, and the problem is fixed.

Thanks @Tobias_Wochinger and the team :heart:

But I have another problem now:

Epochs:   4%|█████▏                                                                                                                    | 6/141 [00:42<15:50,  7.04s/it, t_loss=3.81, i_acc=0.945, e_f1=0.706]
Traceback (most recent call last):
  File "E:\Program Files\Python\Python38\lib\runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "E:\Program Files\Python\Python38\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace\_venv\lib\site-packages\rasa\__main__.py", line 134, in <module>
    main()
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace\_venv\lib\site-packages\rasa\__main__.py", line 116, in main
    cmdline_arguments.func(cmdline_arguments)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace\_venv\lib\site-packages\rasa\cli\train.py", line 58, in <lambda>
    train_parser.set_defaults(func=lambda args: train(args, can_exit=True))
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace\_venv\lib\site-packages\rasa\cli\train.py", line 90, in train
    training_result = rasa.train(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace\_venv\lib\site-packages\rasa\train.py", line 94, in train
    return rasa.utils.common.run_in_loop(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace\_venv\lib\site-packages\rasa\utils\common.py", line 307, in run_in_loop
    result = loop.run_until_complete(f)
  File "E:\Program Files\Python\Python38\lib\asyncio\base_events.py", line 608, in run_until_complete
    return future.result()
  File "E:\...\_venv\lib\site-packages\rasa\train.py", line 163, in train_async
    return await _train_async_internal(
  File "E:\...\_venv\lib\site-packages\rasa\train.py", line 342, in _train_async_internal
    await _do_training(
  File "E:\...\_venv\lib\site-packages\rasa\train.py", line 388, in _do_training
    model_path = await _train_nlu_with_validated_data(
  File "E:\...\_venv\lib\site-packages\rasa\train.py", line 812, in _train_nlu_with_validated_data
    await rasa.nlu.train(
  File "E:\...\_venv\lib\site-packages\rasa\nlu\train.py", line 115, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "E:\...\_venv\lib\site-packages\rasa\nlu\model.py", line 209, in train
    updates = component.train(working_data, self.config, **context)
  File "E:\...\_venv\lib\site-packages\rasa\nlu\classifiers\diet_classifier.py", line 854, in train
    self.model.fit(
  File "E:\...\_venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "E:\...\_venv\lib\site-packages\rasa\utils\tensorflow\temp_keras_modules.py", line 229, in fit
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "E:\...\_venv\lib\site-packages\tensorflow\python\keras\callbacks.py", line 416, in on_epoch_end
    callback.on_epoch_end(epoch, numpy_logs)
  File "E:\...\_venv\lib\site-packages\rasa\utils\tensorflow\callback.py", line 68, in on_epoch_end
    if self._does_model_improve(logs):
  File "E:\...\_venv\lib\site-packages\rasa\utils\tensorflow\callback.py", line 90, in _does_model_improve
    [
  File "E:\...\_venv\lib\site-packages\rasa\utils\tensorflow\callback.py", line 91, in <listcomp>
    float(current_results[key]) > self.best_metrics_so_far[key]
KeyError: 'val_i_acc'

This does not happen in 2.3.4.
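
The last frame compares `current_results[key]` with `self.best_metrics_so_far[key]`, so the `KeyError` means one of those two dictionaries has no `val_i_acc` entry. The traceback alone does not show which one. Below is a small sketch of that pattern and a defensive variant. The function names, the choice of which dictionary to iterate over, and the metric values are all hypothetical; this is not Rasa's actual code or its fix.

```python
def does_model_improve_strict(current_results, best_metrics_so_far):
    """Same kind of comparison as in the traceback; raises KeyError on a missing key."""
    return all(
        float(current_results[key]) > best_metrics_so_far[key]
        for key in best_metrics_so_far
    )


def does_model_improve_safe(current_results, best_metrics_so_far):
    """Defensive variant: only compare metrics present in both dictionaries."""
    shared = [key for key in best_metrics_so_far if key in current_results]
    if not shared:
        # Nothing comparable was logged this epoch, so report no improvement.
        return False
    return all(
        float(current_results[key]) > best_metrics_so_far[key] for key in shared
    )


# Made-up values: an epoch whose logs contain no validation metrics.
best = {"val_i_acc": 0.90}
logs_without_val = {"t_loss": 3.81, "i_acc": 0.945}
logs_with_val = {"val_i_acc": 0.95}

try:
    does_model_improve_strict(logs_without_val, best)
    strict_error = None
except KeyError as err:
    strict_error = err

print(repr(strict_error))
print(does_model_improve_safe(logs_without_val, best))
print(does_model_improve_safe(logs_with_val, best))
```

Whether the missing key is on the logging side or the "best so far" side would need to be checked against `callback.py` in the installed version.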

I’ve opened a new topic and a GitHub issue for this.