Unable to train a model with training data that contains non-ASCII characters

I got the following error message after posting a request to my Rasa server:

{ "error": "ascii" }

My training data:

language: zh

pipeline:

  • name: tokenizer_whitespace
  • name: intent_entity_featurizer_regex
  • name: ner_crf
  • name: ner_synonyms
  • name: intent_featurizer_count_vectors
  • name: intent_classifier_tensorflow_embedding

data: |

intent:wifi_not_working

  • 我 的 電腦 WIFI 打不開
  • 我 的 WIFI 不能用
  • 我 的 WIFI 有問題
  • 我 不能 用 WIFI

intent:find_repair_center

I found that if I remove the round brackets “(” and “)”, everything works fine. However, I need them to tell Rasa what the entities are, so I cannot simply remove them. Does anyone have an idea how to fix this problem?
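One thing that may be worth checking is how the request body is encoded before it is sent. The sketch below is not a confirmed fix. It only assumes that the "ascii" error points to an ASCII codec being used somewhere along the way, which the server response does not confirm. It builds a shortened version of the training body with entity annotations in the `[text](entity)` form, shows that the text cannot be encoded as ASCII, and encodes it as UTF-8 explicitly. The entity name `device` and the helper names `find_entities`, `ascii_fails` and `encode_body` are hypothetical and used only for illustration.

```python
import re

# Shortened training body. The entity name "device" is a made-up example;
# the pipeline entries are a subset of the ones from the post.
TRAINING_BODY = """\
language: zh

pipeline:
- name: tokenizer_whitespace
- name: ner_crf

data: |
  ## intent:wifi_not_working
  - 我 的 [WIFI](device) 打不開
  - 我 的 [WIFI](device) 不能用
"""

# Matches annotations of the form [entity text](entity_name).
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def find_entities(example: str) -> list:
    """Return (text, entity_name) pairs annotated in one training example."""
    return ENTITY_PATTERN.findall(example)


def ascii_fails(text: str) -> bool:
    """Return True if the text cannot be represented in ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


def encode_body(text: str) -> bytes:
    """Encode the request body as UTF-8 bytes before handing it to an HTTP client."""
    return text.encode("utf-8")


body = encode_body(TRAINING_BODY)

# The Chinese characters make ASCII encoding fail, while UTF-8 round-trips.
print(ascii_fails(TRAINING_BODY))
print(body.decode("utf-8") == TRAINING_BODY)
print(find_entities("我 的 [WIFI](device) 打不開"))
```

If the client sends `body` as raw bytes and declares a UTF-8 charset in the request, that rules out the client side as the source of the error. If the problem persists, the encoding issue is more likely on the server side.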

Maybe you need this one: GitHub - crownpku/Rasa_NLU_Chi: Turn Chinese natural language into structured data 中文自然语言理解