How to use Japanese Text with Rasa (Mecab-Tokenization)

I’m about to work on bots that support both English and Japanese, and along the way I’ve run into problems with Japanese text. I’m currently using the “tensorflow” pipeline. When I run everything with the defaults, the bot doesn’t seem to work properly. Reading around the topics, I learned that Japanese needs special handling: the default tokenization doesn’t work, and I need something like “MeCab”. I’ve tried to look around but can’t see how to actually put them together.

Could anyone guide me a bit on this? It would be really appreciated.

You would need to create a custom tokenizer using that. It might be similar to the jieba one for Chinese; take a look here: rasa_nlu/ at master · RasaHQ/rasa_nlu · GitHub
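Following the jieba pattern, the core of a MeCab-based tokenizer is turning MeCab’s surface forms back into tokens with character offsets, since MeCab itself only returns the strings. Here is a sketch of just that offset bookkeeping; the splitter is injected as a function so the logic can be shown without MeCab installed. In the real component you would pass something like `lambda t: MeCab.Tagger("-Owakati").parse(t).split()` (assuming the mecab-python3 bindings) and return Rasa’s own `Token` class instead of this minimal stand-in:

```python
class Token:
    """Minimal stand-in for rasa_nlu's Token: surface text plus character offsets."""
    def __init__(self, text, offset):
        self.text = text
        self.offset = offset
        self.end = offset + len(text)

def tokenize(text, split_fn):
    """Rebuild offset-annotated tokens from a list of surface forms.

    MeCab only gives back the surface strings, so we locate each one in the
    original text, advancing a running cursor so that repeated words (very
    common in Japanese) get their correct positions rather than the first match.
    """
    tokens = []
    cursor = 0
    for surface in split_fn(text):
        start = text.index(surface, cursor)
        tokens.append(Token(surface, start))
        cursor = start + len(surface)
    return tokens

# With MeCab installed, the splitter would be (assumption, not tested here):
#   import MeCab
#   tagger = MeCab.Tagger("-Owakati")
#   tokens = tokenize(text, lambda t: tagger.parse(t).split())
```

The cursor is the part the jieba tokenizer also has to get right: without it, a sentence like すもももももももものうち would assign every も the offset of the first one, and entity spans would be wrong.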

Thank you @akelad for your response. I have followed the jieba_tokenizer and removed the parts that I believe don’t apply to MeCab, for example the dictionary handling.

I finally got this custom component (MeCab) working together with this configuration, config.yaml.
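The post links to the actual files, but for anyone following along, the setup would look roughly like this: the old rasa_nlu tensorflow_embedding pipeline spelled out, with the default tokenizer swapped for the custom one. The module path `my_module.mecab_tokenizer.MecabTokenizer` is a placeholder for wherever your component actually lives:

```yaml
language: "ja"

pipeline:
- name: "my_module.mecab_tokenizer.MecabTokenizer"   # placeholder path to the custom MeCab component
- name: "ner_crf"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
```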

I ran training as normal and it trained successfully, but when I try prediction it seems to be wrong: every prediction returns “None”.

I think there is some problem with the text transformation, but I haven’t been able to figure it out yet.
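This is only a guess at the “text transformation” problem, but one common pitfall with Japanese input is mixed full-width and half-width characters: ＲＡＳＡ and ｶﾀｶﾅ look like RASA and カタカナ but compare as different strings, so a model trained on one form can return nothing for the other. A quick normalization check, using only the standard library:

```python
import unicodedata

def normalize(text):
    """Fold full-width Latin and half-width katakana to their canonical forms (NFKC)."""
    return unicodedata.normalize("NFKC", text)
```

If predictions start working once both training data and incoming messages are normalized this way, the width variants were the culprit; otherwise the problem is likely in the tokenizer itself.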

I’ve linked the complete code here; maybe you could try it out. I think the tokenization is the problem, but I’ve already tried my best.

I hope to get this fixed here; I really want Rasa to work with Japanese text.

Thank you

I have added a custom component using the MeCab tokenizer. It works fine for me with Japanese text.