Does Lookup Table work in Chinese?

I have tried to use a lookup table in my project, following the instructions in the improving-entity-extraction demo. It works well when I use nlp_spacy, intent_entity_featurizer_regex, and ner_crf. In my case, however, I need to handle Chinese characters, so I changed the pipeline to:

language: "zh"
pipeline:

  • name: "nlp_mitie"
    model: "total_word_feature_extractor_zh.dat"
  • name: "tokenizer_jieba"
  • name: "ner_synonyms"
  • name: "intent_entity_featurizer_regex"
  • name: "ner_crf"
  • name: "intent_featurizer_mitie"
  • name: "intent_classifier_sklearn"

I used it to train on my Chinese data, but the lookup table does not seem to work in this case. Any suggestions?

I have tried using Chinese in Rasa with a lookup table. Here is my pipeline:

language: "zh"
pipeline:

  • name: "JiebaTokenizer"
    dictionary_path: "./jieba_userdict/dict.txt"
  • name: "RegexFeaturizer"
  • name: "CRFEntityExtractor"
  • name: "EntitySynonymMapper"
  • name: "CountVectorsFeaturizer"
  • name: "EmbeddingIntentClassifier"

Thanks for your reply. I have tried your pipeline, but it still can't extract the entities listed in my lookup.txt. Could you please send me your training data and dict.txt by email? Or do you have WeChat? My WeChat ID is sleeping__bear. Thank you very much.

The jieba_userdict is not necessary. Here is a simple sample of data/:

## intent:inform_info
- [不存在的女兒](book_name)
- [撒哈拉歲月](book_name)
- [解答之書](book_name)

## lookup:book_name

Lookup sample: lookup_tables/book_name.txt


With the pipeline above, you can try entering 白夜行. Rasa NLU will parse the intent as inform_info and extract 白夜行 as an entity of type book_name.
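To picture what a lookup table adds, here is a minimal sketch of the general idea: each entry becomes one alternative in a pattern, and every token that overlaps a match gets a feature flag the entity extractor can learn from. This is not Rasa's actual implementation. The function names, the lookup entries, and the hand-written token offsets are all assumptions for illustration, since the thread does not show the contents of book_name.txt.

```python
import re


def build_lookup_pattern(entries):
    """Compile lookup entries into a single alternation pattern.

    Longer entries come first so a long title wins over a shorter prefix.
    """
    ordered = sorted(entries, key=len, reverse=True)
    return re.compile("|".join(re.escape(entry) for entry in ordered))


def lookup_flags(text, token_spans, pattern):
    """Return one boolean per token: True if the token overlaps a lookup match.

    token_spans is a list of (start, end) character offsets into text.
    """
    matches = [m.span() for m in pattern.finditer(text)]
    return [
        any(m_start < end and start < m_end for m_start, m_end in matches)
        for start, end in token_spans
    ]


# Hypothetical lookup entries (assumed, not taken from the thread's file).
pattern = build_lookup_pattern(["白夜行", "解答之書"])

text = "我想看白夜行"
# Hand-written token offsets for 我 / 想 / 看 / 白夜行 (assumed tokenization).
token_spans = [(0, 1), (1, 2), (2, 3), (3, 6)]

flags = lookup_flags(text, token_spans, pattern)
print(flags)
```

Only the last token overlaps a lookup entry here, so only that token carries the flag. The flag is just a feature: the model still needs training examples to learn that the flag signals a book_name entity.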

Thanks for your reply. You really helped me a lot. Your data runs fine. My training data was not tokenized well, and I finally figured it out by improving my training data: I added some special words to my user dictionary and added more training examples to help Rasa recognize them. The problem was in the data, not the pipeline. I was looking in the wrong direction.
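A quick way to spot this kind of tokenization problem is to check whether each annotated entity starts and ends on a token boundary. The sketch below is plain Python and does not call jieba or Rasa. The helper names are hypothetical, and both tokenizations are hand-written examples of what a tokenizer might produce with and without the title in its user dictionary.

```python
def token_spans(tokens):
    """Compute (start, end) offsets for tokens that concatenate to the full text.

    Assumes no whitespace between tokens, as is typical for Chinese text.
    """
    spans, pos = [], 0
    for tok in tokens:
        spans.append((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def entity_is_aligned(tokens, ent_start, ent_end):
    """True if the entity span begins and ends exactly on token boundaries."""
    spans = token_spans(tokens)
    starts = {start for start, _ in spans}
    ends = {end for _, end in spans}
    return ent_start in starts and ent_end in ends


text = "我想看白夜行"
ent_start = text.index("白夜行")
ent_end = ent_start + len("白夜行")

# Assumed output when the title is in the user dictionary.
good_tokens = ["我", "想", "看", "白夜行"]
# Assumed output when the tokenizer splits the title badly.
bad_tokens = ["我", "想", "看白", "夜行"]

print(entity_is_aligned(good_tokens, ent_start, ent_end))
print(entity_is_aligned(bad_tokens, ent_start, ent_end))
```

If the second case shows up in your training data, adding the entity text to the tokenizer's user dictionary, as described above, is one way to get the boundaries to line up.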