Questions about using Rasa with spaCy

  1. How do I specify a user dictionary for spaCy's tokenizer in a non-English language? Tokenizer quality affects the whole NLP pipeline, including entity extraction. For example, how would we specify one for Chinese?
  1. How do I define custom entities in spaCy for use with Rasa?
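    import spacy
    # Load the Chinese pipeline; its tokenizer uses pkuseg for word segmentation.
    nlp = spacy.load('zh_core_web_sm')
    # Add custom words to the segmenter's user dictionary so it can keep them as single tokens.
    nlp.tokenizer.pkuseg_update_user_dict(['yyds', 'cx-4'])
    # Add a rule-based entity matcher to the pipeline.
    ruler = nlp.add_pipe("entity_ruler")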
    patterns = [
        {"label": "net_hot_word", "pattern": "yyds"},
        {"label": "car_name", "pattern": "cx-4"}