Questions about using Rasa with spaCy

  1. How do I specify a spaCy user dictionary for a non-English language? Tokenizer quality affects the rest of the NLP pipeline, such as entity extraction. For example, this is how we specify it for Chinese:

    nlp.tokenizer.pkuseg_update_user_dict(['yyds', 'cx-4'])

  2. How do I specify custom entities in spaCy for Rasa? This is what I have so far:

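    import spacy

    # Load the small Chinese pipeline
    nlp = spacy.load('zh_core_web_sm')
    # Register custom words with the pkuseg user dictionary
    nlp.tokenizer.pkuseg_update_user_dict(['yyds', 'cx-4'])
    # Append an entity ruler (by default it is added at the end of the pipeline)
    ruler = nlp.add_pipe("entity_ruler")
    # Phrase patterns mapping each custom word to an entity label
    patterns = [
        {"label": "net_hot_word", "pattern": "yyds"},
        {"label": "car_name", "pattern": "cx-4"}
    ]

    ruler.add_patterns(patterns)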
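
Since the same words appear in both the user dictionary and the entity patterns, one option might be to derive both from a single mapping so they cannot drift apart. The following is only a sketch: `CUSTOM_TERMS`, `build_user_dict`, `build_patterns`, and `apply_custom_terms` are hypothetical names introduced here, and `apply_custom_terms` assumes `nlp` is a Chinese spaCy pipeline whose tokenizer uses pkuseg, as in the snippet above.

```python
# Hypothetical single source of truth: custom term -> entity label.
CUSTOM_TERMS = {
    "yyds": "net_hot_word",
    "cx-4": "car_name",
}


def build_user_dict(terms):
    """Return the words to register with the pkuseg user dictionary."""
    return list(terms)


def build_patterns(terms):
    """Return entity ruler phrase patterns, one per custom term."""
    return [{"label": label, "pattern": term} for term, label in terms.items()]


def apply_custom_terms(nlp, terms):
    """Register the terms with the tokenizer and add an entity ruler.

    Assumes `nlp` is a Chinese pipeline whose tokenizer uses pkuseg,
    for example one loaded with spacy.load("zh_core_web_sm").
    """
    nlp.tokenizer.pkuseg_update_user_dict(build_user_dict(terms))
    ruler = nlp.add_pipe("entity_ruler")
    ruler.add_patterns(build_patterns(terms))
    return nlp
```

With spaCy and the model installed, calling `apply_custom_terms(spacy.load('zh_core_web_sm'), CUSTOM_TERMS)` should set up the same dictionary and patterns as the snippet above.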