Hi,
I am using Rasa 2.2.5 to build a chatbot. In the training data, I used the lookup table feature to cover an entity (‘cities’) with 100 values. For the intent ‘FindCityIntent’, I generated examples using only a few of those 100 values. I also created an intent (‘outofscopeIntent’) to handle utterances that contain out-of-vocabulary words. Here is what outofscopeIntent looks like, listing only a few examples:
- intent: outofscopeIntent
  examples: |
    - oov oov oov
    - oov
    - what is oov
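For context, the lookup table and FindCityIntent follow the usual Rasa 2.x training data layout. The snippet below is only a minimal sketch of that layout: the city names and example sentences are placeholders I made up for illustration, not my actual data. The one thing it assumes about the real file is that the lookup name matches the entity name (‘cities’).

```yaml
version: "2.0"

nlu:
# A few annotated examples; only a handful of the 100 cities appear here.
# (Placeholder sentences, not the real training data.)
- intent: FindCityIntent
  examples: |
    - 我想去[北京](cities)
    - 帮我查一下[上海](cities)

# Lookup table named after the entity; the real table has 100 values.
- lookup: cities
  examples: |
    - 北京
    - 上海
    - 广州
```

With this layout, a city such as 广州 is in the lookup table but never appears in a FindCityIntent example, which is the kind of query that ends up misclassified.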
But when I enter queries containing city names that are in the lookup table but not in the training examples, the model predicts all of them as outofscopeIntent, whereas I expected “FindCityIntent”.
It looks like the model treats the city names as oov and therefore predicts outofscopeIntent. Is there a way to fix this problem? Here is my pipeline:
language: zh
pipeline:
  - name: JiebaTokenizer
    dictionary_path: './data/jieba_dict/'
    "intent_tokenization_flag": False
  - name: RegexFeaturizer
    case_sensitive: False
    use_word_boundaries: False
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "word"
    min_ngram: 1
    max_ngram: 3
    max_df: 6
    OOV_token: "oov"
    stop_words: english
  - name: RegexEntityExtractor
    case_sensitive: False
    use_lookup_tables: True
    use_regexes: True
    "use_word_boundaries": False
  - name: "CRFEntityExtractor"
    BILOU_flag: True
  - name: DIETClassifier
    epochs: 100
    entity_recognition: False
    random_seed: 666
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100