Rasa with TensorFlow pipeline detecting entities that were not in the training data

rasa-nlu
(Varun Jain) #1

Rasa version : 0.1.1

Python version : 3.7

Operating system (windows, osx, …): osx

Issue :

I am building an application that extracts entities from a grocery list. For the first version I am testing with only a couple of products and brands.

My training data was generated with Chatito, using these query shapes:

%[inventory_count]('training': '0.99', 'testing': '')
    order @[count] @[units?] of @[brand?] @[product]
    @[count] @[units?] @[product] by @[brand]
    @[brand] @[product] is @[count] @[units?]
    add @[brand] @[product] @[count] @[units?]
    @[brand] @[product] @[count] @[units?] on floor

The product list is something like this:

spinach 5 oz
spinach 5 ounces
spinach 16 oz
spinach 16 ounces
spinach
spinach sixteen ounces
spinach five ounces
gala apples
gala apple
apples
apple

The trained model detects all the products in the list above, but it also tags arbitrary products like the ones below as entities. The input text may contain errors, so I wanted to test the negative case:

spinach 50 oz
call apples

Is this overfitting? What are the possible solutions?
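
To make the negative case concrete, here is a minimal sketch of the check I have in mind: keep a "product" entity only if its value appears in the known product list. It assumes entities arrive as dicts with "entity" and "value" keys, as in the Rasa NLU parse output. The names KNOWN_PRODUCTS and filter_known_products are hypothetical and are not part of Rasa.

```python
# Hypothetical post-processing step, not a Rasa API: drop "product"
# entities whose value is not in the known product list.
KNOWN_PRODUCTS = {
    "spinach 5 oz", "spinach 5 ounces", "spinach 16 oz", "spinach 16 ounces",
    "spinach", "spinach sixteen ounces", "spinach five ounces",
    "gala apples", "gala apple", "apples", "apple",
}


def filter_known_products(entities, known=KNOWN_PRODUCTS, entity_name="product"):
    """Return entities, keeping product entities only if their value is known."""
    kept = []
    for ent in entities:
        if ent.get("entity") != entity_name:
            # Other entity types (count, units, brand) pass through untouched.
            kept.append(ent)
            continue
        # Normalize case and whitespace before the lookup.
        value = " ".join(str(ent.get("value", "")).lower().split())
        if value in known:
            kept.append(ent)
    return kept


if __name__ == "__main__":
    # Assumed shape of the "entities" list in a parse result.
    parsed = [
        {"entity": "product", "value": "spinach 50 oz"},
        {"entity": "product", "value": "gala apples"},
        {"entity": "count", "value": "3"},
    ]
    print(filter_known_products(parsed))
```

A lookup like this rejects "spinach 50 oz" and "call apples", but it also rejects any legitimate product that is missing from the list, so it only makes sense when the catalogue is closed.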

Content of configuration file (config.yml) :

language: "en"

pipeline:
- name: "tokenizer_whitespace"
- name: "ner_crf"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  droprate: 0.5
  epochs: 300
  C2: 0.02