Entity not extracted if its value does not appear in the training data


I have an intent with one entity. The entity has around 20 possible values, and each value has approximately 20 synonyms. Using tensorflow_embedding, I trained the model on ~500 examples. It identifies the intent and entity with high confidence if it has seen the entity value in the examples, but otherwise it misses. Example (the entity value / synonym hierarchy is below):

- auto_and_transport
    -- auto and transport
    -- auto
    -- transport
- food_and_groceries
    -- food
    -- groceries
    -- food and groceries

If I provide at least one example for each entity value (not necessarily for every synonym), everything works. But if I provide examples only for auto_and_transport and none for food_and_groceries, then Rasa does not extract the correct entity value from user input containing a value it has not seen in the examples. Am I missing something?
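
One way to check for this kind of gap is to count how many training examples carry each entity value. The sketch below is only an illustration. It assumes training examples shaped like Rasa NLU's JSON format (a text plus a list of entities with start, end, value, and entity keys). The intent name, the entity name category, the sample sentences, and the helper functions are all hypothetical.

```python
from collections import Counter


def make_example(text, surface, value, entity="category", intent="spending_by_category"):
    """Build one training example; offsets are derived from the surface string."""
    start = text.index(surface)
    return {
        "text": text,
        "intent": intent,
        "entities": [
            {"start": start, "end": start + len(surface), "value": value, "entity": entity}
        ],
    }


def count_entity_values(examples, entity):
    """Count how many annotations exist for each value of the given entity."""
    counts = Counter()
    for example in examples:
        for ent in example.get("entities", []):
            if ent["entity"] == entity:
                counts[ent["value"]] += 1
    return counts


def missing_values(examples, entity, expected_values):
    """Return the expected values that never occur in the training examples."""
    counts = count_entity_values(examples, entity)
    return [value for value in expected_values if counts[value] == 0]


# Hypothetical data: only auto_and_transport is covered.
examples = [
    make_example("how much did I spend on auto", "auto", "auto_and_transport"),
    make_example("show my transport costs", "transport", "auto_and_transport"),
]
expected_values = ["auto_and_transport", "food_and_groceries"]

print(missing_values(examples, "category", expected_values))
```

With the data above, the check reports food_and_groceries as the value with no examples, which matches the situation described in the question.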

Have you tried removing the low feature from the ner_crf config? Specifically, from the array in the middle.

Hi @znat, according to the documentation, the default config looks like this:

features: [["low", "title"], ["bias", "suffix3"], ["upper", "pos", "pos2"]]

So I suppose it’s already removed, right?
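
For context, the three inner lists in the ner_crf features setting describe the token before, the current token, and the token after. The sketch below shows what "removing low from the middle array" amounts to, using plain Python lists rather than the Rasa API. The drop_feature helper and the second config are hypothetical and only there for comparison.

```python
def drop_feature(features, position, name):
    """Return a copy of features with `name` removed from the list at `position`."""
    return [
        [f for f in group if not (i == position and f == name)]
        for i, group in enumerate(features)
    ]


CURRENT_TOKEN = 1  # index of the middle list (features of the current token)

# The config quoted from the documentation in this thread.
quoted = [["low", "title"], ["bias", "suffix3"], ["upper", "pos", "pos2"]]

# A hypothetical config that does have "low" on the current token.
hypothetical = [["low", "title"], ["bias", "low", "suffix3"], ["upper", "pos", "pos2"]]

# The quoted config has no "low" in the middle list, so nothing changes.
print(drop_feature(quoted, CURRENT_TOKEN, "low") == quoted)

# Here "low" is removed from the middle list only.
print(drop_feature(hypothetical, CURRENT_TOKEN, "low"))
```

The idea behind the suggestion is that a lowercase-token feature on the current token lets the CRF lean on the exact words it saw in training, so dropping it pushes the model toward context and shape features instead.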

OK, so it turned out I had a bug in my training file. Everything works like a charm now.