I am trying to get Rasa NLU to identify entities. My pipeline is below:
pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_featurizer_spacy"
- name: "intent_classifier_sklearn"
- name: "ner_crf"
- name: "ner_synonyms"
I have given sufficient examples to identify chicken and pizza as an entity called ‘dish’. My intention was that if I just define another value, say burger, as a ‘dish’, include it in my data.json file, and retrain (without repeating all the exhaustive examples I gave for chicken and pizza), the model should start picking up burger in the same contexts as chicken and pizza.
For example, if I say "I am looking for an Italian burger recipe", it should identify burger as an entity of type ‘dish’. Currently it does not.
How can this be achieved so that the chatbot generalizes on entities? Does ner_crf need any tuning?
Update: if I give some example data for burger (I gave around 3 examples labelling burger as the entity ‘dish’), it seems to generalize well for burger in other kinds of sample conversations.
But my issue still remains: do I need to provide sample conversations for every value of the entity ‘dish’? I wanted it to generalize from just defining a value as a ‘dish’, and expected it to fit the already trained data where the ‘dish’ entity appears in context.
For example:
I am looking for spicy burger recipe
It should understand that burger is a ‘dish’, because we have trained on a similar example for chicken and we have defined burger as a ‘dish’.
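For reference, the handful of burger examples mentioned in the update could be generated roughly like this. This is only a sketch: it assumes the Rasa NLU JSON training format (`rasa_nlu_data` / `common_examples`), and the intent name `recipe_search`, the helper `make_example`, and the sentence templates are made up for illustration.

```python
import json


def make_example(text, value, entity="dish", intent="recipe_search"):
    """Build one training example, computing the entity's character offsets.

    `intent` is a hypothetical intent name, not one taken from the thread.
    """
    start = text.index(value)  # raises ValueError if the value is missing
    return {
        "text": text,
        "intent": intent,
        "entities": [
            {
                "start": start,
                "end": start + len(value),
                "value": value,
                "entity": entity,
            }
        ],
    }


# Illustrative sentence templates; the contexts match the ones used for
# chicken and pizza in the question.
templates = [
    "I am looking for spicy {} recipe",
    "I am looking for an Italian {} recipe",
    "show me a {} recipe",
]

examples = [make_example(t.format("burger"), "burger") for t in templates]
training_data = {"rasa_nlu_data": {"common_examples": examples}}

print(json.dumps(training_data, indent=2))
```

The printed JSON could then be merged into the existing data.json before retraining.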
While Chatito can solve the data generation problem, I still think the original question stands. Must we provide training data for entity extraction covering all the different combinations? Is it possible for some component to just pick up new values? Is it based on word length? For example, if I have sufficient training data for the word burger (6 characters), will mutton at least be recognized without extra training data?
Thanks … yeah, the issue is not with data generation but with making an entity (i.e. ‘dish’ here) generalize, so that it works in contexts where the other ‘dish’ values have been trained.
ner_crf learns weights for entity recognition based on features such as the token's prefix and suffix and the words before and after it. When we provide enough data for entity recognition with a CRF, it will generalize well enough to identify new entity values, as in this case. If you want to try training the CRF on its own, you can use sklearn-crfsuite, where you can tune parameters such as the L1 and L2 regularization to best fit the model to your data and your purpose.
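To make the feature idea concrete, here is a rough sketch of the kind of per-token features described above. It is not the actual ner_crf feature set; the function `token_features` and the feature names are invented for illustration.

```python
def token_features(tokens, i):
    """Return simple CRF-style features for the token at position i.

    The feature names here are illustrative, not the ones ner_crf uses.
    """
    word = tokens[i].lower()
    return {
        "prefix2": word[:2],
        "suffix3": word[-3:],
        # "BOS"/"EOS" mark the beginning and end of the sentence.
        "prev_word": tokens[i - 1].lower() if i > 0 else "BOS",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "EOS",
    }


seen = "I am looking for spicy chicken recipe".split()
unseen = "I am looking for spicy burger recipe".split()

f_seen = token_features(seen, seen.index("chicken"))
f_unseen = token_features(unseen, unseen.index("burger"))

# Features that the unseen word shares with the trained one.
shared = sorted(k for k in f_seen if f_seen[k] == f_unseen[k])
print(shared)  # ['next_word', 'prev_word']
```

The prefix and suffix differ, but the context features ("spicy" before, "recipe" after) are identical, which is what lets a well-trained CRF label an unseen word as a ‘dish’. In sklearn-crfsuite, the L1 and L2 regularization strengths are the `c1` and `c2` arguments of `sklearn_crfsuite.CRF`.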
I recently started using lookup tables. However, I suddenly started getting ill-defined F-score warnings from the sklearn intent classifier when my lookup table is large. What could be the reason? I have opened a discussion over here.