The chatbot is not identifying names (proper nouns) except for those that I’ve specified in the stories. How can I enable the chatbot to also understand other proper nouns, domain-specific names (like product names), numbers, and date-times?
There are several ways to achieve this. Defining custom entities inside the training data is one of them.
Actually, Rasa uses the CRFEntityExtractor to extract them.
You might want to consider extending your pipeline with the following elements:
- name: DucklingHTTPExtractor
  dimensions:
  - time
  - duration
  locale: de_DE
  timezone: Europe/Berlin
  url: http://localhost:8001
- name: SpacyEntityExtractor
  dimensions: ["PERSON", "LOC", "ORG", "PRODUCT"]
This enables Rasa to extract more information using predefined entities. If you want to read about the possibilities of Duckling, click here, likewise for spaCy.
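To make the Duckling settings above more concrete, here is a minimal sketch of the kind of request that could be sent to a Duckling server. It assumes the server exposes a `/parse` endpoint accepting the form fields `text`, `locale`, `tz`, and `dims`, as Duckling's example HTTP server does. The helper name `build_duckling_request` is hypothetical, and this is not meant to reproduce what DucklingHTTPExtractor does internally.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_duckling_request(text, url="http://localhost:8001", locale="de_DE",
                           timezone="Europe/Berlin",
                           dimensions=("time", "duration")):
    """Return (endpoint, form-encoded body) for a Duckling /parse call.

    Hypothetical helper; the field names are an assumption based on
    Duckling's example HTTP server.
    """
    payload = {
        "text": text,
        "locale": locale,
        "tz": timezone,
        # Duckling expects the dimensions as a JSON array in a single field.
        "dims": json.dumps(list(dimensions)),
    }
    return url.rstrip("/") + "/parse", urlencode(payload)


endpoint, body = build_duckling_request("morgen um 8 Uhr")
print(endpoint)
# Decode the body again to show which fields would be sent.
print(parse_qs(body))
```

With a Duckling server actually running at that URL, the body could be sent as a POST request, for example with `urllib.request`, and the JSON response would list the matched values per dimension.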
Did that help?
Thank you @JulianGerhard. This helped.
However, it now recognizes only a limited set of names (probably pre-learned), which is not enough. What would I have to do for it to start recognizing more names of people, products, domain-based products, and other domain-related entity values?
FYI, now my pipeline in config.yml looks like this:
pipeline:
- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"
- name: "SpacyNLP"
- name: "DucklingHTTPExtractor"
  url: "http://localhost:8000"
  dimensions: ["time", "number", "amount-of-money", "distance"]
- name: "SpacyEntityExtractor"
  dimensions: ["PERSON", "LOC", "ORG", "PRODUCT"]
Glad it helped. To meet your expectations, we need to clarify something:
Do you “simply” want a wider range of predefined entity types, or do you want to extend the extractors so that they recognize more variations of the entities they already cover?
Keep in mind that you can combine common entity extractors such as Duckling and spaCy with the CRFEntityExtractor to train entities of your own, and additionally use variations like the ones @amn41 proposed here (meta entities), @BeWe11 proposed here (composite entities), or @naoko proposed here (regex entities). That combination is really powerful and should enable you to extract almost everything you need.
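As a rough illustration of the idea behind regex and lookup-style entities, the sketch below matches a pattern and a small list of domain phrases in plain Python. All names and values here (`PRODUCT_LOOKUP`, `RAM_PATTERN`, `extract_entities`, the entity labels) are hypothetical. As far as I know, Rasa uses such patterns as features for the CRFEntityExtractor rather than extracting with them directly, so treat this as a simplification and not as Rasa's implementation.

```python
import re

# Hypothetical lookup table of domain phrases (illustration only).
PRODUCT_LOOKUP = ["laptop", "savings account"]

# Hypothetical pattern for memory sizes such as "16GB RAM" or "8 GB RAM".
RAM_PATTERN = re.compile(r"\b\d+\s?GB\s+RAM\b", re.IGNORECASE)


def extract_entities(text):
    """Return pattern and lookup matches in text, ordered by position."""
    entities = []
    for match in RAM_PATTERN.finditer(text):
        entities.append({"entity": "ram_size", "value": match.group(0),
                         "start": match.start(), "end": match.end()})
    for phrase in PRODUCT_LOOKUP:
        # Escape the phrase so it is matched literally, as a whole word.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            entities.append({"entity": "product", "value": match.group(0),
                             "start": match.start(), "end": match.end()})
    return sorted(entities, key=lambda e: e["start"])


for entity in extract_entities("I want a laptop with 16GB RAM"):
    print(entity["entity"], "->", entity["value"])
```

A fixed list like this only finds phrases it already contains. That is why trained entities, with annotated examples in the NLU data, are usually needed for open-ended values such as people's names.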
So it would be really helpful if you could explain to us what you are up to!
Thank you @JulianGerhard. Let me go through the links that you’ve provided. Meanwhile, here is my answer to your question, with more detail on what I’m trying to achieve.
Right now, with the above configuration, if I try “I am Rocky” it detects “Rocky” as an entity of type PERSON, but if I try a less common name, for example “I am Raj”, it does not detect any entity. I was expecting it to detect “Raj” as a PERSON entity, but it didn’t.
Similarly, I wanted it to detect products that are common nouns, such as “laptop”, but it doesn’t detect that as an entity.
Then I wanted it to detect domain-level products that are proper nouns, such as “Golden Previlege Cards” or “Acme firecrackers”, etc.
Finally, I wanted it to detect more domain-level phrases or texts such as “3GB RAM”, “Savings account”, “Red” color, etc.
How can I achieve these 4 things?
Hi @KnightCoder @JulianGerhard. Can you please clarify: if I need the bot to capture names, do I need to make changes only in the config.yml file, or do I need to make changes in other Rasa files too (like action.py or the NLU file…)?