Rasa-NLU not extracting entities

Hi All.

I am new to Rasa, so I decided to build a ‘weatherbot’ from scratch as a learning exercise. I have run into a problem with Rasa NLU and would welcome suggestions from anybody with more Rasa experience than I have.

My problem is that rasa_nlu does not extract entities from input statements when I test it. If I ask my little bot ‘what is the weather in Paris’ using interpreter.parse() in a Jupyter notebook, ‘Paris’ is not recognised as an entity at all: the output dump contains the line “entities”:[ ], showing that nothing was extracted.

It should be, of course, but as this is the first time I have ‘played’ with the Rasa stack, I don't have the experience to know why it is not, so any suggestions are welcome. I can find nothing relevant in the Rasa docs or online. And since the first element in the chain (rasa_nlu) is not working properly, I cannot even test the rest of the chain.

I am using rasa_nlu 0.14.6 and rasa_core 0.13.8, along with spaCy and sklearn, and they have all loaded properly. I also have Anaconda and the usual Python DS stack installed, as well as TensorFlow and the underlying CUDA (etc) stack - this is a machine that I regularly use for (other) ML and analytics purposes, so I know that all of these are working correctly too.

I have created a Markdown NLU training data file, and needless to say, Paris is one of the named entities in it. When I run trainer.train() against it, the output statistics tell me that the trainer has found 136 intent examples with 10 distinct intents, and 52 entity examples with 2 distinct entities, ‘location’ and (user) ‘name’. So the annotations seem to be picked up OK at this stage.
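A rough way to double-check those counts independently of the trainer is to count the inline annotations in the training file directly. The sketch below assumes the usual Markdown NLU format, where entities are marked as [Paris](location). The sample data, the intent names and the helper name count_entities are made up for illustration and are not taken from the actual training file.

```python
import re
from collections import Counter

# Matches inline annotations such as [Paris](location) or [NYC](location:New York).
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def count_entities(markdown_text):
    """Count annotated entity mentions per entity type in Markdown NLU data."""
    counts = Counter()
    for line in markdown_text.splitlines():
        line = line.strip()
        if not line.startswith("- "):
            continue  # skip headings and blank lines
        for _text, label in ANNOTATION.findall(line):
            # A label may carry a synonym value after a colon; keep only the type.
            entity_type = label.split(":", 1)[0]
            counts[entity_type] += 1
    return counts


# Hypothetical sample data, for illustration only.
SAMPLE = """\
## intent:ask_weather
- what is the weather in [Paris](location)
- will it rain in [London](location) tomorrow
- what is the weather like

## intent:introduce
- my name is [Rick](name)
"""

if __name__ == "__main__":
    print(dict(count_entities(SAMPLE)))
```

If the totals from a check like this disagree with the trainer's statistics, some annotations are probably malformed (for example, a missing bracket) and are being read as plain text.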

There is another rasa_nlu oddity with intents too: even though the exact phrase is in the training file, the confidence level returned by the interpreter.parse() call is only 0.4763, not the 90% or more that I would have expected. It looks like the classification is not as good as it should be, for unknown reasons…
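To make both symptoms easy to spot when testing several phrases, the parse output can be summarised with a small helper. This is only a sketch: it assumes the result is a dict with "intent" (holding "name" and "confidence") and "entities" keys, which is the shape described above. The helper name summarise_parse, the 0.9 threshold and the intent name ask_weather in the sample are assumptions for illustration.

```python
def summarise_parse(result, min_confidence=0.9):
    """Summarise a dict shaped like the output of interpreter.parse()."""
    intent = result.get("intent") or {}
    entities = result.get("entities") or []
    confidence = intent.get("confidence", 0.0)
    return {
        "intent": intent.get("name"),
        "confidence": confidence,
        "low_confidence": confidence < min_confidence,
        "entities": [(e.get("entity"), e.get("value")) for e in entities],
        "no_entities": not entities,
    }


# Hypothetical result mirroring the symptoms described above.
sample_result = {
    "text": "what is the weather in Paris",
    "intent": {"name": "ask_weather", "confidence": 0.4763},
    "entities": [],
}

report = summarise_parse(sample_result)

if __name__ == "__main__":
    print(report)
```

With a real interpreter, the same helper could be called as summarise_parse(interpreter.parse("what is the weather in Paris")).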

Hello @rickturner646,

What does your config.yml file look like? Have you added the entity extractor of your choice to the pipeline? It should look something like this, for example:

language: en

pipeline:
  - name: "SpacyNLP"
  - name: "SpacyTokenizer"
  - name: "SpacyFeaturizer"
  - name: "SklearnIntentClassifier"
  - name: "CRFEntityExtractor"
  - name: "EntitySynonymMapper"

policies:
  - name: MappingPolicy
  - name: MemoizationPolicy
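As a quick check, the pipeline from a config like the one above can be scanned for an entity extractor. This is a minimal sketch and not part of Rasa: the helper entity_extractors is hypothetical, the pipeline is written out by hand as a list of dicts rather than loaded from config.yml, and the set of extractor names is an assumption that is not exhaustive and may differ between versions.

```python
# Component names treated as entity extractors. This set is an assumption for
# illustration; it is not exhaustive and may differ between Rasa versions.
KNOWN_EXTRACTORS = {
    "CRFEntityExtractor",
    "SpacyEntityExtractor",
    "MitieEntityExtractor",
    "DucklingHTTPExtractor",
    # Older short names used by earlier rasa_nlu releases.
    "ner_crf",
    "ner_spacy",
    "ner_mitie",
    "ner_duckling_http",
}


def entity_extractors(pipeline):
    """Return the names of pipeline components that are known entity extractors."""
    return [c["name"] for c in pipeline if c.get("name") in KNOWN_EXTRACTORS]


# The example pipeline, written out by hand as a list of dicts.
pipeline = [
    {"name": "SpacyNLP"},
    {"name": "SpacyTokenizer"},
    {"name": "SpacyFeaturizer"},
    {"name": "SklearnIntentClassifier"},
    {"name": "CRFEntityExtractor"},
    {"name": "EntitySynonymMapper"},
]

if __name__ == "__main__":
    found = entity_extractors(pipeline)
    print(found if found else "No entity extractor in the pipeline")
```

If the list comes back empty for your own pipeline, that would explain an empty "entities" list in the parse output.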