How to improve NLU accuracy?

Hi everyone!

I’ve been using Rasa on a project for several months, and I still have problems with the accuracy of the NLU.

In short, the project is a chatbot factory for specific entities (client organizations): they fill in a spreadsheet, and we convert that spreadsheet into Rasa files (domain.yml, stories.md and nlu.json).

One of these organizations is testing with us, and we have a major problem: despite several changes to the configuration file, the NLU recognizes even exact sentences with low accuracy (around 50%).

Do you have any idea how to improve this? We only use Rasa for intent recognition (no entity extraction).

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: fr
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: 'char_wb'
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    entity_recognition: false
    epochs: 20

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
    max_history: 1
  - name: TEDPolicy
    max_history: 1
    epochs: 1
  - name: FallbackPolicy
    nlu_threshold: 0.6
    core_threshold: 0.5
    fallback_action_name: 'utter_phrase_hors_sujet_0'

Some useful data:

We are currently training the model with more than 800 intents and more than 3000 examples (this number is growing).

After training the DIETClassifier for 20 epochs, the reported accuracy is around 0.98x (which seems quite good to me).

Rasa version: 1.10

Thanks.

I’d say 20 epochs is not enough. The training accuracy that you see is an approximate accuracy computed on subsampled examples. For such a big dataset, I’d try 200-300 epochs.
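As a rough sketch of that suggestion, only the `epochs` value of the classifier in the config above would change. The value 300 is just one pick from the suggested range, not a tuned number:

```yaml
pipeline:
  # ... tokenizer and featurizers unchanged ...
  - name: DIETClassifier
    entity_recognition: false
    # Assumption: 300 is taken from the suggested 200-300 range; tune it on held-out data
    epochs: 300
```

Retraining takes longer with more epochs, so it may be worth trying 200 first and comparing results on sentences that are not in the training data.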

Another thing that might help (this depends on your domain, which I’m assuming is general for now) is adding spaCy embeddings. They add extra features to the pipeline that might help the DIET classifier. I would go for either fr_core_news_md or fr_core_news_lg, as described in the spaCy docs.
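A minimal sketch of what that could look like for Rasa 1.10, assuming the fr_core_news_md model is already installed in the same environment. The spaCy tokenizer replaces WhitespaceTokenizer here because SpacyFeaturizer relies on it; the other components are copied from the original config, and the epoch count is an illustrative value:

```yaml
language: fr
pipeline:
  # Assumption: the model was installed beforehand, e.g. with
  # python -m spacy download fr_core_news_md
  - name: SpacyNLP
    model: fr_core_news_md
  - name: SpacyTokenizer
  # Adds pretrained word-vector features on top of the sparse ones below
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: 'char_wb'
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    entity_recognition: false
    # Illustrative value, not a tuned one
    epochs: 300
```

Whether the pretrained vectors help depends on how close the bot's vocabulary is to general French, so the result should be checked against the current pipeline rather than assumed.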

I agree with @Ghostvv that it would be best to add more epochs, especially when you add these features.

Could you share more about these intents? Are they FAQs? If so, the response selector might also help.
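If the intents do turn out to be FAQ-style, a hedged sketch of the config side would be to append a ResponseSelector after the intent classifier. This assumes the FAQ examples are regrouped under a retrieval intent (named with a prefix such as `faq/...`, which is a hypothetical name here) and that matching responses are provided in the training data; only the pipeline fragment is shown:

```yaml
pipeline:
  # ... tokenizer, featurizers and DIETClassifier as before ...
  - name: ResponseSelector
    # Illustrative value; the selector is trained separately from DIET
    epochs: 100
```

With this setup the classifier only has to pick the single retrieval intent, and the selector chooses among the individual answers, which can reduce the number of top-level intents considerably.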

Are epochs used for NLU? I thought they were only used for stories / Rasa Core.

Yes, epochs are used by both Core and NLU.
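To make the distinction concrete, here are the two places where `epochs` appears in the config posted at the top of the thread, with the original values:

```yaml
pipeline:
  - name: DIETClassifier
    entity_recognition: false
    # NLU: number of passes over the intent examples
    epochs: 20

policies:
  - name: TEDPolicy
    max_history: 1
    # Core: number of passes over the stories
    epochs: 1
```

The two settings are independent, so raising one does not change how long the other model trains.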