How does the n-gram based model work?

I have built a RASA NLU model with the following intents:


  • hi
  • hello
  • hey


  • how are you
  • are you okay
  • how are you doing

NLU Pipeline Configuration

- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
  analyzer: "word"
  stop_words: "english"
  min_ngram: 1
  max_ngram: 3
- name: "EmbeddingIntentClassifier"

My question is: when I test sentences like the ones below,

  • are you how
  • you doing are how

RASA NLU still predicts the intent as ask_how_doing with 91% confidence, and even when I give the sentence in its proper order the confidence is only 97%. I feel the tri-gram features are not given proper weight when the word order is broken, because n-grams of every length from 1 to 3 are used, and the unigrams alone still match.
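As far as I understand, CountVectorsFeaturizer is based on scikit-learn's CountVectorizer, so the behaviour can be reproduced outside of RASA. A minimal sketch (an assumption about what the featurizer does under the hood, using the training utterances from the question) shows why the scrambled sentence still scores highly: most of its active features are unigrams that also appear in the training data.

```python
from sklearn.feature_extraction.text import CountVectorizer

# training utterances of the ask_how_doing intent (from the question)
train = ["how are you", "are you okay", "how are you doing"]

# min_ngram: 1 / max_ngram: 3 corresponds to ngram_range=(1, 3)
vec = CountVectorsFeaturizer = CountVectorizer(analyzer="word", ngram_range=(1, 3))
X = vec.fit_transform(train)

# features active for the scrambled query vs. "how are you doing"
query_feats = set(vec.inverse_transform(vec.transform(["you doing are how"]))[0])
train_feats = set(vec.inverse_transform(X[2])[0])

# the unigrams 'how', 'are', 'you', 'doing' (plus the bigram 'you doing')
# survive the scrambling, so most of the feature overlap is preserved
print(query_feats & train_feats)
```

Only the word-order-sensitive bigrams and trigrams ("doing are", "are how", "you doing are", ...) fail to match, which is why the confidence drops only from 97% to 91%.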

But if you restrict the n-gram range to 2 to 3, as in the configuration below:

  • min_ngram: 2
  • max_ngram: 3

This solves the word-order problem, but single-word utterances like "hi", "hello", and "hey" can no longer be classified, because individual words (unigrams) are no longer featurized by the model.
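The unigram gap can be demonstrated the same way. With a minimum n-gram length of 2, a one-word message produces no word bigrams or trigrams at all, so its feature vector is empty. Again a scikit-learn sketch, under the assumption that CountVectorsFeaturizer behaves like CountVectorizer:

```python
from sklearn.feature_extraction.text import CountVectorizer

# all training utterances from the two intents above
train = ["hi", "hello", "hey",
         "how are you", "are you okay", "how are you doing"]

# min_ngram: 2 / max_ngram: 3 corresponds to ngram_range=(2, 3)
vec = CountVectorizer(analyzer="word", ngram_range=(2, 3))
vec.fit(train)

# a single word yields no 2- or 3-grams, so its vector is all zeros
q = vec.transform(["hi"])
print(q.nnz)  # 0 active features -- the classifier sees an empty input
```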

So, how do I tackle this kind of problem in the RASA NLU pipeline?