Rasa NLU for long questions

Hello everybody. I am a beginner with Rasa.

I use Rasa NLU, and when a question is long (more than 8-10 words), intent recognition is pretty bad.

Do you know how I can improve this?

Thank you for your time, any help would be much appreciated.

Hi @ubil,

How much training data do you have?

How is your pipeline configured in your config.yml?

Have you read the Rasa NLU in Depth blog series?

You gave us very little information, so it's hard to help you.

Hi @lindig,

Thank you for your time. I am sorry for the lack of information.

I have 15 intents to train, with 5-6 example questions per intent.

An example of one intent:

## intent:spa

- Où est le spa ?
- Comment je vais au spa ?
- Je veux aller au spa
- Le spa
- Y a t il un spa ?
- Avez-vous un espace Spa ?

My config.yml file looks like this:

language: "fr"  # your two-letter language code

pipeline:
  - name: "SpacyNLP"
    # language model to load
    model: "fr_core_news_md"
    # when retrieving word vectors, this will decide if the casing
    # of the word is relevant. E.g. `hello` and `Hello` will
    # retrieve the same vector, if set to `false`. For some
    # applications and models it makes sense to differentiate
    # between these two words, therefore setting this to `true`.
    case_sensitive: False
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "word"
    token_pattern: r'(?u)\b\w\w+\b'
    # remove accents during the preprocessing step
    strip_accents: None  # {'ascii', 'unicode', None}
    # list of stop words
    stop_words: {'french'}  # string {'english'}, list, or None (default)
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

@ubil, 5-6 examples per intent is very low. You could try adding more training data so your model has a larger base to train on.
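
One quick way to see how thin the data is might be to count the examples per intent and check how long they are. The snippet below is only a rough sketch: it assumes the Markdown training format shown earlier in the thread, the helper names (`examples_by_intent`, `summarize`) are made up for illustration and are not part of Rasa, and the sample text is just the `spa` intent pasted inline rather than read from a real file.

```python
import re
from collections import defaultdict

# The "spa" intent from this thread, pasted inline for illustration.
# In practice the text would be read from the training file.
NLU_MD = """
## intent:spa
- Où est le spa ?
- Comment je vais au spa ?
- Je veux aller au spa
- Le spa
- Y a t il un spa ?
- Avez-vous un espace Spa ?
"""


def examples_by_intent(md_text):
    """Group '- example' lines under the preceding '## intent:<name>' header."""
    intents = defaultdict(list)
    current = None
    for line in md_text.splitlines():
        line = line.strip()
        header = re.match(r"##\s*intent:(\S+)", line)
        if header:
            current = header.group(1)
        elif line.startswith("##"):
            current = None  # some other section, e.g. synonyms or regex
        elif line.startswith("- ") and current is not None:
            intents[current].append(line[2:].strip())
    return dict(intents)


def word_count(text):
    """Count word tokens, ignoring punctuation."""
    return len(re.findall(r"\w+", text))


def summarize(md_text):
    """Return the example count and min/max word length for each intent."""
    summary = {}
    for intent, examples in examples_by_intent(md_text).items():
        lengths = [word_count(e) for e in examples]
        summary[intent] = {
            "examples": len(examples),
            "min_words": min(lengths),
            "max_words": max(lengths),
        }
    return summary


if __name__ == "__main__":
    for intent, stats in summarize(NLU_MD).items():
        print(intent, stats)
```

For the `spa` intent the longest example has 6 words, so questions of 8-10 words or more are longer than anything the model saw for that intent. The word count is naive and does not strip entity annotations, so treat the numbers as approximate.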

Also, be careful with your intents. Do some of them have the same meaning? NLU models sometimes struggle to distinguish between similar intents, especially when there are very few training examples.
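
If it helps, a crude way to spot intents that might be hard to tell apart is to compare the words their examples share. This is only a heuristic sketch and not how Rasa itself compares intents: it uses plain Jaccard overlap of vocabularies, the function names are invented for illustration, and the `piscine` intent is a hypothetical one made up here (only `spa` comes from the thread).

```python
import re
from itertools import combinations

# "spa" examples come from the thread; "piscine" is a hypothetical intent
# invented purely to have a second intent to compare against.
INTENT_EXAMPLES = {
    "spa": ["Où est le spa ?", "Y a t il un spa ?", "Je veux aller au spa"],
    "piscine": ["Où est la piscine ?", "Y a t il une piscine ?"],
}


def vocabulary(examples):
    """Collect the lowercased word tokens used across all examples."""
    words = set()
    for example in examples:
        words.update(re.findall(r"\w+", example.lower()))
    return words


def jaccard(a, b):
    """Shared words divided by all words; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_report(intent_examples):
    """Return (intent_a, intent_b, score) tuples, highest overlap first."""
    vocab = {name: vocabulary(ex) for name, ex in intent_examples.items()}
    pairs = [
        (a, b, jaccard(vocab[a], vocab[b]))
        for a, b in combinations(sorted(vocab), 2)
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    for a, b, score in overlap_report(INTENT_EXAMPLES):
        print(f"{a} vs {b}: {score:.2f}")
```

A high score does not prove two intents mean the same thing, since intents that differ by a single keyword will naturally share most of their wording. It only points at pairs worth reading through by hand, and at intents that may need more varied examples.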

I don't know your exact use case, nor do I speak French. A good approach is to have a look at the Sara bot on GitHub.

OK @lindig, thank you for your answers. I will try that.

No, they are distinct intents.

I will take a look. Thank you again!