Thank you for your time, and sorry for the lack of information in my earlier post.
I have 15 intents to train, with 5-6 example questions per intent.
Here is an example of one intent:
## intent:spa
- Où est le spa ?
- Comment je vais au spa ?
- Je veux aller au spa
- Le spa
- Y a t il un spa ?
- Avez-vous un espace Spa ?
My config.yml file looks like this:
language: "fr" # your two-letter language code
pipeline:
  - name: "SpacyNLP"
    # language model to load
    model: "fr_core_news_md"
    # when retrieving word vectors, this will decide if the casing
    # of the word is relevant. E.g. `hello` and `Hello` will
    # retrieve the same vector, if set to `false`. For some
    # applications and models it makes sense to differentiate
    # between these two words, therefore setting this to `true`.
    case_sensitive: False
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "word"
    token_pattern: r'(?u)\b\w\w+\b'
    # remove accents during the preprocessing step
    strip_accents: None # {'ascii', 'unicode', None}
    # list of stop words
    stop_words: {'french'} # string {'english'}, list, or None (default)
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
@ubil, 5-6 examples per intent is very low. You could try adding more training data so your model has a bigger base to train on.
Also be careful with your intents. Do some of them have the same meaning? NLU models sometimes find it hard to distinguish between similar intents, especially when there are very few examples for each.
I don't know your exact use case, nor do I speak French. A good approach is to have a look at the Sara Bot on GitHub.
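To see which intents are short on data, a small script along these lines could count the examples per intent. This is only a rough sketch: it assumes the training data is in the Markdown format shown above (`## intent:<name>` followed by `- ` lines), and the `count_examples` helper and the `MIN_EXAMPLES` threshold are made up for illustration, not part of Rasa or a recommended minimum.

```python
# Hypothetical threshold for flagging thin intents; pick whatever suits your bot.
MIN_EXAMPLES = 10


def count_examples(nlu_markdown: str) -> dict:
    """Count training examples per intent in Markdown-format NLU data."""
    counts = {}
    current = None
    for raw_line in nlu_markdown.splitlines():
        line = raw_line.strip()
        if line.startswith("## intent:"):
            # Start of a new intent section.
            current = line[len("## intent:"):].strip()
            counts.setdefault(current, 0)
        elif line.startswith("##"):
            # Any other section (synonyms, regex, lookup tables) is ignored.
            current = None
        elif line.startswith("- ") and current is not None:
            counts[current] += 1
    return counts


sample = """
## intent:spa
- Où est le spa ?
- Comment je vais au spa ?
- Je veux aller au spa
- Le spa
- Y a t il un spa ?
- Avez-vous un espace Spa ?
"""

counts = count_examples(sample)
# Intents with fewer examples than the (arbitrary) threshold.
low = [name for name, n in counts.items() if n < MIN_EXAMPLES]

for name, n in counts.items():
    print(f"{name}: {n} examples")
print("Intents below threshold:", low)
```

Running it over the whole NLU file instead of `sample` gives a quick list of the intents that most need extra examples.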