Hello,
I work on a project that uses Rasa NLU. My nlu_data file has 1000 intents with about 8 samples per intent. Is my model overfitting if all train metrics are 1.000?
My cross-validation results with folds=10:
- CV evaluation (n=10)
- Intent evaluation results
- train Accuracy: 1.000 (0.000)
- train F1-score: 1.000 (0.000)
- train Precision: 1.000 (0.000)
- test Accuracy: 0.905 (0.027)
- test F1-score: 0.883 (0.033)
- test Precision: 0.874 (0.037)
My cross-validation results with folds=5:
- CV evaluation (n=5)
- Intent evaluation results
- train Accuracy: 1.000 (0.000)
- train F1-score: 1.000 (0.000)
- train Precision: 1.000 (0.000)
- test Accuracy: 0.886 (0.017)
- test F1-score: 0.871 (0.017)
- test Precision: 0.885 (0.016)
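To make the train/test gap concrete, here is a small Python sketch that subtracts the mean test scores from the mean train scores reported above. It is plain arithmetic and not part of Rasa; the names `results`, `generalization_gap` and `heldout_per_intent` are my own, and the held-out estimate assumes exactly 8 samples per intent split evenly across folds, which is only an approximation of my data.

```python
# Mean scores copied from the cross-validation output above.
results = {
    10: {
        "train": {"accuracy": 1.000, "f1": 1.000, "precision": 1.000},
        "test": {"accuracy": 0.905, "f1": 0.883, "precision": 0.874},
    },
    5: {
        "train": {"accuracy": 1.000, "f1": 1.000, "precision": 1.000},
        "test": {"accuracy": 0.886, "f1": 0.871, "precision": 0.885},
    },
}

# Assumption: roughly 8 samples per intent, as stated in the question.
SAMPLES_PER_INTENT = 8


def generalization_gap(train, test):
    """Return train minus test for every metric, rounded to 3 decimals."""
    return {metric: round(train[metric] - test[metric], 3) for metric in train}


def heldout_per_intent(samples_per_intent, folds):
    """Rough number of held-out samples per intent in one fold."""
    return samples_per_intent / folds


gaps = {
    folds: generalization_gap(scores["train"], scores["test"])
    for folds, scores in results.items()
}

for folds in sorted(gaps):
    held_out = heldout_per_intent(SAMPLES_PER_INTENT, folds)
    print(f"folds={folds}: gap={gaps[folds]}, ~{held_out:.1f} held-out samples per intent")
```

With these numbers the accuracy gap is about 0.10 for folds=10 and about 0.11 for folds=5, and each test fold holds on average fewer than two samples per intent, so I am not sure how much of the gap reflects overfitting and how much reflects the tiny per-intent test sets.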
My nlu_config pipeline:
pipeline:
- name: "tokenizer_whitespace"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  intent_tokenization_flag: true