Rasa evaluation does not produce output files

IgNoRaNt23 · August 5, 2019, 11:24am

Rasa Version: rasa==1.1.5 rasa-sdk==1.1.0

Hi, I'm trying to evaluate my model’s NLU components and found this guide.

Unfortunately, almost all additional input flags are ignored.

If I run

rasa test nlu -u evaluate/examples.md -m models/20190805-094203.tar.gz --report evaluate/ --errors ./evaluate/ --histogram ./evaluate/ --confmat ./evaluate/

It produces the following command line output

`2019-08-05 13:19:14 INFO rasa.nlu.components - Added ‘SpacyNLP’ to component cache. Key ‘SpacyNLP-de_core_news_sm’.
2019-08-05 13:19:14 INFO rasa.nlu.training_data.loading - Training data format of ‘/tmp/tmpvz1gfilf/852bb4994431473bbbca3355c4ddd5ad_examples.md’ is ‘md’.
2019-08-05 13:19:14 INFO rasa.nlu.training_data.training_data - Training data stats:
    - intent examples: 100 (1 distinct intents)
    - Found intents: ‘answer’
    - entity examples: 100 (4 distinct entities)
    - found entities: ‘house_number’, ‘street’, ‘residence’, ‘zipcode’
2019-08-05 13:19:14 INFO rasa.nlu.test - Running model for predictions:
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:01<00:00, 82.34it/s]
2019-08-05 13:19:15 INFO rasa.nlu.test - Entity evaluation results:
2019-08-05 13:19:15 INFO rasa.nlu.test - Evaluation for entity extractor: CRFEntityExtractor
/home/local/MGM/hschroeder/.virtualenvs/A12Bot/lib/python3.6/site-packages/sklearn/metrics/classification.py:1145: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
    ‘recall’, ‘true’, average, warn_for)
/home/local/MGM/hschroeder/.virtualenvs/A12Bot/lib/python3.6/site-packages/sklearn/metrics/classification.py:1145: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no true samples.
    ‘recall’, ‘true’, average, warn_for)
2019-08-05 13:19:15 INFO rasa.nlu.test - Classification report for ‘CRFEntityExtractor’ saved to ‘evaluate/CRFEntityExtractor_report.json’.
2019-08-05 13:19:15 INFO rasa.nlu.test - Evaluation for entity extractor: CRFEntityServer
/home/local/MGM/hschroeder/.virtualenvs/A12Bot/lib/python3.6/site-packages/sklearn/metrics/classification.py:1143: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
    ‘precision’, ‘predicted’, average, warn_for)
/home/local/MGM/hschroeder/.virtualenvs/A12Bot/lib/python3.6/site-packages/sklearn/metrics/classification.py:1143: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
    ‘precision’, ‘predicted’, average, warn_for)
/home/local/MGM/hschroeder/.virtualenvs/A12Bot/lib/python3.6/site-packages/sklearn/metrics/classification.py:1143: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
    ‘precision’, ‘predicted’, average, warn_for)
2019-08-05 13:19:15 INFO rasa.nlu.test - Classification report for ‘CRFEntityServer’ saved to ‘evaluate/CRFEntityServer_report.json’.`

However, only the report files are actually generated, and there are no error messages indicating that something went wrong while trying to generate the other files. Am I missing something?

IgNoRaNt23 · August 5, 2019, 11:39am

OK, I got a little closer to the solution at least. It seems the confusion matrix and the other outputs are only meant for intent classification, not for entity extraction.

Is there any work going on in this direction?

MetcalfeTom · August 6, 2019, 10:16pm

Hi there!

The flags --errors, --histogram and --confmat are all meant to be set to file names, not folders. In fact, you don’t need to specify a name if you pass those flags by themselves. Give it a try.

linhe · August 21, 2019, 7:01am

Hello,

I think I got the same problem. Only the file CRFEntityExtractor_report.json is being created.

@MetcalfeTom When passing the flags by themselves, I get one of the following:
rasa test nlu: error: argument --errors: expected one argument
rasa test nlu: error: argument --histogram: expected one argument
rasa test nlu: error: argument --confmat: expected one argument
depending on which one is first in line.
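The "expected one argument" message is what Python's argparse prints when a flag that requires a value is passed bare. The sketch below is not Rasa's actual parser; `build_parser` and `try_parse` are hypothetical helpers that only illustrate the difference between a flag that needs a value and one declared with `nargs="?"`, which may be passed by itself.

```python
import argparse


def build_parser(optional_value: bool) -> argparse.ArgumentParser:
    """Build a toy parser with a single --errors flag (illustrative only)."""
    parser = argparse.ArgumentParser(prog="demo")
    if optional_value:
        # nargs="?" lets the flag appear without a value; `const` is used then.
        parser.add_argument("--errors", nargs="?", const="errors.json", default=None)
    else:
        # Default behaviour: the flag must be followed by exactly one value.
        parser.add_argument("--errors", default=None)
    return parser


def try_parse(parser: argparse.ArgumentParser, argv: list) -> str:
    """Return the parsed value, or a marker string if argparse rejects argv."""
    try:
        return str(parser.parse_args(argv).errors)
    except SystemExit:
        # argparse reports the problem on stderr and then exits.
        return "<parse error>"


strict = build_parser(optional_value=False)
lenient = build_parser(optional_value=True)

print(try_parse(strict, ["--errors"]))                           # rejected
print(try_parse(lenient, ["--errors"]))                          # falls back to const
print(try_parse(lenient, ["--errors", "evaluate/errors.json"]))  # explicit file name
```

If the installed version declares these flags the strict way, passing an explicit file name rather than a bare flag or a folder avoids the parse error.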

Further information: at the moment I have one intent and 4 entities.

I think @IgNoRaNt23 is right and these aren’t available for entity extraction but only for intent classification?

Is there a way to get this working for entity extraction or is there something else like a confusion matrix to see the entities which are being confused with one another?
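As a stopgap, entity confusions can be counted by hand from gold and predicted labels. This is a minimal sketch, not part of Rasa: it assumes you have already extracted one gold label and one predicted label per token (with "O" meaning no entity), and both the `entity_confusions` helper and the sample labels are made up for illustration.

```python
from collections import Counter


def entity_confusions(gold, predicted):
    """Count (gold_label, predicted_label) pairs over aligned token labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return Counter(zip(gold, predicted))


# Hypothetical per-token labels; "O" marks tokens that belong to no entity.
gold = ["street", "house_number", "zipcode", "residence", "O", "street"]
pred = ["street", "zipcode", "zipcode", "residence", "street", "street"]

counts = entity_confusions(gold, pred)
labels = sorted(set(gold) | set(pred))

# Print a plain-text matrix: rows are gold labels, columns are predictions.
print("gold \\ pred".ljust(14) + "".join(label.ljust(14) for label in labels))
for g in labels:
    row = "".join(str(counts[(g, p)]).ljust(14) for p in labels)
    print(g.ljust(14) + row)

# List only the off-diagonal cells, i.e. the actual confusions.
for (g, p), n in sorted(counts.items()):
    if g != p:
        print(f"{g} predicted as {p}: {n}")
```

Off-diagonal cells show which entities are mistaken for one another, which is the information the intent confusion matrix gives for intents.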

IgNoRaNt23 · August 21, 2019, 1:50pm

AFAIK not. My team is currently working on a solution for the error.json for entities, which I expect at any moment. We have already talked about creating a pull request for Rasa in the future, but this might take a while, if it happens at all.

erohmensing · September 2, 2019, 8:18am

Hey @linhe, @IgNoRaNt23, success/failure reporting for NER was recently merged into master and will be released with Rasa 1.3. Feel free to check it out: "Report successful and incorrect predictions of NER" by tabergma (Pull Request #4335, RasaHQ/rasa on GitHub).
