Evaluating a model - clarifications needed

Hi,

I’m trying to evaluate and compare my NLU pipelines with reference to this doc, but there are a few things that aren’t quite clear to me.

The f1-score graph - along with all train/test sets, the trained models, classification and error reports - will be saved into a folder called nlu_comparison_results.

I don’t have any nlu_comparison_results folder. Is it referring to the results folder?
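
For context, this is roughly the comparison command I’m running (the config and data file names are just placeholders for my actual files):

    rasa test nlu --config supervised.yml convert.yml --nlu data/nlu.md --runs 3 --percentages 0 25 50 75 90

I didn’t pass --out, so maybe that’s why everything ends up in the default results folder?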

The evaluation script will produce a report, confusion matrix, and confidence histogram for your model.

I have the results.json and nlu_model_comparison_graph files, but no confusion matrix. Also, what does the data in results.json - which is shown below - represent?

{
  "supervised": [
    [
      0.8304962317416327,
      0.7487653955382145,
      0.7090330778607176,
      0.470699607699921
    ]
  ],
  "convert": [
    [
      0.9015842319176504,
      0.8902925258270229,
      0.8530211435642292,
      0.6102203625233347
    ]
  ]
}
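
Do I need to run a plain (non-comparison) evaluation to get the confusion matrix and histogram, something like the following (the data path is just an example)?

    rasa test nlu --nlu data/nlu.md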

You can save these reports as JSON files using the --report argument.

Where should I put the --report argument?
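
Does it just go at the end of the test command, for example like this? (I’m guessing both the placement and whether it takes a directory argument.)

    rasa test nlu --nlu data/nlu.md --report reports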

Thank you, Tiziano