Hi all,
I created 3 different pipelines and ran a cross-validation test with different train/test splits. The resulting results.json is shown below:
{
"configP1": [
[
0.4071428571428571,
0.4705215419501133,
0.28843537414965986,
0.36984126984126986
],
[
0.7065759637188209,
0.6936507936507937,
0.5833333333333334,
0.4531746031746031
],
[
0.4698412698412698,
0.5907029478458049,
0.4222222222222222,
0.35411255411255416
]
],
"configP2": [
[
0.5226757369614512,
0.6428571428571429,
0.47551020408163264,
0.3857142857142857
],
[
0.6468253968253969,
0.7700680272108843,
0.4272108843537415,
0.5272108843537415
],
[
0.5700680272108843,
0.5335600907029479,
0.4015873015873016,
0.1795918367346939
]
],
"configP3": [
[
0.5056689342403629,
0.6026077097505669,
0.4391156462585034,
0.4368789940218511
],
[
0.6385487528344671,
0.7732426303854877,
0.5523809523809524,
0.5253968253968254
],
[
0.5002267573696145,
0.5986394557823128,
0.35827664399092973,
0.23333333333333334
]
]
}
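For anyone looking at the raw numbers, a quick way to summarize them would be something like the sketch below. It is only an illustration, and it assumes each inner list holds the scores for one train/test split, which is exactly the part I would like confirmed.

```python
import json
import statistics

# Load the cross-validation results produced by the three pipeline configs
with open("results.json") as f:
    results = json.load(f)

for config, runs in results.items():
    # Assumption: each inner list is the set of scores for one train/test split
    per_split_means = [statistics.mean(run) for run in runs]
    print(f"{config}: per-split means = {[round(m, 4) for m in per_split_means]}, "
          f"overall mean = {statistics.mean(per_split_means):.4f}")
```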
Could someone explain what the reported values for each configuration represent? I looked through the documentation but could not find any details about them.
Also, the average f1-score graph is as shown. If I understand correctly, the f1 score no longer increasing implies that the training data is sufficient. Can someone help me understand why there is a dip after a certain point?
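To make the averaging I have in mind concrete, something like the sketch below is how I would expect the curve in that graph to be computed. This is only a guess: it assumes the values in results.json are f1-scores and that each position within a split corresponds to one point on the x-axis, neither of which I am sure about.

```python
import json

import matplotlib.pyplot as plt
import numpy as np

with open("results.json") as f:
    results = json.load(f)

for config, runs in results.items():
    scores = np.array(runs)           # shape: (n_splits, n_values_per_split)
    mean_curve = scores.mean(axis=0)  # average across splits at each position
    plt.plot(range(1, scores.shape[1] + 1), mean_curve, marker="o", label=config)

plt.xlabel("value index within a split (assumption: points along the curve)")
plt.ylabel("average score (assumption: f1)")
plt.legend()
plt.show()
```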