Cross-validation splits in sklearn_intent_classifier.py

Hello @tmbo,

I am looking at the code of the function _num_cv_splits:

    def _num_cv_splits(self, y):
        folds = self.component_config["max_cross_validation_folds"]
        return max(2, min(folds, np.min(np.bincount(y)) // 5))

Fitting 2 folds for each of 6 candidates, totaling 12 fits

Why are we restricting the cv_splits value to 2?

Well, we are not restricting it to two; 2 is the minimum number of splits we can choose. If we chose 1, for example, the training data splitter and the evaluation algorithm would complain because there would be no test set. That is why the data is always split into at least two parts.
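
To illustrate the point about the minimum, here is a small sketch. It uses scikit-learn's StratifiedKFold as a stand-in for the splitter; which splitter the classifier actually ends up using is an assumption on my part, and the toy X and y are made up. Asking for a single split is rejected outright, while two splits give each half of the data a turn as the test set.

```python
from sklearn.model_selection import StratifiedKFold

# A single fold leaves nothing to evaluate on, so scikit-learn refuses it.
rejected = False
try:
    StratifiedKFold(n_splits=1)
except ValueError as err:
    rejected = True
    print("n_splits=1 rejected:", err)

# Two folds is the smallest valid setting: each fold is the test set once.
# X and y are toy data, two examples per class (hypothetical).
X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]
splitter = StratifiedKFold(n_splits=2)
for train_idx, test_idx in splitter.split(X, y):
    print("train:", train_idx, "test:", test_idx)
```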

@tmbo, I figured out why I am getting 2 splits. In most of my text classification training data, the smallest value in the np.bincount(y) array is <= 10, so np.min returns that value and the integer division by 5 yields at most 2. The max(2, ...) then sets the overall number of splits for this training data to 2.
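
For anyone who wants to check the arithmetic, here is a standalone sketch of the same formula. The name num_cv_splits, the max_folds argument (standing in for component_config["max_cross_validation_folds"]) and its value of 5 are my own assumptions for illustration, as are the label counts.

```python
import numpy as np


def num_cv_splits(y, max_folds=5):
    """Standalone copy of the quoted formula; max_folds is a hypothetical
    stand-in for component_config["max_cross_validation_folds"]."""
    # Number of examples in the rarest class.
    smallest_class = np.min(np.bincount(y))
    # At least 2 folds, at most max_folds, roughly 5 examples per fold.
    return max(2, min(max_folds, smallest_class // 5))


# Three intents with 8, 10 and 40 examples (made-up counts).
y_small = np.repeat([0, 1, 2], [8, 10, 40])
print(num_cv_splits(y_small))  # 8 // 5 = 1, so max(2, min(5, 1)) = 2

# Three intents with 25, 30 and 40 examples (made-up counts).
y_large = np.repeat([0, 1, 2], [25, 30, 40])
print(num_cv_splits(y_large))  # 25 // 5 = 5, so max(2, min(5, 5)) = 5
```

So the rarest intent needs at least 15 examples before the formula goes above 2 folds.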

Thank you.