An ML team is defining quality control rules for model development. In machine learning, which of the following BEST describes the main purpose of splitting data into training (train), validation, and test sets?

Correct answer: D

Explanation

Select the purpose of the train/validation/test split.

  • Main purpose of splitting data into training (train), validation, and test sets: to fairly evaluate generalization performance on data not used in training
A. Incorrect

To correct imbalance in the number of records per class

Correcting class imbalance is a separate effort done with sampling or weighting.

The three-way split exists to measure generalization performance fairly on data not used in training, not to rebalance classes, so this option is incorrect.
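To make the contrast concrete, the weighting remedy mentioned for option A can be sketched as below. This is a minimal illustration, not a prescribed method: the function name `inverse_frequency_weights`, the labels "ok" and "fraud", and the 90/10 ratio are all hypothetical, and the formula is one common inverse-frequency heuristic among several.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return one weight per class so that rare classes count more in training."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # weight = n_samples / (n_classes * class_count);
    # a perfectly balanced dataset would give every class a weight of 1.0.
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical dataset with a 90/10 class imbalance.
labels = ["ok"] * 90 + ["fraud"] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

Note that nothing here involves holding data out for evaluation, which is why rebalancing and the three-way split are separate concerns.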

B. Incorrect

To make the training data and test data identical to raise accuracy

Making training and test identical evaluates rote memorization and overstates the result.

This is the exact opposite of the goal of fair evaluation, so it is incorrect.
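The overstatement described for option B can be reproduced with a toy example. The sketch below assumes a hypothetical `MemorizingModel` that only stores the exact pairs it has seen, and synthetic labels that are random coin flips, so there is nothing to generalize from.

```python
import random

class MemorizingModel:
    """Hypothetical 'model' that only stores the exact (input, label) pairs it saw."""

    def __init__(self, default_label=0):
        self.table = {}
        self.default_label = default_label

    def fit(self, xs, ys):
        for x, y in zip(xs, ys):
            self.table[x] = y

    def predict(self, x):
        # Inputs never seen during fit fall back to a fixed guess.
        return self.table.get(x, self.default_label)

def accuracy(model, xs, ys):
    return sum(model.predict(x) == y for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)
# Synthetic data: distinct integer inputs, coin-flip labels with no learnable pattern.
xs = list(range(200))
ys = [rng.randint(0, 1) for _ in xs]

train_x, train_y = xs[:100], ys[:100]
held_out_x, held_out_y = xs[100:], ys[100:]

model = MemorizingModel()
model.fit(train_x, train_y)

train_acc = accuracy(model, train_x, train_y)           # scored on the training data itself
held_out_acc = accuracy(model, held_out_x, held_out_y)  # scored on data the model never saw
print(train_acc, held_out_acc)
```

Scoring on the training data reports perfect accuracy even though the model has learned nothing transferable; the held-out score, which should land near chance for coin-flip labels, is the honest figure.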

C. Incorrect

To repeat training three times on the same data to raise accuracy

Splitting into three is to divide roles, not to repeat the same training.

Only the training set is used to fit the model; the validation and test sets are reserved for evaluation, so this option is incorrect.
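The division of roles begins with cutting the data into three disjoint parts. A minimal sketch follows, assuming a hypothetical helper `three_way_split` and a 70/15/15 ratio; the question does not specify any ratio, and real projects may also need stratified or time-based splits.

```python
import random

def three_way_split(records, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle once, then cut into disjoint train / validation / test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

Because each record lands in exactly one list, nothing the model is fitted on can leak into the data it is later scored on.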

D. Correct

To fairly estimate generalization performance on unseen data without bias

This is correct. The data is split three ways so that each subset has a distinct role and generalization performance can be measured fairly.

- Training data (train): data shown to the model repeatedly to learn its parameters (weights).

- Validation data (validation): data used to test the in-progress model for hyperparameter tuning, model selection, and checking overfitting (not used for training itself).

- Test data (test): data used only once after everything is decided, to measure final generalization performance on unseen data without bias.

Evaluating on data not used in training avoids overstated results due to rote memorization.
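The three roles can be seen working together in a small end-to-end sketch. Everything in it is an assumption made for illustration: the synthetic 1-D data, the k-nearest-neighbour classifier standing in for "the model," the candidate values of k, and the 200/50/50 split sizes.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def accuracy(train, eval_set, k):
    return sum(knn_predict(train, x, k) == y for x, y in eval_set) / len(eval_set)

rng = random.Random(0)
# Synthetic data: label is 1 when x > 0.5, with roughly 10% of labels flipped as noise.
data = []
for _ in range(300):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

# Samples are drawn independently, so slicing in order is already a random split.
train, val, test = data[:200], data[200:250], data[250:]

# Validation role: compare candidate hyperparameters without touching the test set.
candidates = [1, 5, 15]  # odd values avoid tied votes
val_scores = {k: accuracy(train, val, k) for k in candidates}
best_k = max(candidates, key=lambda k: val_scores[k])

# Test role: used once, after best_k is fixed, to report the final estimate.
test_acc = accuracy(train, test, best_k)
print(best_k, val_scores, test_acc)
```

The validation score of the chosen k is slightly optimistic because k was picked to maximize it, which is why the final number is taken from the test set instead.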

Key Takeaway

The main purpose of splitting data into train/validation/test is to 'fairly estimate generalization performance on unseen data without bias.' Evaluating on data not used in training prevents overstated results due to rote memorization. 'Correcting class imbalance,' 'making training and test identical,' and 'repeating training three times on the same data' are not the purpose; in particular, train = test is the exact opposite of fair evaluation.