An ML team validated a demand forecasting model it had built and found that it achieves very high accuracy on the training data but its accuracy drops significantly on new, unseen data. What is this state called?

Correct answer: D

Explanation

The task is to name the state in which accuracy is high on the training data but low on new data.

  • 1. "very high accuracy on the training data" → fits too closely to the training data
  • 2. "accuracy drops significantly on new, unseen data" → cannot generalize = overfitting
A. Incorrect

Underfitting

Underfitting is a state where accuracy is low even on the training data.

It differs from the situation in this question, where accuracy is high on the training data, so it is incorrect.

B. Incorrect

Data drift

Data drift is a phenomenon where accuracy drops because, during operation, the distribution of the input data shifts away from the distribution seen at training time.

In this question, the model is already weak on unseen data at validation time, right after training, and the cause is not a change in distribution over time, so this option is incorrect.
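
To make the contrast with drift concrete, here is a minimal sketch of a crude drift check on a single input feature. The function name `mean_shift_in_std` and the store-traffic figures are hypothetical, invented for illustration. Production drift monitoring would typically use proper statistical tests across many features.

```python
from statistics import mean, stdev

def mean_shift_in_std(train_values, live_values):
    """Distance between the live mean and the training mean,
    measured in training standard deviations."""
    return abs(mean(live_values) - mean(train_values)) / stdev(train_values)

# Hypothetical daily store-traffic counts (an input feature), made up for this sketch.
train_traffic = [100, 104, 98, 101, 97, 103, 99, 102]   # seen at training time
live_traffic = [131, 127, 135, 129, 133, 128, 134, 130]  # seen later, in operation

shift = mean_shift_in_std(train_traffic, live_traffic)
print(f"live mean is {shift:.1f} training standard deviations away")
```

A large shift that appears only after deployment points toward drift. A gap that is already present at validation time, as in this question, does not.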

C. Incorrect

Data bias

Data bias is a problem where a skew in the training data appears as unfair predictions for certain groups.

The situation in this question is a failure of generalization caused by overfitting to the training data, not a fairness problem, so this option is incorrect.

D. Correct

Overfitting

Correct. Overfitting is a state where the model fits the training data excessively, memorizing noise and fine details, with the result that its generalization performance on unseen data drops.
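
The memorization effect can be reproduced with a small sketch. It is not the team's forecasting model: it assumes a toy one-dimensional classification task with 30% label noise and uses a k-nearest-neighbor classifier, all chosen here for illustration. With k=1 the model memorizes every noisy training label, so it scores perfectly on the training set while doing worse on fresh data drawn the same way.

```python
import random

def make_data(n, noise, rng):
    """Draw n points in [0, 1). The true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label  # label noise that a model should not memorize
        data.append((x, label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k should be odd)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(1 for _, label in nearest if label)
    return votes * 2 > k

def accuracy(train, data, k):
    """Fraction of points in `data` whose label the k-NN model predicts correctly."""
    hits = sum(1 for x, label in data if knn_predict(train, x, k) == label)
    return hits / len(data)

rng = random.Random(0)            # fixed seed so the run is repeatable
train = make_data(200, 0.3, rng)  # data the model is fitted on
test = make_data(200, 0.3, rng)   # new, unseen data from the same process

for k in (1, 25):
    print(f"k={k:>2}  train={accuracy(train, train, k):.2f}  "
          f"test={accuracy(train, test, k):.2f}")
```

With k=1 each training point is its own nearest neighbor, which is why training accuracy is perfect. A larger k averages over neighbors and should narrow the gap between the two numbers.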

Key Takeaway

Remember the correct answer, 'overfitting'.
・A state where the model fits the training data excessively and memorizes noise and fine details.
・Even with high accuracy on the training data, accuracy drops on unseen data and the model cannot generalize.
Underfitting is a different state, in which accuracy is low even on the training data. Regularization (a technique for curbing overfitting) and hyperparameters (configuration values) are not names of states at all.
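
The distinction between the two states can be written as a small rule of thumb. The helper `diagnose_fit` and its thresholds (`min_acc`, `max_gap`) are hypothetical values picked for this sketch, not standard cutoffs. Sensible values depend on the task.

```python
def diagnose_fit(train_acc, val_acc, min_acc=0.8, max_gap=0.1):
    """Label a model's state from its training and validation accuracy.

    min_acc and max_gap are illustrative thresholds, not standard values.
    """
    if train_acc < min_acc:
        return "underfitting"  # weak even on the data it was trained on
    if train_acc - val_acc > max_gap:
        return "overfitting"   # strong on training data, weak on unseen data
    return "ok"

print(diagnose_fit(0.99, 0.70))  # the pattern described in this question
print(diagnose_fit(0.60, 0.58))  # low accuracy on both sets
```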