In a company review, a model trained on biased historical hiring data was found to reproduce the same bias. What is the name of the problem in which bias contained in the training data appears directly as bias in the model's predictions?
Choose the name for bias that originates in the training data.
Hallucination
Hallucination is the problem in which generative AI outputs plausible-sounding content that is not grounded in fact.
This option is incorrect: hallucination concerns the factual accuracy of generated output, not the reproduction of training-data bias in predictions.
Overfitting
Overfitting is the problem in which a model fits the training data too closely and loses accuracy on unseen data.
This option is incorrect: overfitting is a generalization problem in training, not a case of bias in the data appearing as bias in predictions.
Data drift
Data drift is the problem in which, during operation, the distribution of input data gradually shifts away from what it was at training time.
This option is incorrect: data drift is a change over time, not the reproduction of bias that was in the training data from the start.
Data bias
Correct. Data bias is the problem in which bias contained in the training data appears directly as bias in the model's predictions (overrepresentation or underrepresentation of certain attributes, historical prejudice, and so on).
Remember the correct answer, 'data bias,' with concrete examples.
・Data bias is the problem in which bias contained in the training data appears directly as bias in the model's predictions.
・Examples: a model trained on data in which past hiring was skewed toward men rates men more highly; if data for a certain region or age group is scarce, prediction accuracy drops for that group alone.
・It is a major cause of impaired fairness and is addressed by ensuring the data is representative and by detecting bias (with tools such as SageMaker Clarify).
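
The kind of check that bias detection performs on training data can be sketched in a few lines. This is a minimal illustration in plain Python, not a call to the SageMaker Clarify API: the toy hiring records, the group names, and the function names are all hypothetical. The two measures are modeled on the class imbalance and difference in proportions of labels metrics that pre-training bias analysis commonly reports.

```python
from collections import Counter

# Hypothetical toy hiring records as (group, hired) pairs; invented for illustration only.
records = [
    ("male", 1), ("male", 1), ("male", 1), ("male", 0), ("male", 1), ("male", 0),
    ("female", 1), ("female", 0), ("female", 0), ("female", 0),
]


def class_imbalance(records, group_a, group_d):
    """(n_a - n_d) / (n_a + n_d): how unevenly the two groups are represented."""
    counts = Counter(group for group, _ in records)
    n_a, n_d = counts[group_a], counts[group_d]
    return (n_a - n_d) / (n_a + n_d)


def positive_rate(records, group):
    """Share of records in `group` that carry the positive label (hired = 1)."""
    labels = [label for g, label in records if g == group]
    return sum(labels) / len(labels)


def label_proportion_gap(records, group_a, group_d):
    """Difference in positive-label rates between the two groups."""
    return positive_rate(records, group_a) - positive_rate(records, group_d)


ci = class_imbalance(records, "male", "female")
dpl = label_proportion_gap(records, "male", "female")
print(f"class imbalance: {ci:.2f}")
print(f"difference in positive label proportions: {dpl:.2f}")
```

Values near zero suggest balanced representation and similar outcomes across groups. Larger positive values mean the first group is overrepresented or favored in the historical labels, which is the pattern a model trained on this data would tend to reproduce.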