A company is estimating the development cost of a foundation model (FM). In the FM lifecycle, which is the MOST computationally expensive stage, in which the model acquires general knowledge of language and the world from a large amount of unlabeled data?

Correct answer: D

Explanation

Choose the large-scale training stage that acquires general knowledge.

  • 1. "a large amount of unlabeled data": the model trains at large scale without labels.
  • 2. "acquires general knowledge": this stage creates the foundation, which means pretraining.
  • 3. "MOST computationally expensive stage": a characteristic of pretraining, which requires enormous computation.
A. Incorrect

Fine-tuning

Fine-tuning is the stage that adapts a pretrained model to a specific task with a relatively small amount of labeled data, and its computational cost is far smaller than pretraining.

It is not the stage that acquires general knowledge from scratch, so this is incorrect.
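The cost gap between the two stages can be made concrete with a rough estimate. The sketch below uses the commonly cited rule of thumb that training a dense transformer takes about 6 FLOPs per parameter per token. The model size and token counts are hypothetical values chosen only for illustration, and the fine-tuning figure assumes full fine-tuning that updates every parameter.

```python
def training_flops(num_params, num_tokens):
    """Rough training compute, assuming about 6 FLOPs per parameter per token."""
    return 6 * num_params * num_tokens

# Hypothetical sizes, chosen only for illustration (not from any real model).
NUM_PARAMS = 7e9        # parameters in the model
PRETRAIN_TOKENS = 1e12  # unlabeled tokens seen during pretraining
FINETUNE_TOKENS = 1e8   # labeled tokens seen during fine-tuning

pretrain = training_flops(NUM_PARAMS, PRETRAIN_TOKENS)
finetune = training_flops(NUM_PARAMS, FINETUNE_TOKENS)
ratio = pretrain / finetune

print(f"pretraining: {pretrain:.1e} FLOPs")
print(f"fine-tuning: {finetune:.1e} FLOPs")
print(f"pretraining costs about {ratio:.0f}x more")
```

Because the same model is trained in both stages, the ratio under this approximation depends only on how many tokens each stage processes.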

B. Incorrect

Evaluation

Evaluation is the stage that measures the performance of a trained model using test data and similar means.

It is not the large-scale training stage in which the model acquires general knowledge, so this is incorrect.

C. Incorrect

Deployment

Deployment is the stage that rolls out the finished model to a production environment.

It is not the large-scale training stage in which the model acquires general knowledge, so this is incorrect.

D. Correct

Pretraining

Correct. Pretraining is the stage that lets the model acquire general knowledge from a large amount of unlabeled data, and it incurs enormous computational cost. The foundation model created here is later adapted to each task.
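To see why unlabeled data is enough, it may help to look at how training targets can come from the text itself. The sketch below assumes a next-token prediction objective, which is one common pretraining objective but is not named in the question. The helper `next_token_pairs` and the whitespace tokenizer are hypothetical simplifications.

```python
def next_token_pairs(tokens, context_size):
    """Build (context, target) training pairs from an unlabeled token sequence."""
    pairs = []
    for i in range(1, len(tokens)):
        # The preceding tokens are the input; the token at position i is the target.
        context = tokens[max(0, i - context_size):i]
        pairs.append((context, tokens[i]))
    return pairs

text = "the model learns language from raw text"
tokens = text.split()  # whitespace split stands in for a real tokenizer
pairs = next_token_pairs(tokens, context_size=3)

for context, target in pairs:
    print(context, "->", target)
```

Every target here is simply the next word of the raw text, so no human labeling is needed. That is what makes it practical to train on very large corpora.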

Key Takeaway

Remember the role of the correct answer, pretraining.
  • The MOST computationally expensive stage, in which the model acquires general knowledge from a large amount of unlabeled data.
  • The foundation model created here is later adapted to each task through fine-tuning and similar means.
Fine-tuning (adapting to a specific task with a small amount of labeled data), evaluation, and deployment are later stages that typically require far less computation than pretraining.