A company performed fine-tuning but did not achieve the expected results, and decided to review its approach starting from data preparation. Which is the MOST appropriate data preparation for improving the effectiveness of fine-tuning?

Correct answer: A

Explanation

This question asks for the appropriate approach to preparing fine-tuning data.

  • 1. "improving the effectiveness of fine-tuning": the goal is to raise the quality of adaptation
  • 2. "MOST appropriate data preparation": high-quality, representative, appropriately labeled data, i.e. curation
A. Correct

Prepare high-quality, representative data aligned with the task, with appropriate labeling.

Correct. Fine-tuning becomes more effective when you prepare high-quality, representative, appropriately labeled data aligned with the target task. Data curation and governance are important.
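
As an illustration of what such curation can look like in practice, here is a minimal sketch in Python. It assumes a hypothetical text classification task: the `Example` record, the `ALLOWED_LABELS` set, and the `curate` function are names invented for this example, not part of any library. It only covers basic checks (empty text, missing or invalid labels, exact duplicates); real pipelines usually add task-specific quality and relevance checks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Example:
    text: str
    label: Optional[str]  # None means the example was never labeled


# Hypothetical label set for the target task (an assumption for this sketch).
ALLOWED_LABELS = {"positive", "negative"}


def curate(examples: List[Example]) -> List[Example]:
    """Keep only non-empty, validly labeled, non-duplicate examples."""
    seen = set()
    kept = []
    for ex in examples:
        text = ex.text.strip()
        if not text:
            continue  # drop empty inputs
        if ex.label not in ALLOWED_LABELS:
            continue  # drop unlabeled or mislabeled examples
        key = text.lower()
        if key in seen:
            continue  # drop duplicates (ignoring case and outer whitespace)
        seen.add(key)
        kept.append(Example(text, ex.label))
    return kept


raw = [
    Example("Great product", "positive"),
    Example("great product ", "positive"),  # duplicate of the first example
    Example("", "negative"),                # empty text
    Example("No label here", None),         # unlabeled
    Example("Bad", "neutral"),              # label outside the task's label set
    Example("Terrible", "negative"),
]
clean = curate(raw)
```

With this input, only the first and last examples survive. The point is that the filtered set is smaller but consistent with the task, which matches the reasoning behind option A.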

B. Incorrect

Collect as much data as possible without caring about quality.

Even with large volume, low-quality data can actually reduce performance.

Quality cannot be ignored, so this option is incorrect.

C. Incorrect

Use random data as is, without labeling.

Fine-tuning, in its usual supervised form, uses labeled data to adapt a model to a specific task.

Random, unlabeled data cannot produce effective adaptation, so this option is incorrect.

D. Incorrect

Mix in as much data unrelated to the task as possible.

Mixing in unrelated data blurs the adaptation to the target task and reduces performance.

The data should be aligned with the task, so this option is incorrect.

Key Takeaway

Preparing fine-tuning data is fundamentally about 'preparing high-quality, representative data aligned with the task, with appropriate labeling.' Data curation and governance (managing rights, privacy, and bias) are important. 'Quality does not matter as long as there is volume,' 'unlabeled is fine,' and 'mix in unrelated data' are all errors that reduce performance.
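
One way to make "representative" concrete is to compare the label distribution of the training set with the distribution expected in real use. The sketch below is only an illustration under stated assumptions: the function names, the `expected` shares, and the 10-point `tolerance` are hypothetical choices, and label balance is just one of several aspects of representativeness.

```python
from collections import Counter
from typing import Dict, List


def label_shares(labels: List[str]) -> Dict[str, float]:
    """Return the fraction of examples carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def underrepresented(
    labels: List[str],
    expected: Dict[str, float],
    tolerance: float = 0.10,  # assumed threshold, tune per task
) -> List[str]:
    """List labels whose share falls short of the expected share by more than tolerance."""
    actual = label_shares(labels)
    return sorted(
        label
        for label, share in expected.items()
        if actual.get(label, 0.0) < share - tolerance
    )


# Hypothetical support-ticket dataset: 8 "billing" and 2 "technical" examples,
# while real traffic is assumed to be split evenly between the two.
train_labels = ["billing"] * 8 + ["technical"] * 2
gaps = underrepresented(train_labels, {"billing": 0.5, "technical": 0.5})
```

Here `gaps` flags "technical" as underrepresented, which would be a cue to collect or label more examples of that kind before fine-tuning again.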