A company is preparing to fine-tune a foundation model for internal use. Which TWO of the following are appropriate data governance perspectives to follow for the training data? Prioritize governance principles over efficiency. (Choose TWO.)
Select two governance perspectives to follow for data used in customization.
Verify the rights and licenses for the data.
This is correct. Confirming that the data to be used for customization is permitted for that purpose under copyright and license terms is an important data governance perspective for avoiding legal risk.
Properly protect and anonymize personal information.
This is correct. When the data used for customization contains personal information, protecting and anonymizing it appropriately is an important data governance perspective for privacy and regulatory compliance.
Use data of unknown origin to increase the volume of data.
This is incorrect. Data of unknown origin may introduce rights infringement, quality, and bias problems. Verifying origin and rights is a fundamental principle of governance.
Remove all metadata to standardize format.
This is incorrect. Metadata such as source and date is information necessary for tracing provenance and conducting audits. Deleting it makes it impossible to fulfill accountability obligations.
Consolidate all data into a public bucket for training efficiency.
This is incorrect. Consolidating data into publicly accessible storage exposes confidential data to leakage. Access control and protection are fundamental principles of governance.
For data used to customize foundation models, the two governance perspectives to follow here are verifying data rights and licenses (whether the use is permitted) and protecting and anonymizing personal information (privacy). Other perspectives include data quality and provenance management. Together, the two correct choices guard against legal risk and privacy violations.
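As a rough illustration of how the two correct perspectives might look in a data preparation step, the sketch below drops records whose source or license is unknown or not permitted, and redacts email addresses from the rest. Everything in it is an assumption made for illustration: the `Record` fields, the `ALLOWED_LICENSES` allow-list, and the `prepare` helper are hypothetical, and email redaction alone is far from complete anonymization.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical allow-list; a real one would come from legal review.
ALLOWED_LICENSES = {"internal", "cc0-1.0", "cc-by-4.0"}

# Simple email pattern, used only as a stand-in for real PII detection.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass
class Record:
    text: str
    source: Optional[str]   # provenance metadata, kept for audits
    license: Optional[str]  # license identifier, if known


def is_usable(record: Record) -> bool:
    """Accept only records with a known source and a permitted license."""
    if not record.source or record.license is None:
        return False
    return record.license.lower() in ALLOWED_LICENSES


def redact_emails(text: str) -> str:
    """Replace email addresses with a placeholder token."""
    return EMAIL_RE.sub("[EMAIL]", text)


def prepare(records: List[Record]) -> List[Record]:
    """Filter by rights and provenance, then redact; metadata is preserved."""
    kept = []
    for record in records:
        if not is_usable(record):
            continue  # unknown origin or unpermitted license: exclude
        kept.append(
            Record(
                text=redact_emails(record.text),
                source=record.source,
                license=record.license,
            )
        )
    return kept
```

Note that the sketch keeps the `source` and `license` fields on every retained record rather than stripping them, which matches the point above that provenance metadata is needed for tracing and audits.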