A company wants to automatically extract not only text but also tables and form fields as structured data from scanned document images such as invoices and application forms. Which AWS service is MOST suitable for this requirement?

Correct answer: D

Explanation

This question asks you to select the AI service that extracts data from document images.

  • (1) "from scanned document images": Scanned documents or PDFs as input = Textract
  • (2) "tables and form fields": Structural extraction beyond simple OCR
  • (3) "automatically extract": Automates document processing = Textract
  • (4) "structured data": Output as structured data = Textract
A: Incorrect

Amazon Rekognition

Amazon Rekognition is an image analysis service that detects objects, scenes, faces, and similar elements in images and videos, with limited text detection capability.

However, extracting tables and form fields as structured data from invoices and forms is a document-specific task handled by Textract, so this option is incorrect.

B: Incorrect

Amazon Transcribe

Amazon Transcribe is a speech recognition service that transcribes audio into text.

Although its name resembles Textract, its input is audio (recordings or streams), not images, so it cannot extract tables and forms from scanned document images. This option is incorrect.

C: Incorrect

Amazon Comprehend

Amazon Comprehend is an NLP service that extracts sentiment and entities from text.

The OCR step of reading characters and tables from the document images themselves is Textract's role, so this option is incorrect.

D: Correct

Amazon Textract

This is correct. Amazon Textract is an AI service that automatically extracts text (OCR), tables, and form fields as structured data from document images and PDFs. It is used to automate the processing of invoices, application forms, and business documents.
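
To make "structured data" concrete, here is a minimal Python sketch of how form fields might be read out of a Textract result. It assumes the boto3 SDK and the AnalyzeDocument operation with the TABLES and FORMS feature types. The helper names (analyze_document_bytes, extract_form_fields) and the sample invoice values are hypothetical, chosen only for illustration. The sample dictionary is a hand-built, heavily trimmed imitation of the response shape, not real service output.

```python
from typing import Dict, List


def analyze_document_bytes(image_bytes: bytes) -> dict:
    """Send one scanned page to Textract, requesting tables and form fields.

    Requires AWS credentials and the boto3 package; not called in this sketch.
    """
    import boto3  # imported lazily so the parser below runs without boto3

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["TABLES", "FORMS"],
    )


def _text_of(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the WORD children of a block into a single string."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_form_fields(response: dict) -> Dict[str, str]:
    """Pair each KEY block with the text of the VALUE block it links to."""
    by_id = {block["Id"]: block for block in response["Blocks"]}
    fields: Dict[str, str] = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        value_text = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                value_text = " ".join(
                    _text_of(by_id[value_id], by_id) for value_id in rel["Ids"]
                )
        fields[_text_of(block, by_id)] = value_text
    return fields


# Hand-built sample (assumption: trimmed to the fields this sketch reads).
sample = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                           {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Invoice"},
        {"Id": "w2", "BlockType": "WORD", "Text": "Number:"},
        {"Id": "w3", "BlockType": "WORD", "Text": "INV-001"},
    ]
}

print(extract_form_fields(sample))  # {'Invoice Number:': 'INV-001'}
```

The point of the sketch is the contrast with plain OCR: the result is not a flat string but a set of linked blocks, so a key such as "Invoice Number:" can be tied to its value programmatically.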

Key Takeaway

'Extract text, tables, and forms from document images' points to Amazon Textract (OCR + structural extraction). Image object/face detection is Rekognition. The similarly named Transcribe handles speech-to-text. Distinguish by 'what is extracted from what.'