Identify the deployment method that provides rea…

A company wants to deploy a trained model to production for low-latency real-time inference while minimizing the overhead of building and operating servers themselves. Which deployment approach BEST meets this requirement?

Correct: C

Explanation

Question Overview

Identify the deployment method that provides real-time inference with low operational overhead.

Requirements to satisfy

1. "minimizing the overhead of building and operating servers themselves": Delegate to a managed service
2. "low-latency real-time inference": Online inference = SageMaker real-time endpoint

Per-option explanation

A: Incorrect

Build and operate an inference server on EC2 independently.

This is incorrect. Building on EC2 can achieve real-time inference, but the company must build, patch, and scale the server itself. This does not meet the requirement of 'minimizing operational overhead.'

B: Incorrect

Pre-compute inference results and store them in S3.

This is incorrect. Serving pre-computed results works only when the set of possible inputs is limited and known in advance; it cannot return predictions on the fly for unknown inputs. It does not meet the real-time inference requirement.

C: Correct

Deploy to a SageMaker real-time inference endpoint.

This is correct. A SageMaker real-time inference endpoint serves low-latency online predictions through an API, while the managed service handles building and operating the underlying infrastructure. Both requirements are met.
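To make the "online inference as an API" point concrete, here is a minimal sketch of calling a real-time endpoint from Python. It assumes the endpoint accepts a JSON body of the form {"features": [...]} and returns JSON; the actual payload format depends on the model container. The helper name predict, the endpoint name "my-endpoint", and the StubRuntimeClient class are hypothetical and exist only so the sketch runs without AWS access. The only AWS call used is invoke_endpoint on the boto3 "sagemaker-runtime" client.

```python
import io
import json


def predict(runtime_client, endpoint_name, features):
    """Send one record to a real-time endpoint and return the parsed JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        # Assumed payload shape; the real format depends on the model container.
        Body=json.dumps({"features": features}),
    )
    # The response body is a stream, so read it before decoding.
    return json.loads(response["Body"].read())


class StubRuntimeClient:
    """Hypothetical stand-in for boto3.client("sagemaker-runtime"), for local dry runs."""

    def invoke_endpoint(self, **kwargs):
        self.last_request = kwargs
        return {"Body": io.BytesIO(b'{"score": 0.5}')}


if __name__ == "__main__":
    # Local dry run with the stub. Against AWS, pass a real client instead:
    #   import boto3
    #   runtime = boto3.client("sagemaker-runtime")
    #   predict(runtime, "my-endpoint", [0.1, 0.2])
    print(predict(StubRuntimeClient(), "my-endpoint", [0.1, 0.2]))
```

The caller sends one request and waits for one response; there is no server to patch or scale in the calling code, which is the operational difference from option A.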

D: Incorrect

Run a SageMaker batch transform job on a scheduled basis.

This is incorrect. Batch transform is managed and has low operational burden, but it processes data in bulk offline. It does not meet the real-time inference requirement of returning an immediate response.
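For contrast, a batch transform job is described by where the input data sits in S3 and where the results should be written, not by a single request and response. The sketch below only builds the request dictionary for the boto3 "sagemaker" client's create_transform_job call. The helper name build_transform_job_request, the CSV content type, and the instance type and count are illustrative assumptions, not values from the question.

```python
def build_transform_job_request(job_name, model_name, input_s3_uri, output_s3_uri):
    """Build the keyword arguments for a SageMaker create_transform_job call."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                # Process every object under the given S3 prefix.
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
            "ContentType": "text/csv",  # assumed input format
        },
        # Predictions are written to S3, not returned to a caller.
        "TransformOutput": {"S3OutputPath": output_s3_uri},
        # Assumed instance settings, chosen only for illustration.
        "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
    }


# Against AWS (requires boto3, credentials, and an existing model):
#   import boto3
#   request = build_transform_job_request(
#       "nightly-scoring", "my-model", "s3://my-bucket/in/", "s3://my-bucket/out/"
#   )
#   boto3.client("sagemaker").create_transform_job(**request)
```

Results land in S3 when the job finishes, so a client cannot get an immediate answer for a single new input.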

Key Takeaway

The option that satisfies BOTH 'minimizing operational overhead' AND 'low-latency real-time' is the SageMaker real-time inference endpoint. Self-built EC2 (real-time is possible but adds operational burden) and batch transform (managed but offline) each satisfy only one of the requirements. The key is to eliminate options by checking each one against both requirements.
