A company is preparing a cost estimate for using Amazon Bedrock. For on-demand use of Bedrock, what is pricing mainly based on? Which option is MOST appropriate?
Choose the billing basis for Bedrock on-demand use.
The number of input and output tokens processed
Correct. On-demand use of Bedrock is pay-as-you-go: pricing is based on the number of input tokens and output tokens processed, so you pay only for what you use.
The running time of a provisioned endpoint
Time-based pricing belongs to a reserved-capacity model such as Provisioned Throughput, not to on-demand use.
On-demand use is billed by the number of tokens processed, so this is incorrect.
The GPU hours used to train the model
GPU hours are a cost measure for training a model yourself.
On-demand inference on Bedrock does not involve training, and billing is by input and output token count, so this is incorrect.
The number of stored prompts
The number of stored prompts is not a billing basis for on-demand inference.
Billing depends on the number of input and output tokens processed, so this is incorrect.
On-demand use of Amazon Bedrock is pay-as-you-go: pricing is based on the number of input tokens and output tokens processed. Longer prompts and longer responses consume more tokens and therefore cost more. To control cost, keep an eye on the token volume of both prompts and outputs.
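The relationship between token volume and cost can be sketched as a small calculation. This is a minimal illustration, not an official pricing formula: the function name `estimate_on_demand_cost` and the per-1,000-token prices below are hypothetical placeholders, and real rates vary by model and Region, so check the current Amazon Bedrock pricing page before using it for an estimate.

```python
def estimate_on_demand_cost(input_tokens, output_tokens,
                            input_price_per_1k, output_price_per_1k):
    """Estimate the cost of one on-demand request from its token counts.

    Prices are expressed per 1,000 tokens. The values passed in are
    assumptions supplied by the caller, not real Bedrock rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Hypothetical placeholder prices (USD per 1,000 tokens), for illustration only.
INPUT_PRICE = 0.003
OUTPUT_PRICE = 0.015

short_request = estimate_on_demand_cost(2000, 500, INPUT_PRICE, OUTPUT_PRICE)
long_request = estimate_on_demand_cost(8000, 2000, INPUT_PRICE, OUTPUT_PRICE)

print(f"short request: ${short_request:.4f}")
print(f"long request:  ${long_request:.4f}")
```

With these placeholder prices, the request with four times as many input and output tokens costs four times as much, which is why trimming prompts and capping response length are the main levers for controlling on-demand cost.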