Throughput
Throughput is a performance metric that represents how much can be processed per unit of time.
It does not represent the upper limit on the number of tokens that can be handled at one time, so it is incorrect.
A legal team wants to feed an entire long contract to a model as a single input and have it summarized. What is the name of the property that represents how many tokens a model can take as input and produce as output at one time?
Choosing the property that represents the upper limit on tokens handled at one time.
Throughput
Throughput is a performance metric that represents how much can be processed per unit of time.
It does not represent the upper limit on the number of tokens that can be handled at one time, so it is incorrect.
Maximum response tokens
Maximum tokens is an inference parameter that specifies the upper limit on output length.
This question asks about the total capacity of input and output that a model can handle at one time (a property intrinsic to the model), which is the context window, so it is incorrect.
Vocabulary size
Vocabulary size is the number of unique words (token types) a model can handle.
It is about the richness of the vocabulary, not the upper limit on how much can be processed at one time, so it is incorrect.
Context window
Correct. The context window is the property that represents the amount of tokens a model can handle at one time (the upper limit on input plus output). Models with a large context window are well suited to handling long documents.
Remember the correct answer, the context window.
・The property that represents the amount of tokens a model can handle at one time (the upper limit on input plus output).
・The larger it is, the longer the document or the more conversation history can be passed in at once, which suits tasks such as long-text summarization.
Throughput (amount processed per unit of time), learning rate, and number of epochs are none of them the property that represents the upper limit on tokens handled at one time.