Token
This is incorrect. A token is the unit into which text is divided before a model processes it: a word, a sub-word string such as a single morpheme, or a symbol (for example, the phrase 'customer service' might be split into multiple tokens). LLMs process tokens in sequence, and token counts are also the basis for measuring input and output volume and for pricing. However, splitting text into tokens allows only surface-level comparison, so it cannot by itself capture the semantic similarity of texts that are worded differently.
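The limitation can be illustrated with a minimal sketch. The tokenizer below is a toy that splits on words and punctuation; it is an assumption made for illustration and is not the sub-word tokenizer of any particular LLM. The helper names `tokenize` and `jaccard_overlap` are likewise hypothetical. The sketch shows that comparing token sets rewards shared wording only, so two requests with the same meaning but different words score zero.

```python
import re


def tokenize(text: str) -> list[str]:
    """Toy tokenizer (illustrative only): lowercase the text, then split it
    into runs of letters/digits and single punctuation symbols."""
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())


def jaccard_overlap(a: str, b: str) -> float:
    """Surface-level similarity: shared distinct tokens / all distinct tokens."""
    tokens_a, tokens_b = set(tokenize(a)), set(tokenize(b))
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# A phrase becomes several tokens, including punctuation.
print(tokenize("Customer service, please!"))

# Same intent, different wording: no tokens in common.
print(jaccard_overlap("I want a refund", "Please return my money"))

# Nearly identical wording: high overlap.
print(jaccard_overlap("I want a refund", "I want a refund now"))
```

The first comparison yields no overlap even though both sentences ask for the same thing, which is why token splitting alone is not enough to measure semantic similarity.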