Real-time inference
Real-time inference returns a response immediately and synchronously with low latency, and assumes a short processing time per request (up to roughly tens of seconds). A chatbot replying to a user's message is a typical example.
It is not suited to large inputs that take several minutes to process, so this option is incorrect.
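
As a rough illustration of why long-running inputs do not fit this pattern, the sketch below models a synchronous call with a latency budget. It is a minimal example using only the Python standard library, not any particular service's API: `predict`, `real_time_infer`, and `LATENCY_BUDGET_S` are hypothetical names, and the budget is scaled down to a fraction of a second so the example runs quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical latency budget, scaled down from "tens of seconds" for the demo.
LATENCY_BUDGET_S = 0.5


def predict(text: str, work_seconds: float) -> str:
    """Stand-in for a model: sleeps to simulate processing time."""
    time.sleep(work_seconds)
    return f"reply to: {text}"


def real_time_infer(text: str, work_seconds: float,
                    budget_s: float = LATENCY_BUDGET_S) -> str:
    """Block the caller until the result is ready or the budget runs out."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(predict, text, work_seconds)
    try:
        # Synchronous wait: the caller gets nothing until this returns.
        return future.result(timeout=budget_s)
    except FutureTimeout:
        # The request took longer than a real-time caller can wait.
        return "timeout"
    finally:
        # Do not wait for the abandoned worker before returning.
        pool.shutdown(wait=False)


if __name__ == "__main__":
    print(real_time_infer("hello", work_seconds=0.01))       # short request
    print(real_time_infer("huge input", work_seconds=1.0))   # exceeds budget
```

A short request returns its reply directly to the waiting caller, while a request whose processing exceeds the budget ends in a timeout. That is the situation a multi-minute input would create.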