A company wants to deploy a trained model to production for low-latency, real-time inference while minimizing the overhead of building and operating its own servers. Which deployment approach BEST meets this requirement?