August 21, 2024: Google Cloud Run Integrates Nvidia L4 GPUs - Google Cloud Run now supports Nvidia L4 GPUs, enabling on-demand AI inference for generative models and other AI workloads. The feature is available in preview in select regions and offers fast token rates for models with up to 9 billion parameters, along with automatic scaling and pay-per-use pricing. It is intended to reduce latency and use resources efficiently in applications such as chatbots, image generators, and video streaming. Early adopters such as L'Oréal report low latency and good responsiveness.