
Lepton AI
Best for Managed AI Endpoints
Founded by the creator of the Caffe2 deep learning framework (now part of PyTorch), Lepton AI focuses on providing a highly optimized, developer-friendly platform for running AI models in production. It abstracts away the complexities of Kubernetes and CUDA optimization, offering a streamlined path from prototype to scalable API.
Lepton provides an integrated stack that maximizes GPU utilization, especially for inference. While you pay a premium over raw compute providers, the reduction in DevOps overhead and the out-of-the-box performance optimizations (like vLLM integration) make it highly cost-effective for enterprise engineering teams.
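To make the "prototype to scalable API" idea concrete, the sketch below queries a deployed chat model through an OpenAI-compatible endpoint using the standard `openai` Python client. The endpoint URL, model name, and token are placeholders, and the assumption that a given deployment exposes an OpenAI-compatible API is ours, not a claim from Lepton AI's documentation; check the provider's docs for the actual endpoint format.

```python
# Hypothetical example: calling a deployed inference endpoint from a client app.
# base_url, api_key, and the model name are placeholders; substitute the values
# shown in your deployment's dashboard or docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR-DEPLOYMENT-URL/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_TOKEN",
)

response = client.chat.completions.create(
    model="llama-3-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize what vLLM does in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```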
| Spec | Detail |
| --- | --- |
| GPU Models | A100, H100 |
| GPU Types | NVIDIA A100, NVIDIA H100 |
| Headquarters | San Francisco, CA |
| Founded | 2023 |
| Availability | Available Now |
| Website | lepton.ai |
💡 Pricing note: Rates shown are indicative. Final pricing depends on GPU model, reservation type (spot vs. on-demand), contract length, and region. Get an exact quote →
Lepton AI GPU cloud pricing starts at $1.00/hr and varies by GPU type, reservation model (on-demand, spot, or reserved), and region. Use the quote form to get exact pricing for your specific workload.
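As a back-of-the-envelope illustration of how the reservation model affects spend, the snippet below compares on-demand and reserved monthly costs. The hourly rates shown are hypothetical placeholders, not Lepton AI's actual prices.

```python
# Rough monthly cost comparison (hypothetical rates, not a quote).
HOURS_PER_MONTH = 730  # ~24 h x 30.4 days

on_demand_rate = 2.50  # $/GPU-hour, placeholder
reserved_rate = 1.75   # $/GPU-hour, placeholder
num_gpus = 8

on_demand_monthly = on_demand_rate * num_gpus * HOURS_PER_MONTH
reserved_monthly = reserved_rate * num_gpus * HOURS_PER_MONTH

print(f"On-demand: ${on_demand_monthly:,.0f}/month")
print(f"Reserved:  ${reserved_monthly:,.0f}/month "
      f"({1 - reserved_rate / on_demand_rate:.0%} saving)")
```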
Lepton AI offers NVIDIA A100 and H100 GPU instances. Availability varies by region and configuration. Contact the provider through ComputeStacker for current availability.
Lepton AI operates data centers in Asia, Europe, and North America. Choosing a region close to your users minimizes latency and can help with data residency compliance requirements.
Use the "Get a Quote" button on this page to submit your GPU requirements. ComputeStacker will forward your request to Lepton AI and other matching providers. You'll receive proposals within 24 hours — no commitment required.
Lepton AI offers high-performance GPU infrastructure suitable for large language model training and fine-tuning workloads. For large-scale distributed training, check the Specs tab for NVLink and InfiniBand interconnect availability.
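If you are sizing a multi-GPU training job, the minimal PyTorch DistributedDataParallel sketch below shows the kind of workload where NVLink and InfiniBand matter: NCCL performs the gradient all-reduce and uses whatever interconnect the instance exposes. This is generic PyTorch code, not a Lepton AI-specific API.

```python
# Minimal PyTorch DDP sketch (generic, not provider-specific).
# Launch with: torchrun --nproc_per_node=8 train_ddp.py
# NCCL handles the gradient all-reduce and uses NVLink / InfiniBand when available.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")      # GPU collective backend
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda(local_rank)  # stand-in for a real model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):                          # toy training loop
        x = torch.randn(32, 4096, device=local_rank)
        loss = model(x).pow(2).mean()
        loss.backward()                          # gradients synchronized across GPUs here
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```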
