
D-Wave Leap
Best for Researchers and enterprise teams tackling massive, intractable optimization and logistical ML problems.

Best for Containerized AI Applications, Low-Latency Edge Inference, Global Web Apps
Fly.io changed the game for application hosting by pushing Docker containers to the edge, and they are doing the exact same thing for artificial intelligence. By introducing L40S and A100 machines to their global network, developers can now deploy serverless AI inference endpoints right next to their users across the globe.
If you are building a consumer-facing AI application where milliseconds matter, Fly.io allows you to spin up a global edge AI GPU deployment in minutes. Their custom networking stack handles all the complex routing, meaning your users in Tokyo hit an Asian GPU, while your users in London hit a European GPU, automatically.
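For illustration, a minimal deployment config might look like the sketch below. Every name in it (the app name, region code, and the GPU-related attributes) is an assumption for illustration only; check Fly.io's current fly.toml documentation for the exact schema before using it.

```toml
# fly.toml — hypothetical config for a GPU-backed inference app.
# Attribute names below are illustrative assumptions, not a verified schema.
app = "edge-inference-demo"    # hypothetical app name
primary_region = "nrt"         # e.g. Tokyo; nearby users are routed here

[http_service]
  internal_port = 8080
  auto_stop_machines = true    # let idle GPU machines stop to save cost
  min_machines_running = 0

[[vm]]
  size = "l40s"                # GPU machine preset (assumption)
```

Deployed to several regions, a config along these lines leans on Fly's Anycast routing to send each request to the nearest running machine, which is what makes the Tokyo/London behavior described above automatic.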
| GPU Models | L40S, A100 |
| Headquarters | Chicago, IL, USA |
| Founded | 2017 |
| Availability | Available Now |
| Website | fly.io ↗ |
💡 Pricing note: Rates shown are indicative. Final pricing depends on GPU model, reservation type (spot vs. on-demand), contract length, and region. Get an exact quote →
Fly.io GPU cloud pricing starts at $0.40/hr; the rate you actually pay varies with GPU type, reservation model (on-demand vs. spot vs. reserved), and region. Use the quote form to get exact pricing for your specific workload.
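To turn an hourly rate into a budget figure, a quick back-of-the-envelope calculation helps. The $0.40/hr floor quoted above is indicative only; your actual rate will differ, so treat this as a sketch:

```python
# Rough monthly cost estimate from an hourly GPU rate.
# The $0.40/hr figure is the advertised floor; real rates vary by
# GPU model, reservation type, and region.

HOURS_PER_MONTH = 24 * 30  # 720, a common billing approximation


def monthly_cost(hourly_rate: float, utilization: float = 1.0) -> float:
    """Estimated monthly spend for one GPU machine at a given duty cycle."""
    return hourly_rate * HOURS_PER_MONTH * utilization


print(monthly_cost(0.40))        # always-on: 288.0
print(monthly_cost(0.40, 0.25))  # ~25% duty cycle: 72.0
```

A scale-to-zero deployment that only runs GPUs part of the time cuts the always-on estimate proportionally, which is why the reservation and scaling model matters as much as the list price.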
Fly.io offers L40S and A100 GPU instances. Availability varies by region and configuration. Contact the provider through ComputeStacker for current availability.
Fly.io operates data centers in Asia Pacific, Australia, EU Central, US East, US West. Choosing a region close to your users minimises latency and can help with data residency compliance requirements.
Use the "Get a Quote" button on this page to submit your GPU requirements. ComputeStacker will forward your request to Fly.io and other matching providers. You'll receive proposals within 24 hours — no commitment required.
Fly.io offers high-performance GPU infrastructure suitable for large language model training and fine-tuning workloads. For large-scale distributed training, check the Specs tab for NVLink and InfiniBand interconnect availability.
