⚠️ Treat these prices as estimates only; check each provider's official pricing page for current rates.
Provider | Services | Custom LLM hosting | Custom LLM pricing | Shared models | Shared model pricing |
---|---|---|---|---|---|
beam.cloud | Training + Inference | By the second | $3.29/h (A100 40GB) | No | No |
banana.dev | Inference | By the second | $7.49/h (A100 40GB) | No | No |
fal.ai | Inference | By the second | $7.02/h (A100 40GB) | Lots | By the second |
together.ai | Training + Inference | By the second | Based on model size | Lots | Per 1K tokens |
replicate.com | Inference | By the second | $8.28/h (A100 40GB) | Lots | $4.14/h |
brev.dev | Training + Inference | Hourly (?) | $1.10-3.67/h (A100 40GB) | ? | ? |
gradient.ai | Training + Inference | Per 1K tokens | Based on model size | ||
titanml.co | Training + Inference | ? | ? | ? | ? |
openpipe.ai | Training + Inference | Per 1K tokens | Based on model size | Same as custom | Same as custom |
endpoints.anyscale.com | Inference | No | No | Llama 2, Code Llama | Per 1M tokens |
endpoints.huggingface.co | |||||
modal.com | |||||
fireworks.ai | |||||
lepton.ai | |||||
For all options, latency and throughput are unknown factors.
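
To make the hourly figures easier to compare, here is a minimal sketch that converts them into a rough monthly bill under per-second billing. The rates are copied from the table and are estimates only. The workload (10,000 requests per day at 2 seconds of GPU time each) and the helper names `per_second_cost` and `monthly_cost` are assumptions for illustration. The model also ignores cold starts, idle time, and any minimum billing increment, which can change the ranking in practice.

```python
# Hourly A100 40GB rates (USD/hour) copied from the table above; estimates only.
HOURLY_RATE_USD = {
    "beam.cloud": 3.29,
    "banana.dev": 7.49,
    "fal.ai": 7.02,
    "replicate.com": 8.28,
}


def per_second_cost(hourly_rate_usd: float, seconds: float) -> float:
    """Cost of `seconds` of compute billed by the second at an hourly rate."""
    return hourly_rate_usd / 3600 * seconds


def monthly_cost(hourly_rate_usd: float, requests_per_day: int,
                 seconds_per_request: float, days: int = 30) -> float:
    """Hypothetical workload model: only busy GPU seconds are billed."""
    busy_seconds = requests_per_day * seconds_per_request * days
    return per_second_cost(hourly_rate_usd, busy_seconds)


if __name__ == "__main__":
    # Assumed workload: 10,000 requests/day, 2 s of GPU time per request.
    for provider, rate in sorted(HOURLY_RATE_USD.items(), key=lambda kv: kv[1]):
        print(f"{provider:15s} ${monthly_cost(rate, 10_000, 2.0):,.2f}/month")
```

Providers that price per token or by model size (together.ai, gradient.ai, openpipe.ai, endpoints.anyscale.com) are left out because the table gives no rate to plug in.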