List of hosted inference providers

⚠️ Treat these prices as estimates only; check each provider's official pricing page for current rates.

Provider	Services	Custom LLM hosting (billing unit)	Custom LLM pricing	Shared models	Shared model pricing
beam.cloud	Training + Inference	By the second	$3.29/h (A100 40GB)	No	No
banana.dev	Inference	By the second	$7.49/h (A100 40GB)	No	No
fal.ai	Inference	By the second	$7.02/h (A100 40GB)	Lots	By the second
together.ai	Training + Inference	By the second	Based on model size	Lots	Per 1K tokens
replicate.com	Inference	By the second	$8.28/h (A100 40GB)	Lots	$4.14/h
brev.dev	Training + Inference	Hourly (?)	$1.10-3.67/h (A100 40GB)	?	?
gradient.ai	Training + Inference	Per 1K tokens	Based on model size
titanml.co	Training + Inference	?	?	?	?
openpipe.ai	Training + Inference	Per 1K tokens	Based on model size	Same as custom	Same as custom
endpoints.anyscale.com	Inference	No	No	Llama 2, Code Llama	Per 1M tokens
endpoints.huggingface.co
modal.com
fireworks.ai
lepton.ai
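
Because several providers bill by the second while the table lists hourly A100 40GB rates, a rough cost comparison needs a small conversion. The sketch below shows that arithmetic for the four providers with a single listed hourly rate. The workload (10,000 requests per day at 2 seconds of GPU time each) is an assumption chosen only for illustration, and the calculation ignores cold starts, idle time, and minimum billing increments, which can change the real bill considerably.

```python
# Hourly A100 40GB rates (USD) copied from the table above.
# These are estimates; verify them on each provider's pricing page.
HOURLY_RATE_USD = {
    "beam.cloud": 3.29,
    "banana.dev": 7.49,
    "fal.ai": 7.02,
    "replicate.com": 8.28,
}


def cost_for_seconds(hourly_rate_usd: float, seconds: float) -> float:
    """Cost of `seconds` of GPU time under per-second billing."""
    return hourly_rate_usd / 3600 * seconds


def monthly_cost(hourly_rate_usd: float, requests_per_day: int,
                 seconds_per_request: float, days: int = 30) -> float:
    """Cost of the busy GPU time only (assumes no idle or cold-start charges)."""
    busy_seconds = requests_per_day * seconds_per_request * days
    return cost_for_seconds(hourly_rate_usd, busy_seconds)


if __name__ == "__main__":
    # Assumed workload: 10,000 requests/day, 2 s of GPU time per request.
    for provider, rate in sorted(HOURLY_RATE_USD.items(), key=lambda kv: kv[1]):
        print(f"{provider:15s} ${monthly_cost(rate, 10_000, 2.0):,.2f}/month")
```

Providers that price per token or by model size cannot be compared this way without also knowing tokens per request and the model in question.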

Latency and throughput are unknown for all of the providers listed above.
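
One way to fill in the missing latency and throughput numbers is to time a fixed set of requests against each provider. The following is a minimal sketch of such a harness, not a tested benchmark: `benchmark` and `fake_provider_call` are hypothetical names, and the fake call merely sleeps in place of a real request. To use it, replace the stand-in with a function that sends one request to the provider and returns the number of tokens generated.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(call: Callable[[], int], runs: int = 5) -> Dict[str, float]:
    """Run `call` sequentially `runs` times.

    `call` must perform one request and return the number of tokens generated.
    """
    latencies = []
    tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        t0 = time.perf_counter()
        tokens += call()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "median_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
        "tokens_per_s": tokens / elapsed,
    }


def fake_provider_call() -> int:
    """Hypothetical stand-in for a real request: sleeps, then reports 50 tokens."""
    time.sleep(0.01)
    return 50


if __name__ == "__main__":
    print(benchmark(fake_provider_call))
```

Sequential requests measure single-request latency only; throughput under concurrent load would need parallel callers and is likely to differ.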