AI Models

The Heroku Managed Inference and Agent add-on supports the following models. The add-on is hosted in two regions: us and eu. However, the add-on can be provisioned and accessed from apps in any Heroku region. Select a model to view information on rate limits, prompt caching, and implementation.

Model Documentation Model ID Region Supported Inputs Supported Outputs API Endpoint Model Source Description
Claude Opus 4.5 claude-opus-4-5 US, EU text, image text /v1/chat/completions Anthropic A next-generation, frontier LLM that supports chat, tool-calling, autonomous coding, effort control, and enhanced reasoning.
Claude 4.5 Sonnet claude-4-5-sonnet US, EU text, image text /v1/chat/completions Anthropic A state-of-the-art LLM optimized for enterprise apps that supports chat, tool-calling, and enhanced reasoning.
Claude 4.5 Haiku claude-4-5-haiku US, EU text, image text /v1/chat/completions Anthropic A state-of-the-art LLM that supports chat, tool-calling, and enhanced reasoning.
Nova 2 Lite nova-2-lite US, EU text, image, video text /v1/chat/completions Amazon A fast and cost-effective LLM that supports conversational chat, tool-calling, and advanced reasoning with extended context.
Kimi K2 Thinking kimi-k2-thinking US text text /v1/chat/completions Moonshot AI An open-weight LLM that supports conversational chat, tool-calling, and chain-of-thought processing.
MiniMax M2 minimax-m2 US text text /v1/chat/completions MiniMax An open-weight LLM that supports conversational chat, tool-calling, and programming tasks.
Qwen3 Coder 480B qwen3-coder-480b US text text /v1/chat/completions Qwen An open-weight LLM that supports conversational chat, tool-calling, and agentic coding.
Qwen3 235B qwen3-235b US text text /v1/chat/completions Qwen An open-weight LLM that supports conversational chat, tool-calling, complex reasoning, and agentic coding.
Claude 4 Sonnet claude-4-sonnet US, EU text, image text /v1/chat/completions Anthropic An intelligent and detail-oriented LLM that supports chat, tool-calling, and enhanced reasoning.
Claude 3.7 Sonnet claude-3-7-sonnet US, EU text, image text /v1/chat/completions Anthropic An intelligent and detail-oriented LLM that supports chat, tool-calling, and enhanced reasoning.
Claude 3.5 Sonnet Latest claude-3-5-sonnet-latest US, EU text, image text /v1/chat/completions Anthropic A fast and affordable LLM that supports chat and tool-calling.
Claude 3.5 Haiku claude-3-5-haiku US, EU text, image text /v1/chat/completions Anthropic An affordable and straightforward LLM that supports chat and tool-calling.
Claude 3 Haiku claude-3-haiku EU text, image text /v1/chat/completions Anthropic A fast and affordable LLM that supports chat and tool-calling.
Nova Lite nova-lite US, EU text, image, video text /v1/chat/completions Amazon A fast and cost-effective LLM.
Nova Pro nova-pro US, EU text, image, video text /v1/chat/completions Amazon A high-performance LLM designed for complex tasks.
OpenAI gpt-oss-120b gpt-oss-120b US, EU text text /v1/chat/completions OpenAI An open-weight LLM that supports chat and tool-calling.
Cohere Embed Multilingual cohere-embed-multilingual US, EU text, image embedding /v1/embeddings Cohere A state-of-the-art embedding model that supports multiple languages and can be helpful for developing RAG search.
Stable Image Ultra stable-image-ultra US, EU text image /v1/images/generations Stability AI A state-of-the-art diffusion (image generation) model.
Cohere Rerank 3.5 cohere-rerank-3-5 US, EU text score /v1/rerank Cohere A reranking model that offers enhanced reasoning, broad data compatibility, and multilingual support.
Amazon Rerank 1.0 amazon-rerank-1-0 US, EU text score /v1/rerank Amazon A reliable, high-performing reranking model backed by AWS infrastructure.