Table of Contents [expand]
Last updated October 24, 2025
Our model cards contain documentation for each available AI model.
Available Models
The Heroku Managed Inference and Agent add-on is hosted in two regions: us and eu. However, the add-on can be provisioned and accessed from apps in any Heroku region.
Each region offers slightly different models.
Region: us
| Model Documentation | Type | API Endpoint | Model Source | Description |
|---|---|---|---|---|
| Claude 4.5 Sonnet | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art LLM optimized for enterprise apps that supports chat, tool-calling, and enhanced reasoning. |
| Claude 4.5 Haiku | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art LLM that supports chat, tool-calling, and enhanced reasoning. |
| Claude 4 Sonnet | text → text |
/v1/chat/completions | Anthropic | An intelligent and detail-oriented LLM that supports chat, tool-calling, and enhanced reasoning. |
| Claude 3.7 Sonnet | text → text |
/v1/chat/completions | Anthropic | An intelligent and detail-oriented LLM that supports chat, tool-calling, and enhanced reasoning. |
| Claude 3.5 Sonnet Latest | text → text |
/v1/chat/completions | Anthropic | A fast and affordable LLM that supports chat and tool-calling. |
| Claude 3.5 Haiku | text → text |
/v1/chat/completions | Anthropic | An affordable and straightforward LLM that supports chat and tool-calling. |
| Amazon Nova Lite | text → text |
/v1/chat/completions | Amazon | A fast and cost-effective LLM. |
| Amazon Nova Pro | text → text |
/v1/chat/completions | Amazon | A high-performance LLM designed for complex tasks. |
| OpenAI gpt-oss-120b | text → text |
/v1/chat/completions | OpenAI | An open-weight LLM that supports chat and tool-calling. |
| Cohere Embed Multilingual | text → embedding |
/v1/embeddings | Cohere | A state-of-the-art embedding model that supports multiple languages and can be helpful for developing RAG (Retrieval Augmented Generation) search. |
| Stable Image Ultra | text → image |
/v1/images/generations | Stability AI | A state-of-the-art diffusion (image generation) model. |
Region: eu
| Model Documentation | Type | API Endpoint | Model Source | Description |
|---|---|---|---|---|
| Claude 4.5 Sonnet | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art LLM that supports chat, tool-calling, and enhanced reasoning. |
| Claude 4.5 Haiku | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art LLM that supports chat, tool-calling, and enhanced reasoning. |
| Claude 4 Sonnet | text → text |
/v1/chat/completions | Anthropic | An intelligent and detail-oriented LLM that supports chat, tool-calling, and enhanced reasoning. |
| Claude 3.7 Sonnet | text → text |
/v1/chat/completions | Anthropic | An intelligent and detail-oriented LLM that supports chat, tool-calling, and enhanced reasoning. |
| Claude 3 Haiku | text → text |
/v1/chat/completions | Anthropic | A fast and affordable LLM that supports chat and tool-calling. |
| Amazon Nova Lite | text → text |
/v1/chat/completions | Amazon | A fast and cost-effective LLM. |
| Amazon Nova Pro | text → text |
/v1/chat/completions | Amazon | A high-performance LLM designed for complex tasks. |
| Cohere Embed Multilingual | text → embedding |
/v1/embeddings | Cohere | A state-of-the-art embedding model that supports multiple languages and can be helpful for developing RAG (Retrieval Augmented Generation) search. |