Managed Inference and Agent API Model Cards
Last updated December 18, 2024
This article is a work in progress, or documents a feature that is not yet released to all users. This article is unlisted. Only those with the link can access it.
Table of Contents
The Heroku Managed Inference and Agent add-on is currently in pilot. The products offered as part of the pilot aren’t intended for production use and are considered as a Beta Service and are subject to the Beta Services terms at https://www.salesforce.com/company/legal/agreements.jsp.
Our model cards contain documentation for each available AI model.
Available Models
The Heroku Managed Inference and Agent add-on is hosted in two regions: us
and eu
. However, the add-on can be provisioned and accessed from apps in any Heroku region.
Each region offers slightly different models.
Region: us
Model Documentation | Type | API Endpoint | Model Source | Description |
---|---|---|---|---|
claude-3-5-sonnet-latest | text → text |
v1/chat/completions | Anthropic | A state-of-the-art large language model that supports chat and tool-calling. |
claude-3-5-haiku | text → text |
v1/chat/completions | Anthropic | A faster, more affordable large language model that supports chat and tool-calling. |
cohere-embed-multilingual | text → embedding |
v1/embeddings | Cohere | A state-of-the-art embedding model that supports multiple languages. This model is helpful for developing RAG (Retrieval Augmented Generation) search. |
stable-image-ultra | text → image |
v1/images/generations | Stability AI | A state-of-the-art diffusion (image generation) model. |
Region: eu
Model Documentation | Type | API Endpoint | Model Source | Description |
---|---|---|---|---|
claude-3-5-sonnet | text → text |
v1/chat/completions | Anthropic | A state-of-the-art large language model that supports chat and tool-calling. |
claude-3-haiku | text → text |
v1/chat/completions | Anthropic | A faster, more affordable large language model that supports chat and tool-calling. |
cohere-embed-multilingual | text → embedding |
v1/embeddings | Cohere | A state-of-the-art embedding model that supports multiple languages. This model is helpful for developing RAG (Retrieval Augmented Generation) search. |