Managed Inference and Agent API with Latest Claude 3.5 Sonnet
Last updated January 24, 2025
This article is a work in progress, or documents a feature that is not yet released to all users. This article is unlisted. Only those with the link can access it.
The Heroku Managed Inference and Agent add-on is currently in pilot. The products offered as part of the pilot aren’t intended for production use. They are considered a Beta Service and are subject to the Beta Services terms at https://www.salesforce.com/company/legal/agreements.jsp.
Claude 3.5 Sonnet is a text-to-text large language model (LLM) in Anthropic’s Claude 3.5 family, supporting both conversational chat and tool-calling capabilities. It offers advanced intelligence, speed, and cost-efficiency, outperforming previous models like Claude 3 Opus.
This version, claude-3-5-sonnet-latest, is slightly newer than the claude-3-5-sonnet model offered in the EU region and performs slightly better.
- Model ID: claude-3-5-sonnet-latest
- Region: us
When to Use This Model
Claude 3.5 Sonnet is well-suited for sophisticated code generation, complex chat interactions, and orchestrating multi-step workflows. It’s more expensive than Claude 3 Haiku, but generally more intelligent and detail-oriented.
Usage
Claude 3.5 Sonnet follows our Claude /v1/chat/completions API schema.
To provision access to the model, attach claude-3-5-sonnet-latest to your app $APP_NAME:
heroku ai:models:create -a $APP_NAME claude-3-5-sonnet-latest
Using config variables, you can invoke claude-3-5-sonnet-latest in a variety of ways:
- Heroku CLI ai plugin (heroku ai:models:call)
- curl
- Python
- Ruby
- JavaScript
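Application code can pick up the same config variables from the process environment. The following is a minimal Python sketch, assuming the add-on sets INFERENCE_URL, INFERENCE_KEY, and INFERENCE_MODEL_ID on the app (the names used in the curl example in this article). The InferenceConfig class and load_inference_config helper are hypothetical names introduced for illustration, not part of any Heroku library.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceConfig:
    """Connection details for the model (hypothetical helper type)."""
    url: str
    key: str
    model_id: str


def load_inference_config(env=None):
    """Read the three inference config vars, failing fast if any is missing.

    Assumption: the variable names match those used in the curl example.
    """
    env = os.environ if env is None else env
    names = ("INFERENCE_URL", "INFERENCE_KEY", "INFERENCE_MODEL_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing config vars: " + ", ".join(missing))
    return InferenceConfig(
        url=env["INFERENCE_URL"].rstrip("/"),  # drop a trailing slash, if any
        key=env["INFERENCE_KEY"],
        model_id=env["INFERENCE_MODEL_ID"],
    )
```

Calling load_inference_config() with no arguments reads from os.environ. Passing a plain dict instead is convenient in tests, so that no real key is needed.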
Example curl Request
Get started quickly with an example request:
export INFERENCE_MODEL_ID=$(heroku config:get -a $APP_NAME INFERENCE_MODEL_ID)
export INFERENCE_KEY=$(heroku config:get -a $APP_NAME INFERENCE_KEY)
export INFERENCE_URL=$(heroku config:get -a $APP_NAME INFERENCE_URL)
curl "$INFERENCE_URL/v1/chat/completions" \
  -H "Authorization: Bearer $INFERENCE_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "$INFERENCE_MODEL_ID",
  "messages": [
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there! How can I assist you today?" },
    { "role": "user", "content": "What's the weather like in Portland, Oregon right now?" }
  ],
  "temperature": 0.5,
  "max_tokens": 100,
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Fetches the current weather for a given city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "The name of the city to get weather for."
            }
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "top_p": 0.9
}
EOF
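The same request can be sketched in Python using only the standard library. This is an illustrative example rather than an official client: the function names build_chat_payload, build_request, and send_chat_request are hypothetical, and the code assumes the endpoint accepts a JSON body with the fields shown in the curl example. The shape of the response is not described here, so the sketch returns the parsed JSON without interpreting it.

```python
import json
import urllib.request


def build_chat_payload(model_id, messages, max_tokens=100):
    """Build a request body with the same fields as the curl example."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Fetches the current weather for a given city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The name of the city to get weather for.",
                    }
                },
                "required": ["city"],
            },
        },
    }
    return {
        "model": model_id,
        "messages": messages,
        "temperature": 0.5,
        "max_tokens": max_tokens,
        "stream": False,
        "tools": [weather_tool],
        "tool_choice": "auto",
        "top_p": 0.9,
    }


def build_request(base_url, key, payload):
    """Create the POST request without sending it."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(base_url, key, payload, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_request(base_url, key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, read INFERENCE_URL, INFERENCE_KEY, and INFERENCE_MODEL_ID from os.environ, build the payload with a list of message dicts such as {"role": "user", "content": "Hello!"}, and pass the result to send_chat_request. Keeping build_request separate from the network call makes it possible to inspect the URL, headers, and body before anything is sent.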