Managed Inference and Agents API with Claude 4.5 Haiku

Last updated February 12, 2026

Claude 4.5 Haiku is a large language model (LLM) in Anthropic’s Claude family that supports conversational chat, tool-calling, and enhanced reasoning for complex tasks with extended thinking. It offers a fast, intelligent, and cost-effective solution you can use to power your AI applications.

Model ID: claude-4-5-haiku
Region: us, eu

When to Use This Model

Claude 4.5 Haiku supports a variety of common use cases, including rapid responses, content moderation, and inventory management. It’s optimized for high-throughput tasks and real-time interactions.

Usage

Claude 4.5 Haiku follows our Claude /v1/chat/completions API schema.

To provision access to the model, attach a Managed Inference and Agents add-on to your app $APP_NAME:

heroku addons:create heroku-inference:standard -a $APP_NAME

Using config variables, you can invoke the model in various ways:

Heroku CLI ai plugin (heroku ai:models:call)
curl
Python
Ruby
JavaScript
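
As a minimal sketch of the Python route, the snippet below builds a request from the add-on's config variables using only the standard library. The helper names `build_chat_request` and `send_chat_request` are hypothetical and are not part of any Heroku SDK. The sketch assumes that `INFERENCE_URL` and `INFERENCE_KEY` are available in the environment and that the endpoint accepts the same JSON body as the curl examples in this article.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, prompt,
                       model="claude-4-5-haiku", max_tokens=100):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To use it, pass `os.environ["INFERENCE_URL"]` and `os.environ["INFERENCE_KEY"]` to `build_chat_request`, then hand the result to `send_chat_request`. Keeping construction and sending separate makes the request easy to inspect without a network call.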

Multimodal Support

Supported inputs: text, image
Supported outputs: text

Rate Limits

Maximum requests per minute: 200
Maximum tokens per minute: 800,000
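
One way to stay under these limits is to throttle on the client side. The sketch below tracks requests and tokens over a sliding 60-second window. The class name `MinuteBudget` is hypothetical, and the sliding-window model is an assumption: this article does not state how the service measures a minute or which tokens count toward the limit. The caller supplies its own token estimate for each request.

```python
import time
from collections import deque


class MinuteBudget:
    """Client-side sliding 60-second window over request and token counts."""

    def __init__(self, max_requests=200, max_tokens=800_000, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.clock = clock
        self.events = deque()  # (timestamp, tokens) for each admitted request

    def _expire(self, now):
        # Drop entries that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60:
            self.events.popleft()

    def try_acquire(self, tokens):
        """Record the request and return True if it fits in the current window."""
        now = self.clock()
        self._expire(now)
        used = sum(t for _, t in self.events)
        if len(self.events) >= self.max_requests or used + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        return True
```

Call `try_acquire(estimated_tokens)` before each request and wait or queue when it returns `False`. The injectable `clock` parameter exists only to make the limiter testable without sleeping.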

Prompt Caching

Prompt caching is supported for system prompts and tools. A prompt must contain at least 4,096 tokens to be cached.

Example curl Requests

Text to Text

export INFERENCE_KEY=$(heroku config:get -a $APP_NAME INFERENCE_KEY)
export INFERENCE_URL=$(heroku config:get -a $APP_NAME INFERENCE_URL)

curl "$INFERENCE_URL/v1/chat/completions" \
  -H "Authorization: Bearer $INFERENCE_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "claude-4-5-haiku",
  "messages": [
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there! How can I assist you today?" },
    { "role": "user", "content": "What's the weather like in Portland, Oregon right now?" }
  ],
  "temperature": 0.5,
  "max_tokens": 100,
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Fetches the current weather for a given city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "The name of the city to get weather for."
            }
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "top_p": 0.9
}
EOF
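
Because the request above sets `"tool_choice": "auto"`, the model may answer with a tool call instead of plain text. The sketch below shows one way to dispatch such calls in Python. The response shape (`choices[0].message.tool_calls`, with `function.arguments` as a JSON string) is an assumption based on common chat-completions conventions, since this article does not show a response. The `get_weather` body and the `run_tool_calls` helper are hypothetical placeholders.

```python
import json


def get_weather(city):
    # Hypothetical stand-in; a real implementation would query a weather service.
    return {"city": city, "forecast": "unknown"}


# Maps tool names declared in the request to local Python callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(response):
    """Return one 'tool' role message per tool call in the first choice."""
    message = response["choices"][0]["message"]
    results = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        # Assumed: arguments arrive as a JSON-encoded string.
        arguments = json.loads(function["arguments"])
        output = TOOLS[function["name"]](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results
```

Under the same assumption, the returned messages would be appended to the conversation, after the assistant message that requested the tools, and sent in a follow-up request so the model can produce its final answer.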

Image to Text

curl -X POST $INFERENCE_URL/v1/chat/completions \
  -H "Authorization: Bearer $INFERENCE_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Forwarded-Proto: https" \
  -d @- <<EOF
{
  "model": "claude-4-5-haiku",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "What do you see in this image?"},
      {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/0/09/A_chinstrap_penguin_%28Pygoscelis_antarcticus%29_on_Deception_Island_in_Antarctica.jpg/960px-A_chinstrap_penguin_%28Pygoscelis_antarcticus%29_on_Deception_Island_in_Antarctica.jpg"}}
    ]
  }]
}
EOF
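
The same image request body can be assembled in Python. The sketch below mirrors the JSON structure of the curl example above; the function name `image_question` is hypothetical, and the body is only constructed here, not sent.

```python
import json


def image_question(text, image_url, model="claude-4-5-haiku"):
    """Return a request body pairing a text prompt with an image URL."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": text},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }


def to_json(body):
    """Serialize the body for use as the POST payload."""
    return json.dumps(body)
```

The `content` field is a list rather than a string, which is what lets a single user message carry both the text part and the image part.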
