Managed Inference and Agents API with OpenAI gpt-oss-120b

Last updated December 08, 2025

OpenAI’s gpt-oss-120b is an open-weight, text-to-text large language model (LLM) that supports both conversational chat and tool calling. Because its weights are openly available, it’s a transparent option that you can customize for your AI workflows.

Model ID: gpt-oss-120b
Regions: us, eu

When to Use This Model

OpenAI’s gpt-oss-120b is well suited to natural language understanding, code generation, and complex problem-solving. Because it’s open-weight, it can also be fine-tuned for specific use cases.

Usage

OpenAI’s gpt-oss-120b follows our /v1/chat/completions API schema.

To provision access to the model, attach openai-gpt-oss-120b to your app $APP_NAME:

heroku ai:models:create -a $APP_NAME openai-gpt-oss-120b

You can invoke openai-gpt-oss-120b in a variety of ways:

Heroku CLI ai plugin (heroku ai:models:call)
curl
Python
Ruby
JavaScript
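
As one illustration of the Python option, here is a minimal sketch that uses only the standard library. It assumes the INFERENCE_URL, INFERENCE_KEY, and INFERENCE_MODEL_ID config vars from the curl example in this article are available as environment variables. The helper names build_chat_request and chat are hypothetical and not part of any Heroku SDK.

```python
import json
import os
import urllib.request


def build_chat_request(prompt, base_url, api_key, model_id, max_tokens=100):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(prompt):
    """Send a single-turn prompt and return the parsed JSON response."""
    request = build_chat_request(
        prompt,
        base_url=os.environ["INFERENCE_URL"],
        api_key=os.environ["INFERENCE_KEY"],
        model_id=os.environ["INFERENCE_MODEL_ID"],
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling chat("Hello!") returns the decoded response body as a dictionary. Splitting request construction from sending keeps the payload easy to inspect or unit test without network access.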

Rate Limits

Maximum requests per minute: 200
Maximum tokens per minute: 800,000
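
To stay under these limits, a client can track its own usage before sending each request. The following is a minimal sketch of a sliding-window budget tracker, not the service's enforcement logic. The class name SlidingWindowLimiter is hypothetical, the per-request token figure is the caller's own estimate, and how the service counts tokens against the limit is an assumption to verify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side tracker for per-minute request and token budgets."""

    def __init__(self, max_requests=200, max_tokens=800_000,
                 window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window = window
        self.clock = clock
        self.events = deque()  # (timestamp, tokens) per recorded request

    def _expire(self, now):
        # Drop requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()

    def wait_time(self, tokens):
        """Return seconds to wait before a request of `tokens` may fit."""
        if tokens > self.max_tokens:
            raise ValueError("request exceeds the per-window token budget")
        now = self.clock()
        self._expire(now)
        tokens_used = sum(used for _, used in self.events)
        fits_requests = len(self.events) < self.max_requests
        fits_tokens = tokens_used + tokens <= self.max_tokens
        if fits_requests and fits_tokens:
            return 0.0
        # Wait until the oldest request leaves the window, then re-check.
        return self.events[0][0] + self.window - now

    def record(self, tokens):
        """Record a request that was just sent."""
        self.events.append((self.clock(), tokens))
```

A caller would sleep for wait_time(estimated_tokens) until it returns 0.0, send the request, and then call record(estimated_tokens). The server can still reject a request, so handling rate-limit errors remains necessary.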

Prompt Caching

Prompt caching is not supported for gpt-oss-120b.

Example curl Request

Get started quickly with an example request:

export INFERENCE_MODEL_ID=$(heroku config:get -a example-app INFERENCE_MODEL_ID)
export INFERENCE_KEY=$(heroku config:get -a example-app INFERENCE_KEY)
export INFERENCE_URL=$(heroku config:get -a example-app INFERENCE_URL)

curl "$INFERENCE_URL/v1/chat/completions" \
  -H "Authorization: Bearer $INFERENCE_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "$INFERENCE_MODEL_ID",
  "messages": [
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there! How can I assist you today?" },
    { "role": "user", "content": "What's the weather like in Portland, Oregon right now?" }
  ],
  "temperature": 0.5,
  "max_tokens": 100,
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Fetches the current weather for a given city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "The name of the city to get weather for."
            }
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "top_p": 0.9
}
EOF
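
When the model decides to use the get_weather tool declared in the request above, your application runs the function and sends the result back in a follow-up request. The sketch below shows one way to handle that step. It assumes the response follows the OpenAI-style shape, where choices[0].message.tool_calls holds entries with an id and a function whose arguments field is a JSON string. Verify this against the /v1/chat/completions API reference. The get_weather implementation, TOOL_REGISTRY, and run_tool_calls are hypothetical placeholders.

```python
import json


def get_weather(city):
    """Hypothetical stand-in; a real version would query a weather service."""
    return {"city": city, "forecast": "unknown"}


# Maps tool names declared in the request to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message):
    """Run each requested tool and return `tool` role messages with results."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        handler = TOOL_REGISTRY[function["name"]]
        # Assumed: arguments arrive as a JSON-encoded string.
        arguments = json.loads(function["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(handler(**arguments)),
        })
    return results
```

To continue the conversation, append the assistant message and the returned tool messages to the messages array, then send another request so the model can produce its final answer.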
