Heroku’s Managed Inference and Agents Add-on API offers broad OpenAI compatibility that works for most use cases. You can use familiar OpenAI SDK patterns while gaining access to Heroku’s infrastructure, security, and specialized tools.
Get Started With the OpenAI SDK
Prerequisites
To use the OpenAI SDK with Heroku, you need a Heroku app with the Managed Inference and Agents add-on attached.
We recommend always using the latest version of the OpenAI SDK.
Set Up a Heroku App to Use the OpenAI SDK
When you attach a model resource to your app using `heroku ai:models:create`, the add-on automatically adds some config vars to your app's environment. Include your add-on's config vars in your code:

- Set the base URL to the `INFERENCE_URL` for your add-on.
- Set the API key to the `INFERENCE_KEY` for your add-on.
- Set the model in the request to the `INFERENCE_MODEL_ID`.
Example Setup for Python OpenAI SDK
```python
from openai import OpenAI
import os

api_key = os.getenv("INFERENCE_KEY")
api_url = os.getenv("INFERENCE_URL")
model = os.getenv("INFERENCE_MODEL_ID")

client = OpenAI(
    api_key=api_key,          # Your Managed Inference API key
    base_url=api_url + "/v1/" # Managed Inference API endpoint, for example, https://us.inference.heroku.com
)

response = client.chat.completions.create(
    model=model,  # Add-on plan name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What should I build today?"}
    ],
)
print(response.choices[0].message.content)
```
Request Parameter Support
Request Fields
Request Field | Supported?
---|---
model | Yes, use the add-on plan name
messages | Yes
modalities | Yes, for type text
logprobs | Yes
max_completion_tokens | Yes
parallel_tool_calls | Yes for Anthropic and OpenAI models, ignored for Nova models
reasoning_effort | Yes
stop | Yes
stream | Yes
stream_options | Yes
temperature | Yes
tool_choice | Yes* (see Tool Choice)
tool_options | Yes
tools | Yes
top_p | Yes
n | Yes, but ignored
frequency_penalty | Yes, but ignored
audio | Yes, but ignored
logit_bias | Yes, but ignored
store | Yes, but ignored
metadata | Yes, but ignored
prediction | Yes, but ignored
presence_penalty | Yes, but ignored
prompt_cache_key | Yes, but ignored
response_format | Yes, but ignored
safety_identifier | Yes, but ignored
service_tier | Yes, but ignored
top_logprobs | Yes, but ignored
verbosity | Yes, but ignored
web_search_options | Yes, but ignored
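Because `stream` and `stream_options` are supported, you can consume responses incrementally as in the standard OpenAI SDK: with `stream=True`, the completion arrives as a series of chunks whose content deltas you concatenate. Below is a minimal sketch of the accumulation loop, run here against stand-in chunk objects rather than a live call; the `Fake*` classes are illustrative only and mirror just the fields the loop reads.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Stand-ins for the SDK's ChatCompletionChunk objects (illustrative only).
@dataclass
class FakeDelta:
    content: Optional[str]

@dataclass
class FakeChoice:
    delta: FakeDelta

@dataclass
class FakeChunk:
    choices: List[FakeChoice]

def accumulate_stream(chunks: Iterable) -> str:
    """Concatenate the content deltas of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

# With a real client, the chunks would come from:
# stream = client.chat.completions.create(model=model, messages=messages, stream=True)
fake_stream = [
    FakeChunk(choices=[FakeChoice(delta=FakeDelta(content="Hello"))]),
    FakeChunk(choices=[FakeChoice(delta=FakeDelta(content=", world"))]),
    FakeChunk(choices=[FakeChoice(delta=FakeDelta(content=None))]),  # final chunk has no delta content
]
print(accumulate_stream(fake_stream))  # Hello, world
```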
Request Messages
Message Type | Supported?
---|---
Developer | Yes
System | Yes
User | Yes, except for audio and file content types
Assistant | Yes, except for audio content type
Tool | Yes
Tool Choice
The `tool_choice` parameter is supported, except for custom tool choice. To learn more about using `tool_choice`, see the OpenAI docs.
Tools
Function type tools are supported. Custom tools are unsupported.
To learn more about using tools, see the OpenAI docs.
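A function-type tool definition follows the standard OpenAI schema, and `tool_choice` can force the model to call a specific function. The sketch below builds both structures locally; the `get_weather` tool is a made-up example, not a Heroku-provided tool. With a real client, you'd pass them as the `tools` and `tool_choice` arguments to `client.chat.completions.create`.

```python
import json

# Hypothetical function-type tool in the standard OpenAI schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

# Force the model to call get_weather; "auto" and "none" are also valid values.
tool_choice = {"type": "function", "function": {"name": "get_weather"}}

print(json.dumps(tool_choice))
```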
Response Parameter Support
Response Fields
Response Field | Supported?
---|---
choices | Yes
created | Yes
id | Yes
model | Yes
object | Yes
service_tier | No
system_fingerprint | Yes
usage | Yes
Choice Fields
Choice Field | Supported?
---|---
index | Yes
message | Yes, except for annotations and audio
logprobs | Yes
finish_reason | Yes
Usage Fields
Usage Field | Supported?
---|---
prompt_tokens | Yes
completion_tokens | Yes
total_tokens | Yes
Other usage field types are unsupported.
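The supported fields map directly onto the response object the SDK returns. The sketch below pulls them out of a raw response body; the dict and its values are a fabricated sample for illustration, not real API output.

```python
# Hypothetical sample of a chat completion response body (illustrative values only).
sample_response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1710000000,
    "model": "example-model",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15},
}

# Extract the supported fields described in the tables above.
choice = sample_response["choices"][0]
content = choice["message"]["content"]
finish_reason = choice["finish_reason"]
total_tokens = sample_response["usage"]["total_tokens"]
print(content, finish_reason, total_tokens)  # Hello! stop 15
```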
Limitations
Extended Thinking
Claude 3.7 Sonnet and Claude 4 Sonnet support extended thinking. You can use the `extended_thinking` parameter we expose, or OpenAI's `reasoning_effort` parameter (`low`, `medium`, and `high` map to fixed reasoning budget tokens).

If you use the OpenAI client and don't want reasoning blocks in the response, set `include_reasoning` to `false`.
Python Extended Thinking Example
```python
response = client.chat.completions.create(
    model=model,  # Must be Claude 3.7 Sonnet or Claude 4 Sonnet
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"}
    ],
    extra_body={
        "extended_thinking": {
            "enabled": True,
            "budget_tokens": 2000,
            "include_reasoning": False
        }
    }
)
```
Allowing Ignored Parameters
The Managed Inference and Agents Add-on API returns an error for unrecognized parameters. To disable this error, set allow_ignored_params
to true. Ignoring parameters is useful if you’re using an older version of the SDK with parameters that aren’t fully supported by Heroku.
Python Allowing Ignored Parameters Example
```python
response = client.chat.completions.create(
    model=model,  # Add-on plan name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"}
    ],
    extra_body={
        "allow_ignored_params": True
    }
)
```
Limitations for Tool Calling
The `content` field must be populated to use tool calling. If `content` is empty, the API returns a validation error. To prevent empty tool responses, either ensure the tool always returns a response or populate empty `content` fields with default values in your application code.
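One way to guard against this validation error is a small helper that substitutes a default before appending the tool message to the conversation. A minimal sketch; the placeholder text is an arbitrary choice:

```python
from typing import Optional

def tool_message(tool_call_id: str, result: Optional[str]) -> dict:
    """Build a tool message, replacing empty content with a default
    so the API's non-empty content validation passes."""
    content = result if result else "The tool returned no output."
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}

print(tool_message("call_abc", "")["content"])            # The tool returned no output.
print(tool_message("call_abc", "42 degrees")["content"])  # 42 degrees
```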