Python Quick Start Guide for v1-chat-completions API
Last updated January 24, 2025
The Heroku Managed Inference and Agent add-on is currently in pilot. The products offered as part of the pilot aren't intended for production use. They're considered a Beta Service and are subject to the Beta Services terms at https://www.salesforce.com/company/legal/agreements.jsp.
Our Claude chat models (Claude 3.5 Sonnet Latest, Claude 3.5 Sonnet, Claude 3.5 Haiku, and Claude 3.0 Haiku) generate conversational completions for input messages. This guide describes how to use the v1-chat-completions API with Python.
Prerequisites
Before making requests, provision access to the model of your choice.
If it’s not already installed, install the Heroku CLI. Then install the Heroku AI plugin:
heroku plugins:install @heroku/plugin-ai
Attach a chat model to one of your apps:
# If you don't have an app yet, you can create one with:
heroku create $APP_NAME # specify the name you want for your app
# (or skip this step to use an existing app you have)

# Create and attach one of our chat models to your app, $APP_NAME:
heroku ai:models:create -a $APP_NAME claude-3-5-sonnet --as INFERENCE
# OR
heroku ai:models:create -a $APP_NAME claude-3-haiku --as INFERENCE
Install the necessary requests package:

pip install requests
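When you attach a model, the add-on sets config vars on your app. The example script below reads them from your local environment; one way to export them, mirroring the command shown in the script's assertion message, is:

```shell
# Export the add-on's config vars into your local shell
# (replace $APP_NAME with the name of your app):
export INFERENCE_URL=$(heroku config:get -a $APP_NAME INFERENCE_URL)
export INFERENCE_KEY=$(heroku config:get -a $APP_NAME INFERENCE_KEY)
export INFERENCE_MODEL_ID=$(heroku config:get -a $APP_NAME INFERENCE_MODEL_ID)
```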
Python Example Code
import requests
import json
import os

# Global variables for API endpoint, authorization key, and model ID from Heroku config variables
ENV_VARS = {
    "INFERENCE_URL": None,
    "INFERENCE_KEY": None,
    "INFERENCE_MODEL_ID": None
}

# Assert the existence of required environment variables, with helpful messages if they're missing.
for env_var in ENV_VARS.keys():
    value = os.environ.get(env_var)
    assert value is not None, (
        f"Environment variable '{env_var}' is missing. Set it using:\n"
        f"export {env_var}=$(heroku config:get -a $APP_NAME {env_var})"
    )
    ENV_VARS[env_var] = value


def parse_chat_output(response):
    """
    Parses and prints the API response for the chat completion request.

    Parameters:
    - response (requests.Response): The response object from the API call.
    """
    if response.status_code == 200:
        result = response.json()
        print("Chat Completion:", result["choices"][0]["message"]["content"])
    else:
        print(f"Request failed: {response.status_code}, {response.text}")


def generate_chat_completion(payload):
    """
    Generates a chat completion using the attached Claude chat model.

    Parameters:
    - payload (dict): dictionary containing parameters for the chat completion request

    Returns:
    - None; prints the generated chat completion.
    """
    # Set headers using the global API key
    headers = {
        "Authorization": f"Bearer {ENV_VARS['INFERENCE_KEY']}",
        "Content-Type": "application/json"
    }
    endpoint_url = ENV_VARS["INFERENCE_URL"] + "/v1/chat/completions"
    response = requests.post(endpoint_url, headers=headers, data=json.dumps(payload))
    parse_chat_output(response=response)


# Example payload
payload = {
    "model": ENV_VARS["INFERENCE_MODEL_ID"],
    "messages": [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi there! How can I assist you today?"},
        {"role": "user", "content": "Why is Heroku so cool?"}
    ],
    "temperature": 0.5,
    "max_tokens": 100,
    "stream": False
}

# Generate a chat completion with the given payload
generate_chat_completion(payload)
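If you send many requests, assembling the payload dictionary by hand gets repetitive. As a small sketch, you could factor payload construction into a helper; `build_payload` below is a hypothetical convenience function for this guide, not part of the Heroku API or plugin:

```python
def build_payload(messages, model, temperature=0.5, max_tokens=100, stream=False):
    """Assemble a v1/chat/completions request body.

    `messages` is a list of (role, content) tuples, e.g. [("user", "Hello!")].
    """
    return {
        "model": model,
        "messages": [{"role": role, "content": content} for role, content in messages],
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
    }


# The same conversation as the example payload above, built with the helper.
# In the full script you'd pass model=ENV_VARS["INFERENCE_MODEL_ID"] instead
# of a hard-coded model name.
payload = build_payload(
    [
        ("user", "Hello!"),
        ("assistant", "Hi there! How can I assist you today?"),
        ("user", "Why is Heroku so cool?"),
    ],
    model="claude-3-5-sonnet",
)
```

The resulting dictionary can then be passed to `generate_chat_completion(payload)` exactly as in the example above.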