LlamaIndex AI Integration

Last updated December 09, 2025

LlamaIndex is a data framework that enables you to build context-augmented large language model (LLM) applications. You can use LlamaIndex for various use cases, including prompting, chatbots, structured data extraction, and agentic workflows.

This integration enables you to use AI models deployed on Heroku’s infrastructure in your LlamaIndex apps.

Installation and Setup

To install the integration, run:

```
pip install llama-index-llms-heroku
```

To set up a Heroku app and model for use with LlamaIndex:

Create an app in Heroku:
```
heroku create example-app
```

Create and attach a chat model to your app:

```
heroku ai:models:create -a example-app claude-3-5-haiku
```

Export the app's config vars as environment variables in your shell:

```
export INFERENCE_KEY=$(heroku config:get INFERENCE_KEY -a example-app)
export INFERENCE_MODEL_ID=$(heroku config:get INFERENCE_MODEL_ID -a example-app)
export INFERENCE_URL=$(heroku config:get INFERENCE_URL -a example-app)
```
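Before initializing the LLM, it can help to confirm that all three variables are actually present, since a failed or empty `heroku config:get` lookup can leave a variable blank. The following is a minimal sketch, not part of the integration: `missing_inference_vars` is a hypothetical helper name, and it checks only the three variable names exported above.

```python
import os
from typing import Mapping, Optional

# The three variables exported in the step above.
REQUIRED_VARS = ("INFERENCE_KEY", "INFERENCE_MODEL_ID", "INFERENCE_URL")


def missing_inference_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; not part of llama-index-llms-heroku.
    """
    source = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not source.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_inference_vars()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All inference variables are set.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise in isolation.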

Using the Integration

Available Models

For a complete list of available models, see Managed Inference and Agents API Model Cards.

Chat Completion Example

```
from llama_index.llms.heroku import Heroku
from llama_index.core.llms import ChatMessage, MessageRole

# Initialize the Heroku LLM
llm = Heroku()

# Create chat messages
messages = [
    ChatMessage(
        role=MessageRole.SYSTEM, content="You are a helpful assistant."
    ),
    ChatMessage(
        role=MessageRole.USER,
        content="What are the most popular house pets in North America?",
    ),
]

# Get response
response = llm.chat(messages)
print(response)
```

Using Environment Variables

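The integration automatically reads INFERENCE_KEY, INFERENCE_URL, and INFERENCE_MODEL_ID from the environment, so you can initialize the LLM without arguments:

```
import os

from llama_index.llms.heroku import Heroku

# Set environment variables (placeholder values; replace with your own)
os.environ["INFERENCE_KEY"] = "your-inference-key"
os.environ["INFERENCE_URL"] = "https://us.inference.heroku.com"
os.environ["INFERENCE_MODEL_ID"] = "claude-3-5-haiku"

# Initialize without parameters; the values are read from the environment
llm = Heroku()
```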

Parameters

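You can also pass the parameters directly to the constructor:

```
import os

from llama_index.llms.heroku import Heroku

# Each value falls back to a placeholder default if the variable is unset
llm = Heroku(
    model=os.getenv("INFERENCE_MODEL_ID", "claude-3-5-haiku"),
    api_key=os.getenv("INFERENCE_KEY", "your-inference-key"),
    inference_url=os.getenv(
        "INFERENCE_URL", "https://us.inference.heroku.com"
    ),
    max_tokens=1024,
)
```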
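If you would rather fail fast than fall back to placeholder values, one option is to gather the constructor arguments in a small function first. This is a sketch under stated assumptions: `heroku_llm_kwargs` is a hypothetical helper, and the only keyword names it uses (`model`, `api_key`, `inference_url`, `max_tokens`) are the ones shown in the example above.

```python
import os
from typing import Any, Mapping, Optional


def heroku_llm_kwargs(
    env: Optional[Mapping[str, str]] = None, max_tokens: int = 1024
) -> dict[str, Any]:
    """Build keyword arguments for Heroku(...) from environment variables.

    Hypothetical helper; raises instead of using placeholder defaults.
    """
    source = os.environ if env is None else env
    # Constructor keyword (as used above) -> environment variable name.
    mapping = {
        "model": "INFERENCE_MODEL_ID",
        "api_key": "INFERENCE_KEY",
        "inference_url": "INFERENCE_URL",
    }
    kwargs: dict[str, Any] = {"max_tokens": max_tokens}
    for keyword, var_name in mapping.items():
        value = source.get(var_name)
        if not value:
            raise KeyError(f"{var_name} is not set")
        kwargs[keyword] = value
    return kwargs
```

With the integration installed, the result could then be passed along as `Heroku(**heroku_llm_kwargs())`.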

Text Completion Example

```
# Reuses the llm instance initialized in the examples above
response = llm.complete("Explain the importance of open source LLMs")
print(response.text)
```
