Last updated December 02, 2025
Amazon Nova Lite is a fast, cost-effective multimodal large language model (LLM) that can process text, image, and video inputs.
- Model ID: nova-lite
- Regions: us, eu
When to Use This Model
Amazon Nova Lite is optimized for high-throughput tasks and supports a variety of common use cases, including rapid text generation, summarization, and copywriting.
Usage
Amazon Nova Lite follows our /v1/chat/completions API schema.
To provision access to the model, attach nova-lite to your app example-app:
heroku ai:models:create -a example-app nova-lite
You can invoke nova-lite in a variety of ways:
- Heroku CLI ai plugin (heroku ai:models:call)
- curl
- Python
- Ruby
- JavaScript
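For example, here's a minimal Python sketch that posts to the /v1/chat/completions endpoint with the requests library (an assumption; the article doesn't prescribe a client, and any HTTP library works). It reads the INFERENCE_URL, INFERENCE_KEY, and INFERENCE_MODEL_ID config vars that attaching the model sets on your app:

# Minimal sketch of calling nova-lite from Python using requests
# (not shown in this article; any HTTP client works). It assumes the
# INFERENCE_* config vars set when the model is attached to the app.
import os

import requests

INFERENCE_URL = os.environ["INFERENCE_URL"]
INFERENCE_KEY = os.environ["INFERENCE_KEY"]
INFERENCE_MODEL_ID = os.environ["INFERENCE_MODEL_ID"]

# Ask for a short completion and print the assistant's reply.
response = requests.post(
    f"{INFERENCE_URL}/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {INFERENCE_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": INFERENCE_MODEL_ID,
        "messages": [
            {"role": "user", "content": "Write a two-sentence product description for a reusable water bottle."}
        ],
        "max_tokens": 100,
    },
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])

The same request shape works from Ruby or JavaScript; only the HTTP client changes.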
Rate Limits
- Maximum requests per minute: 150
- Maximum tokens per minute: 800,000
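If you exceed these limits, requests are typically rejected with an HTTP 429 response (an assumption based on common chat-completions behavior; this page doesn't document the exact error). A small retry-with-backoff wrapper, sketched below in Python, is one way to smooth over bursts:

# Sketch of retrying on rate-limit errors, assuming the endpoint returns
# HTTP 429 when the per-minute limits above are exceeded (standard for
# chat-completions-style APIs, but not specified on this page).
import os
import time

import requests

def chat(messages, max_retries=5):
    url = os.environ["INFERENCE_URL"] + "/v1/chat/completions"
    headers = {"Authorization": "Bearer " + os.environ["INFERENCE_KEY"]}
    payload = {"model": os.environ["INFERENCE_MODEL_ID"], "messages": messages}
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Back off exponentially before retrying: 1s, 2s, 4s, ...
        time.sleep(2 ** attempt)
    raise RuntimeError("Still rate limited after retries")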
Prompt Caching
Prompt caching is supported for system prompts, but not for tools. Prompt caching requires a minimum of 1,000 tokens.
Example curl Request
Get started quickly with an example request:
export INFERENCE_MODEL_ID=$(heroku config:get -a example-app INFERENCE_MODEL_ID)
export INFERENCE_KEY=$(heroku config:get -a example-app INFERENCE_KEY)
export INFERENCE_URL=$(heroku config:get -a example-app INFERENCE_URL)
curl $INFERENCE_URL/v1/chat/completions \
  -H "Authorization: Bearer $INFERENCE_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "$INFERENCE_MODEL_ID",
  "messages": [
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there! How can I assist you today?" },
    { "role": "user", "content": "What's the weather like in Portland, Oregon right now?" }
  ],
  "temperature": 0.5,
  "max_tokens": 100,
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Fetches the current weather for a given city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "The name of the city to get weather for."
            }
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "top_p": 0.9
}
EOF
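Because the request includes a get_weather tool with "tool_choice": "auto", the model can answer with a tool_calls entry on the assistant message instead of text. The Python sketch below assumes the standard chat completions tool-calling schema and uses a hypothetical lookup_weather helper in place of a real weather service; it shows the full round trip: run the requested tool, append its result as a tool message, and ask the model to finish the answer.

# Sketch of a tool-calling round trip, assuming the standard chat
# completions tool-calling schema that /v1/chat/completions follows.
# lookup_weather is a hypothetical stand-in for a real weather lookup.
import json
import os

import requests

URL = os.environ["INFERENCE_URL"] + "/v1/chat/completions"
HEADERS = {
    "Authorization": "Bearer " + os.environ["INFERENCE_KEY"],
    "Content-Type": "application/json",
}
MODEL = os.environ["INFERENCE_MODEL_ID"]

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Fetches the current weather for a given city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def lookup_weather(city):
    # Hypothetical helper; replace with a real weather API call.
    return json.dumps({"city": city, "conditions": "light rain", "temp_f": 52})

def chat(messages):
    resp = requests.post(URL, headers=HEADERS, json={
        "model": MODEL, "messages": messages, "tools": TOOLS, "tool_choice": "auto",
    })
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]

messages = [{"role": "user", "content": "What's the weather like in Portland, Oregon right now?"}]
reply = chat(messages)

if reply.get("tool_calls"):
    call = reply["tool_calls"][0]
    args = json.loads(call["function"]["arguments"])
    # Run the requested tool, then send its result back as a "tool" message.
    messages.append(reply)
    messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": lookup_weather(args["city"]),
    })
    reply = chat(messages)

print(reply["content"])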