Heroku's streaming data connectors
Last updated 27 July 2020
This is a Beta feature. Any use of Beta Services is subject to the terms in your Master Subscription Agreement and the Beta Services terms. These terms include provisions that the following types of sensitive Personal Data (including images, sounds or other information containing or revealing such sensitive data) may not be submitted to Data Science Programs, Non-GA Service and Non-GA Software: government-issued identification numbers; financial information (such as credit or debit card numbers, any related security codes or passwords, and bank account numbers); racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, information concerning health or sex life; information related to an individual’s physical or mental health; and information related to the provision or payment of health care.
This article describes how to configure Change Data Capture (CDC) for Heroku Postgres events and stream them to your Apache Kafka on Heroku add-on provisioned in a Private Space or a Shield Private Space. This process involves three high-level steps:
- Creating an app in a Private Space or a Shield Private Space
- Provisioning a Heroku Postgres add-on and an Apache Kafka on Heroku add-on on your new app
- Creating a streaming data connector to enable CDC events from your Postgres database to your Kafka cluster
Heroku App Setup
To begin, you will need to create a Private or Shield Private Space. Once your Space is available, you can create an app in your Space.
$ heroku spaces:create --region virginia --team my-team-name --space myspace
$ heroku spaces:wait --space myspace
$ heroku apps:create --space myspace my-cdc-app
Heroku Add-ons Setup
Next, you will need two data add-ons attached to your app.
Your Postgres add-on will need to be version 10 or higher. Your Kafka add-on will need to be version 2.3 or higher.
$ heroku addons:create heroku-postgresql:private-7 --as DATABASE --app my-cdc-app
$ heroku addons:create heroku-kafka:private-extended-2 --as KAFKA --app my-cdc-app
You can monitor the add-on provisioning progress:
$ heroku addons:wait --app my-cdc-app
Once your add-ons are available, you will need to import your schema and/or data into your Postgres database.
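For example, you could restore an existing dump or load a plain SQL file with the standard Heroku Postgres CLI commands. The dump URL, file name, and app name below are placeholders:

```shell
# Restore a dump from a publicly accessible URL (placeholder URL)
$ heroku pg:backups:restore 'https://example.com/my-dump.dump' DATABASE_URL --app my-cdc-app

# Or load a plain SQL file through psql
$ heroku pg:psql --app my-cdc-app < schema.sql
```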
Heroku’s Streaming Data Connector Setup
Once you have a Private or Shield Private Space App with Heroku Postgres and Apache Kafka on Heroku add-ons configured, you can provision a connector.
First, you will need to install the CLI plugin:
$ heroku plugins:install @heroku-cli/plugin-data-connectors
To create a connector, you will need to gather several pieces of information.
- The name of the Kafka add-on
- The name of the Postgres add-on
- The name(s) of the Postgres tables from which you want to capture events
- (optionally) The name(s) of the columns you wish to exclude from capture events
In order to capture events in your Postgres database, a few requirements must be met:
- The database encoding must be UTF-8
- The table(s) must currently exist
- The table(s) must have a primary key
- The table name(s) must only contain the characters
- The Kafka Formation needs to have direct Zookeeper access disabled
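You can verify some of these requirements from a psql session before creating the connector. This is a sketch, and public.posts is an example table name:

```shell
$ heroku pg:psql --app my-cdc-app
-- the encoding must report UTF8
=> SHOW server_encoding;
-- the table description must list a primary key under "Indexes"
=> \d public.posts
```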
You will want to take care in choosing which tables to capture. A single connector may not be able to keep up with a high volume of events from many tables.
Next, you can create the connector. You will need the names of your Postgres and Kafka add-ons, as well as a list of fully qualified tables you want to include in your database capture events:
$ heroku data:connectors:create \
  --source postgresql-neato-98765 \
  --store kafka-lovely-12345 \
  --table public.posts --table public.users
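If you want to exclude certain columns from capture events, the plugin accepts an exclusion flag at creation time. The flag name and the column below are illustrative; confirm the exact option with heroku data:connectors:create --help:

```shell
# public.users.password is a placeholder fully qualified column name
$ heroku data:connectors:create \
  --source postgresql-neato-98765 \
  --store kafka-lovely-12345 \
  --table public.users \
  --exclude-column public.users.password
```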
Provisioning can take approximately 15-20 minutes to complete. You can monitor the connector provisioning progress:
$ heroku data:connectors:wait gentle-connector-1234
Once your connector is available, you can view the details including newly created Kafka topics:
$ heroku data:connectors:info gentle-connector-1234
=== Data Connector status for gentle_connector_1234
Name:   gentle_connector_1234
Status: available
=== Configuration
Table Name    Topic Name
public.posts  gentle_connector_1234.public.posts
public.users  gentle_connector_1234.public.users
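As a quick sanity check that change events are flowing, you can tail one of the newly created topics with the Apache Kafka on Heroku CLI plugin, using a topic name from the connector info output:

```shell
$ heroku kafka:topics:tail gentle_connector_1234.public.posts --app my-cdc-app
```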
Managing a Connector
Once you have created your connector, there are a few options available for managing it.
Pause or Resume
# to pause
$ heroku data:connectors:pause gentle-connector-1234

# to resume
$ heroku data:connectors:resume gentle-connector-1234
You can modify certain properties associated with your connector via the CLI. These include:
| Property | Possible Values | Default Value | Details |
| --- | --- | --- | --- |
For example, you can update the tombstones.on.delete setting:
$ heroku data:connectors:update gentle-connector-1234 \ --setting tombstones.on.delete=false
We recommend familiarizing yourself with Best Practices when working with connectors.
Destroying a Connector
You can destroy a connector via the CLI.
This will not destroy the Kafka topics used to produce events. You will need to manage their lifecycle independently.
$ heroku data:connectors:destroy gentle-connector-1234
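Because the topics outlive the connector, you can remove them yourself once they are no longer needed. The topic name here is the example from above:

```shell
$ heroku kafka:topics:destroy gentle_connector_1234.public.posts --app my-cdc-app
```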