Best Practices for Heroku Streaming Data Connectors

Last updated April 24, 2024

Table of Contents

  • Useful Posts
  • Points to Note

This article collects best practices for operating your streaming data connector. The integration is built on Apache Kafka Connect and Debezium, so the configuration options discussed here refer to those components.

Useful Posts

When Things Go Wrong (Debezium)

Points to Note

Create

  • Debezium only works with UTF-8 databases. If you’re using a different encoding, change it to UTF-8 before enabling Heroku Streaming Data Connectors. If you change to another encoding after the connector is created, the connector will stop processing events.
  • When a new connector is created, it connects to the source Postgres database. This process can be expensive and put load on both the source Postgres database and the target Kafka cluster. Choose the tables and columns to stream with that cost in mind. More details are available in the documentation.
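
The encoding requirement above can be checked before a connector is created. The following is a minimal sketch, not an official procedure: it assumes you run the standard PostgreSQL statement `SHOW server_encoding` against the source database yourself and pass the result in. The helper name `is_utf8` is hypothetical.

```python
# Standard PostgreSQL statement; run it against the source database
# with whatever client you already use and pass the result to is_utf8().
ENCODING_QUERY = "SHOW server_encoding;"


def is_utf8(server_encoding: str) -> bool:
    """Return True if the reported encoding is UTF-8.

    PostgreSQL reports UTF-8 as 'UTF8'; other spellings are normalized
    here only as a convenience (hypothetical helper, not a Heroku API).
    """
    normalized = server_encoding.strip().upper().replace("-", "").replace("_", "")
    return normalized == "UTF8"
```

If the check fails, change the encoding before enabling the connector, as described above.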

Usage

  • The “before” data in CDC events is populated only for the columns that are part of the table’s REPLICA IDENTITY. By default, in most situations, only the primary key appears in the “before” data. More details are available in the documentation.
  • Changing primary keys must be coordinated carefully to avoid issues with schema information. More details available in the documentation.
  • CDC events are delivered at least once. You must build and configure your Kafka consumers to handle redundant CDC events gracefully. Duplicate events are more likely when things go wrong. For more information about duplicate events, see Why must consuming applications expect duplicate events? in the Debezium FAQ.
  • On UPDATE events, unchanged TOASTed values have a placeholder __debezium_unavailable_value. If you don’t account for this value, you end up using the placeholder as if it were the real value. Find more details in the documentation.
  • Database rows with large amounts of data can’t be produced into Kafka messages if they exceed the maximum message size for the topic (default: 1 MB). These failures are logged internally, but the event is silently dropped and never reaches the topic.
  • Kafka topics are created with more than one partition. As a result, change events have no global total ordering when consumed. Change events for a specific row, however, are totally ordered. For more information about ordering, see How are events for a database organized? in the Debezium FAQ.
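
Two of the points above, duplicate events and the TOAST placeholder, can be handled together in the consumer. The following is a minimal sketch under stated assumptions: each event has already been decoded into a dictionary with the Debezium fields `op`, `before`, `after`, and `source.lsn`; the table has a single primary key column named `id`; and the in-memory dictionaries stand in for whatever store your consumer writes to. The function name `apply_change` is hypothetical. It relies on per-row ordering, so it tracks the last applied log position per key rather than globally.

```python
from typing import Any, Dict

# Placeholder that marks an unchanged TOASTed column on UPDATE events.
UNAVAILABLE = "__debezium_unavailable_value"


def apply_change(
    rows: Dict[Any, Dict[str, Any]],
    last_lsn: Dict[Any, int],
    event: Dict[str, Any],
    key_field: str = "id",  # assumed primary key column name
) -> bool:
    """Apply one change event to `rows`; return False if it was skipped as a duplicate."""
    op = event["op"]
    # Deletes carry the key in "before"; creates and updates carry the row in "after".
    image = event["before"] if op == "d" else event["after"]
    key = image[key_field]
    lsn = event["source"]["lsn"]

    # At-least-once delivery: skip events at or before the last position
    # already applied for this row.
    if key in last_lsn and lsn <= last_lsn[key]:
        return False

    if op == "d":
        rows.pop(key, None)
    else:
        merged = dict(rows.get(key, {}))
        for column, value in image.items():
            if value == UNAVAILABLE:
                # Unchanged TOASTed column: keep the stored value
                # instead of overwriting it with the placeholder.
                continue
            merged[column] = value
        rows[key] = merged

    last_lsn[key] = lsn
    return True
```

Replaying the same update twice leaves the stored row unchanged, and a large text column that arrives as the placeholder keeps its previously stored value. If the consumer has never seen the row, a placeholder column is simply left out, so you may need to read the real value from the database in that case.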

Failure

  • If a connector stops replicating, the replication slot that tracks the connector’s progress prevents WAL from being deleted. If the Postgres database runs out of disk for WAL, the database stops entirely. Find more details in the documentation.
  • A connector can stop for many reasons, including a network partition, an AWS issue, a bug in Kafka Connect or Debezium, or an issue with our control plane. It’s important that you monitor your database’s replication slots for these conditions.
  • In certain failure modes, we remove the replication slot to preserve the operational stability of the database. When conditions have cleared, the connector creates a new replication slot and begins publishing events from that point of recovery. Change events that weren’t streamed to Kafka before the database went down are lost.
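
Monitoring replication slots, as recommended above, can be sketched as a query plus a small check. This is an illustration under assumptions, not a Heroku-provided tool: the query uses the standard PostgreSQL view `pg_replication_slots` and the functions `pg_current_wal_lsn()` and `pg_wal_lsn_diff()` (PostgreSQL 10 and later), and it assumes your database role is allowed to read that view. The `SlotStatus` type, the `slots_needing_attention` function, and the 1 GiB threshold are hypothetical choices to adapt.

```python
from typing import Iterable, List, NamedTuple, Optional

# Run this against the source database; each result row maps onto SlotStatus.
SLOT_QUERY = """
SELECT slot_name,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots;
"""


class SlotStatus(NamedTuple):
    slot_name: str
    active: bool
    retained_bytes: Optional[int]  # None when the slot has no restart_lsn yet


def slots_needing_attention(
    slots: Iterable[SlotStatus],
    max_retained_bytes: int = 1024**3,  # illustrative threshold (1 GiB)
) -> List[str]:
    """Return one warning per problem found: inactive slots and slots holding too much WAL."""
    warnings = []
    for slot in slots:
        if not slot.active:
            warnings.append(f"{slot.slot_name}: slot is inactive")
        if slot.retained_bytes is not None and slot.retained_bytes > max_retained_bytes:
            warnings.append(
                f"{slot.slot_name}: retaining {slot.retained_bytes} bytes of WAL"
            )
    return warnings
```

Running a check like this on a schedule and alerting on any warning gives you time to act before retained WAL fills the disk.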

Paused Connectors

  • Connectors must not be left in a “paused” state for more than a few hours. Paused connectors prevent WAL from being deleted, which can put the primary database at risk. It’s better to destroy the connector than to leave it paused for a long period.
  • Change events that occur while a connector is paused aren’t guaranteed to make it into Kafka. If a failover happens (due to a system failure or scheduled maintenance), change events that occurred after the connector was paused are lost.

Destroy

  • Deleting a table tombstones the corresponding Kafka topic, but doesn’t destroy it.
  • If you deprovision a connector, we don’t deprovision the associated Kafka topics. If you want to remove this data, then you must delete these topics yourself using the CLI: heroku kafka:topics:destroy <topic_name> --app <app_name>. Find more details in the documentation.
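
When a connector leaves several topics behind, the cleanup command above can be generated for each one. The following is a minimal sketch: it only builds the documented `heroku kafka:topics:destroy <topic_name> --app <app_name>` command for topic names you supply, and does not discover topics or run anything itself. The function name `destroy_topic_commands` is hypothetical.

```python
from typing import Iterable, List


def destroy_topic_commands(topics: Iterable[str], app: str) -> List[List[str]]:
    """Build one `heroku kafka:topics:destroy` argument list per topic name."""
    return [
        ["heroku", "kafka:topics:destroy", topic, "--app", app]
        for topic in topics
    ]
```

Review the generated commands before running them, for example by printing each list and then passing it to `subprocess.run(cmd, check=True)`. The CLI may ask for confirmation, because destroying a topic removes its data.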
