Best Practices for Heroku Streaming Data Connectors
Last updated May 12, 2023
This article describes best practices for operating your streaming data connector. Apache Kafka Connect and Debezium support this integration, so this article refers to these components when discussing configuration options.
When Things Go Wrong (Debezium)
Points to Note
- Debezium only works with UTF-8 databases. If you’re using a different encoding, change it to UTF-8 before enabling Heroku Streaming Data Connectors. If you change to another encoding after the connector is created, the connector will stop processing events.
- When a new connector is created, it connects to a Postgres database. This process can be expensive and cause load on both the source Postgres database and the target Kafka cluster. Choose the tables and columns to capture accordingly. More details are available in the documentation.
- The “before” data in CDC events is only populated for the columns that are part of the table’s REPLICA IDENTITY. By default, in most situations, this means that only the primary key is reflected in the “before” data. More details are available in the documentation.
- Changing primary keys must be coordinated carefully to avoid issues with schema information. More details are available in the documentation.
- CDC events are delivered at least once. You must build and configure your Kafka consumers to handle redundant CDC events gracefully. Expect more duplicate events when things go wrong. For more information about duplicate events, see Why must consuming applications expect duplicate events? in the Debezium FAQ.
- In UPDATE events, unchanged TOASTed values have a placeholder, __debezium_unavailable_value. If you don’t account for this value, you end up using the placeholder as if it were the real value. Find more details in the documentation.
- Database rows with large amounts of data can’t be produced into Kafka messages if they exceed the maximum message size for the topic (default: 1 MB). These failures are logged internally, but the event is silently dropped and never produced.
- Kafka topics are created with more than 1 partition. As a result, change events don’t have global total ordering when being consumed. Change events for a specific row are totally ordered. For more information about ordering, see How are events for a database organized? in the Debezium FAQ.
- If a connector stops replicating, the replication slot that tracks the connector’s progress prevents WAL from being deleted. If the Postgres database runs out of disk for WAL, the database stops entirely. Find more details in the documentation.
- Failures can occur for many reasons, leading the connector to stop. Some include a network partition, an AWS issue, a bug in Kafka Connect or Debezium, or an issue with our control plane. It’s important that you monitor your database’s replication slots for these conditions.
- In certain failure modes, we remove the replication slot to preserve the operational stability of the database. When conditions have cleared, the connector creates a new replication slot and begins publishing events from that point of recovery. Change events that weren’t streamed to Kafka before the database went down are lost.
- Connectors must not be left in a “paused” state for more than a few hours. Paused connectors prevent WAL from being deleted, which can put the primary database at risk. It’s better to destroy the connector than to leave it paused for a long period.
- Change events that occur while a connector is paused are not guaranteed to make it into Kafka. If a failover happens (due to a system failure or a scheduled maintenance), change events after the connector was paused are lost.
- Deleting a table tombstones the corresponding Kafka topic, but doesn’t destroy it.
- If you deprovision a connector, we don’t deprovision the associated Kafka topics. If you want to remove this data, you must delete these topics yourself using the CLI: heroku kafka:topics:destroy <topic_name> --app <app_name>. Find more details in the documentation.
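The notes above on duplicate events and the __debezium_unavailable_value placeholder both affect how a consumer applies change events. The following is a minimal Python sketch of one way to handle them, not a definitive implementation. It assumes each event has already been decoded into a dict in the Debezium envelope shape (op, before, after, source), that source.lsn increases for successive changes to the same row, and that the primary key column is named id (a hypothetical name). The in-memory dicts stand in for whatever datastore your consumer writes to.

```python
from typing import Any, Dict

# Placeholder that stands in for unchanged TOASTed values in UPDATE events.
UNAVAILABLE = "__debezium_unavailable_value"


def apply_change_event(
    rows: Dict[Any, Dict[str, Any]],
    last_lsn: Dict[Any, int],
    event: Dict[str, Any],
    key_column: str = "id",  # hypothetical primary key column name
) -> bool:
    """Apply one decoded change event to `rows`.

    Returns False if the event was skipped as a duplicate, True otherwise.
    """
    op = event["op"]
    lsn = event["source"]["lsn"]  # assumption: increases per row over time
    # Deletes carry the key in "before"; other operations carry it in "after".
    image = event["before"] if op == "d" else event["after"]
    key = image[key_column]

    # At-least-once delivery: skip anything at or below the last LSN
    # already applied for this row.
    if key in last_lsn and lsn <= last_lsn[key]:
        return False

    if op == "d":
        rows.pop(key, None)
    else:
        merged = dict(rows.get(key, {}))
        for column, value in image.items():
            if value == UNAVAILABLE:
                # Keep the previously stored value (if any) instead of
                # overwriting it with the placeholder.
                continue
            merged[column] = value
        rows[key] = merged

    last_lsn[key] = lsn
    return True
```

Applying the same event twice leaves the stored row unchanged, and a column whose new value is the placeholder keeps whatever value the consumer stored last. If the consumer has never seen the real value, the column stays absent, so downstream code should treat it as unknown.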
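The notes above recommend monitoring your database’s replication slots, because a stopped or paused connector prevents WAL from being deleted. One possible check is sketched below in Python. It assumes a DB-API cursor connected to the primary database (for example, one created with a Postgres driver of your choice) and Postgres 10 or later, where pg_current_wal_lsn() and pg_wal_lsn_diff() are available. The function name and the 1 GiB threshold are illustrative assumptions, not Heroku recommendations.

```python
from typing import Any, List, Tuple

# Reports how many bytes of WAL each replication slot is holding back.
SLOT_LAG_QUERY = """
SELECT slot_name,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
"""


def slots_needing_attention(
    cursor: Any,
    max_retained_bytes: int = 1024 ** 3,  # hypothetical threshold: 1 GiB
) -> List[Tuple[str, bool, Any]]:
    """Return slots that are inactive or retain more WAL than the threshold."""
    cursor.execute(SLOT_LAG_QUERY)
    flagged = []
    for slot_name, active, retained_bytes in cursor.fetchall():
        too_much_wal = (
            retained_bytes is not None and retained_bytes > max_retained_bytes
        )
        if not active or too_much_wal:
            flagged.append((slot_name, active, retained_bytes))
    return flagged
```

Running a check like this on a schedule and alerting on a non-empty result gives early warning before retained WAL threatens the database’s disk. Choose a threshold based on the disk available to your own plan.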