Optimizing Heroku Connect Performance
Last updated August 04, 2021
Sync performance varies widely from one connection to another. The best way to determine performance is testing with a production Salesforce org and a production database plan. This article shows how to optimize Heroku Connect sync speed. To improve the performance of your Connect instance, follow these tips for Heroku Connect, Heroku Postgres, and Salesforce.
This article assumes that you understand Heroku Connect basics. If you’re just getting started with Heroku Connect, see the main Heroku Connect page.
Use the Salesforce Bulk API for large datasets
Heroku Connect uses the Salesforce Bulk API or the Salesforce SOAP API. The Bulk API is faster for larger datasets.
Review the conditions of when each API is used and follow write patterns that use the Bulk API:
- Minimize the amount of synced changes.
- Save only changed records to mapped Heroku Connect tables. Saving unchanged records triggers unnecessary syncs.
- If you expect to update a record many times within a short period, make rapid changes in a staging table and move the record to the mapped table after it is stable. For example, in a Rails app, handle lookups and calculations in
before_savecallbacks rather than
after_save. Then all data is written to the database in one SQL statement.
- Insert only valid data. Failed inserts cause a fallback from using the Bulk API to the SOAP API. Watch out for these messages in your Heroku Connect logs, for example,
BULK write batch to MAPPING_NAME ended due to REASON Will fall back to SOAP.
For more info, see Bulk vs SOAP API.
Choose the best write algorithm for your use case
When Heroku Connect is configured with bi-directional sync, determine which write algorithm, Ordered Writes or Merged Writes, is efficient for your use case.
In general, if you make many rapid changes in succession, Merged Writes is better for performance. Avoid Merged Writes if you have circular dependencies or use Apex triggers or process rules in Salesforce. The Ordered Writes algorithm is a prerequisite for using the Bulk API. It is not possible to perform bulk writes to Salesforce when using the Merged Writes algorithm.
Review the potential pitfalls carefully for each algorithm to choose the best one for your use case.
- Potential Pitfalls for the Ordered Writes Algorithm
- Potential Pitfalls for the Merged Writes Algorithm
Minimize the size of the mapped object
Keep mapped objects small to minimize the size of requests to Salesforce and improve sync speed.
- Choose the minimum number of fields required for your use case. When Heroku Connect pulls data from Salesforce, it reads in the entire record at once. Performance can severely degrade when many fields are mapped.
- Minimize the number of large string and text fields in your mapping. The Salesforce API imposes restrictions on the size of responses or generated bulk results files. Mapped fields that are Long Text Area or Rich Text Area tend to hold large amounts of data. This means that Heroku Connect must make many more network requests to fetch your data than it would without those fields being mapped. The extra time spent querying Salesforce’s API decreases the performance of Heroku Connect for the mapping(s) containing these fields. Note that the performance penalty is only paid on mappings that have several of these large text area fields.
Heroku Connect Diagnostics helps you identify the number of mappings and number of fields per object that’s optimal for your connection. Run
heroku connect:diagnose and check the Number of Fields and Number of Large Text Area Fields diagnostic checks. If there are yellow or red flags for these checks, review your mappings and reduce the number of fields mapped if possible.
Avoid frequent changes to mappings
Adding a new field to a large table fires a bulk job to backfill the corresponding database column with the Salesforce data (unless the new field is empty). The regular sync process from Salesforce to Heroku is paused until the bulk load finishes, which slows down Heroku Connect.
For a table already mapped with a large number of records, data can appear stale while the new fields are synchronized.
Before changing a mapping, ensure that all stakeholders are notified. They must know about changes and the timeframe because updating mappings can result in delays when syncing new data from Salesforce to the database. Your stakeholders include developers, your Salesforce-side team, and other business teams that depend on this data.
Avoid Heroku Connect schema access
Store only tables that are synchronized by Connect in your Connect schema. Extra tables cause Connect to slow down while trying to process and ignore those captured changes.
Changing the definition of a mapped table, such as adding a new constraint, can cause trouble when Salesforce tries to sync or delete the data, and often results in sync issues. See Heroku Connect mapping is stuck in ‘Altering DB schema’ status for possible resolutions to these sync issues.
Limit the use of custom triggers with Heroku Connect. Implementing your own custom triggers on Heroku Connect tables can result in missed updates or create more database load.
Insert only valid data
Salesforce and Heroku Postgres have a similar constraint for failed INSERTs. An insert on Postgres that errors on the Salesforce side can seriously degrade speed because the request is moved from the Bulk API to the SOAP API. Review your logs for errors to correct your data. The Salesforce API documentation includes a full list of error codes and their explanations.
Choose a sufficient Heroku Postgres plan
Use at least a
shield-4 plan for production Connect use cases. Lower plans often experience performance issues because of:
- Insufficient I/O performance on plans lower than
shield-4. Even though some Postgres plans can burst to higher I/O performance when required, inconsistent performance can result. Use at least a
-4plan to provide sufficient I/O performance for Connect without requiring bursting.
- Connection limits. Some Postgres plans smaller than
-4have connection limits of under 500, which can lead to connection depletion.
Upgrade your database plan to scale with your Heroku Connect needs. More resources are used as more rows are synced. For larger Heroku Connect use cases:
- To sync 10-100 million rows, a
-6database is recommended.
- To sync more than 100 million rows to sync, use at least a
The number of rows synced is not the only factor that affects database performance. While these guidelines are suitable for most use cases, also monitor Postgres for performance issues to check if they are right for you. See the Monitoring Heroku Postgres article for further guidance.
Watch for Postgres performance issues
The health of your Postgres database affects Connect sync speed.
These performance issues can have particular effects on Connect:
- Unused indexes. Indexes incur a performance penalty during writes, so remove unused indexes to help ensure the best performance.
- Database bloat. Bloat is space allocated to a relation of dead tuples that has yet to be reclaimed. Investigate tables with bloat ratios higher than 10.
- Excessive connections. For a database with a high number of connections, use a connection pooler like PGBouncer or a separate database.
- Number of schemas. Fewer than 50 schemas is recommended. Too many schemas can affect Postgres performance and the ability to take successful logical backups.
Check issues with pg:diagnose and investigate any checks that come back as yellow or red. Review Monitoring Heroku Postgres and Heroku Postgres Database Tuning for more info on addressing performance issues.
Create a dedicated Salesforce integration user for Connect
Create a separate, dedicated integration user in Salesforce to use with Connect.
Grant the integration user View All Data permissions to eliminate Salesforce permission checks for requested records.
Employ the integration user for Connect only to improve sync performance. Some per-user limits, such as concurrent queries can slow sync speeds.
Minimize locks on Salesforce records
Minimizing locks on Salesforce records boosts Heroku Connect sync speed and reduces sync failures.
Apex Triggers, Process Builder, and Flows/Headless Workflows can result in locked Salesforce records. In these cases, Connect waits for the lock to release, delaying syncs. Use care when locking rows to ensure that you are not introducing deadlocks. Verify that you are using standard deadlock avoidance techniques by accessing tables and rows in the same order from all locations in an application. For more info, see Locking Records.