Heroku Connect FAQ and Best Practices
Last updated 26 March 2019
Table of Contents
- Which Salesforce objects are supported?
- Why isn’t Connect allowing me to map an Object or Field?
- What does resuming the connection do?
- What does reloading a mapping do?
- What does aborting a mapping do?
- Why were rows still synchronized after I paused my connection?
- Can I synchronize a subset of data from Salesforce?
- Why do several large text fields cause synchronization performance problems?
- Can I use Sharing Rules to restrict record visibility?
- Why aren’t all of my Events or Tasks being synchronized?
- Why is there a difference between the Salesforce and database row counts for my mapping?
- Why are some objects grayed out on the 'Create mapping’ page?
- What happens to my data if I delete my application?
- How do I change the recipients of notifications?
- Can Heroku Connect sync images from Salesforce?
- How can I get my Salesforce assignment rules to run for Leads and Cases?
- I inserted or updated a record in my database. Why didn’t my change appear in Salesforce?
- Can I retry records that failed to write to Salesforce?
- How should I import large quantities of data into read/write mappings?
- Why does some of my data in Postgres not match what I wrote to Salesforce?
- What kind of sync throughput performance can I expect?
- How can I preview new features?
- Does Heroku Connect have an API?
- How can I ensure my own custom Postgres triggers work well with Connect?
- Is the use of Amazon RDS with Heroku Connect supported?
- I can’t find certain records by their 15-character Salesforce ID
- How many Postgres connections does Heroku Connect require?
- Can I create more than one Heroku Connect add-on per app/database?
- Is Heroku Connect compatible with Shield Private Spaces and Shield Postgres?
- Can I store tables in my Connect schema that aren’t synchronized with Salesforce?
Which Salesforce objects are supported?
The Supported Salesforce Standard Objects page lists the standard objects that Heroku Connect supports, along with noted limitations, for a specific version of Salesforce.
Why isn’t Connect allowing me to map an Object or Field?
The most common reason for Connect not showing specific Objects or Fields upon creating or editing a Mapping, is insufficient permissions for the user used to authorize the connection. Check the permissions for that user to see if this is the case. If it is, you will need to modify those permissions in Salesforce to resolve this.
What does resuming the connection do?
Resuming a connection is a non-destructive action that allows Heroku Connect to recover from unexpected errors. Processes are restarted and will resume synchronization from the time of the last successful poll.
To resume a connection navigate to the Settings tab, Manage Connection and click the Resume button.
What does reloading a mapping do?
Reloading a mapping is a destructive action that cannot be undone.
Reloading a mapping truncates the mapped table in your Postgres database, removing all existing data, and then reloads all records from Salesforce. The process used to reload the data is the same as that used when a mapping is initially populated and will make use of the SOAP or Bulk API depending on the number of records to be reloaded. It’s important to note that this operation will result in assignment of new values for the ‘ID’ column.
For mappings with a large number of records you should consider putting your application into maintenance mode while the process completes.
If a mapping has entered a
SYSTEM_ERROR state you should try to recover by resuming the connection rather than reloading.
What does aborting a mapping do?
Aborting a mapping will cancel the currently-running operation on that mapping, allowing for other actions to take place. This can be useful if an operation is running longer than expected, or was started accidentally. The result of an aborted operation can depend on what operation was canceled.
- For routine sync operations, Heroku Connect keeps metadata about its current progress, which allows it to recover well from an aborted poll.
- An initial load that’s small enough to not engage the Bulk API will also be able to pick up syncing data from where the initial load left off.
- If an initial load is large enough to engage the Bulk API, the order of records added to the database isn’t guaranteed, so canceling it will leave the mapping in an ABORTED state. There’s no way to automatically recover from this state, so the only options will be to reload the mapping from the beginning or delete it.
- Changes to your database schema also expect to be completed all together, so if you abort an operation to add, remove or change a column, the mapping will also end in an ABORTED state, with the same options for recovery.
Why were rows still synchronized after I paused my connection?
When you pause synchronization Heroku Connect will finish processing any active synchronization tasks. Any further changes made either in your database or in Salesforce after you pause will not be synchronized until you resume your connection.
Can I synchronize a subset of data from Salesforce?
Heroku Connect does not provide a mechanism for partially syncing data as doing so can cause issues when trying to keep track of which records should or shouldn’t be synchronized to your database, leading to mappings that can drift out of sync with Salesforce.
Why do several large text fields cause synchronization performance problems?
If you have a number of fields mapped that are Long Text Area fields or Rich Text Area fields, these tend to hold large amounts of data. The Salesforce API imposes restrictions on the size of responses or generated bulk results files. This means that Heroku Connect will need to make many more network requests to fetch your data than it would without those fields being mapped. The extra time spent querying Salesforce’s API will decrease the performance of Heroku Connect for the mapping(s) containing these fields.
The performance penalty is only paid on mappings that have several of these large text area fields.
Can I use Sharing Rules to restrict record visibility?
You may be considering Sharing Rules as a workaround to ensure Connect can only synchronize a subset of data. However, this approach is unsupported as it can cause mappings to become out of sync and can have an adverse impact on performance.
Sharing Rules do not provide any form of notification mechanism when records move in or out of scope, potentially causing them to be orphaned either in your database or Salesforce. For example, if Heroku Connect has synced a record that later moves out of the scope of the Sharing Rule, Connect will no longer have access to it via the Salesforce APIs, but it will still exist in the database.
Heroku Connect performs best with “View All” permissions on each mapped object as this removes the need for Salesforce APIs to carry out permissions checks on each record being accessed. Removing this permission from the integration user in order to use Sharing Rules can have drastic performance implications when querying and updating data in your Salesforce organization.
Why aren’t all of my Events or Tasks being synchronized?
Event and Task objects can be archived by Salesforce resulting in the
IsArchived flag being set. Unfortunately the Salesforce Bulk API does not allow archived Events and Tasks to be retrieved, which in turn means Heroku Connect cannot synchronize them.
Why is there a difference between the Salesforce and database row counts for my mapping?
Row counts can differ between Salesforce and your database for a number of reasons, including:
- Errors reading or writing data: for example due to validation errors in Salesforce or constraints in your database.
- Records are no longer visible to Heroku Connect: for example archived Events and Tasks.
- The Salesforce row count will sometimes be an estimate: to improve query performance for mapped objects with a high number of records (greater than one million), Heroku Connect will use a Salesforce API that provides row count estimates rather than the exact number.
- The database row count will sometimes be an estimate: to improve query performance for mapped tables with a high number of records (in the order of several million) Heroku Connect will not perform a full row count after every sync operation but will instead use an estimation method that can be calculated very quickly.
- Records changed/created up to 10 minutes before an initial load/reload or created after this time but before the first poll will not be accounted for to ensure that changes in your Salesforce Org that are in-flight at the time of the initial load/reload are accounted for. These changes will be picked up on the next Salesforce poll operation.
Why are some objects grayed out on the 'Create mapping’ page?
The object may not be supported by Heroku Connect (due to Salesforce API limitations), or the object has recently been removed and the removal operation is still underway.
What happens to my data if I delete my application?
Heroku Connect is an application-specific add-on (it cannot be shared): this means that when you delete the application, the add-on and the associated data from the database will also be deleted. If the Heroku Postgres database is attached to other applications then the database itself will not be deleted, but the mapped tables used by Connect will be removed.
How do I change the recipients of notifications?
In the Heroku Connect dashboard navigate to the Settings tab, Manage Connection and check or uncheck the Notifications Enabled box next to the relevant users in the Access list. For more information see Errors with associated notifications.
Can Heroku Connect sync images from Salesforce?
Heroku Connect does not currently support images or other types of binary data. An alternative approach is to store images in Amazon S3 and sync a simple text field containing a URI for the image between Salesforce and your database.
How can I get my Salesforce assignment rules to run for Leads and Cases?
By default, when Heroku Connect writes data to Salesforce, assignment rules aren’t run. If you’d like to enable your default assignment rules for Leads and Cases, please enter a support ticket. If you have multiple connections, be sure to mention them so that we can enable the setting for each one.
I inserted or updated a record in my database. Why didn’t my change appear in Salesforce?
The most common error that occurs when pushing updates from your database to Salesforce is insufficient write permissions for the target Salesforce org.
You can find more details by looking at the record in the Explorer or the
_trigger_log table in your database. To find errors in the
_trigger_log, you can run this SQL from
SELECT table_name, record_id, action, sf_message from salesforce._trigger_log where state = 'FAILED';
Can I retry records that failed to write to Salesforce?
If an error occurs when synchronizing data to Salesforce the relevant entries in the
_trigger_log system table will be set to a
Once you have investigated the cause of the error it is possible to force Heroku Connect to reprocess these updates by resetting the
state field to
For example to retry all failed entries for a mapping named
Contact use the following query:
UPDATE salesforce._trigger_log SET state = 'NEW' WHERE state = 'FAILED' AND table_name = 'contact';
The above example uses the default Postgres schema name of
salesforce - you will need to change this if you are using a different schema. The
table_name field in the
_trigger_log table does not include the schema name.
Trigger log entries older than 24 hours are moved to a
_trigger_log_archive table. If you need to resubmit entries that have been moved to the archive, you can copy them back to the
-- Set the records you want to retry to a state of 'REQUEUE' UPDATE salesforce._trigger_log_archive SET state='REQUEUE' WHERE state='NEW'; -- or some other criteria like state='FAILED' and table_name = 'account' and sf_message like '%my error message%' -- Copy them to the trigger log table INSERT INTO salesforce._trigger_log (created_at, updated_at, state, txid, table_name, action, record_id, values, old, sfid) SELECT now(), now(), 'NEW', txid, table_name, action, record_id, values, old, sfid FROM salesforce._trigger_log_archive where state='REQUEUE'; -- Update the records still in the archive UPDATE salesforce._trigger_log_archive SET state='REQUEUED' WHERE state='REQUEUE';
Heroku Connect periodically runs a 'sweeper’ task to find old
_trigger_log entries with a
NEW state and reprocesses them: this process currently runs every 20 minutes.
An alternative approach for records that have never been successfully written to Salesforce is to query for records with no Salesforce ID and a
FAILED state, for example:
SELECT * FROM salesforce.contact WHERE sfid IS NULL AND _hc_lastop='FAILED';
Updating a field in the mapped table will generate a new entry in the
_trigger_log to insert the record into Salesforce.
How should I import large quantities of data into read/write mappings?
Connect is able to use the Salesforce Bulk API to write changes to Salesforce provided a small set of requirements are met. To learn more about how and when Connect leverages the bulk API, see this article in Dev Center. Another option is to import the data directly into Salesforce using a tool such as Data Loader.
Why does some of my data in Postgres not match what I wrote to Salesforce?
There are a few scenarios where data in Postgres might not exactly match what was written to Salesforce:
- Salesforce trims extra spaces from the start and end of any strings that are entered. This is done before Heroku Connect ever gets the value, and can be seen if you enter a value surrounded by spaces directly in the Salesforce UI and immediately view the value afterward.
- If a string value is completely empty, the Salesforce API will send it as NULL, and therefore that’s how it will be stored in Postgres. This is a feature of the Salesforce API, which Heroku Connect has no control over.
- Number values are stored in Postgres as a
doublecolumn, which has 15-digit precision. Salesforce values that exceed this limitation will still sync via Heroku Connect, but with reduced precision. This is particularly noticeable with extremely large values.
What kind of sync throughput performance can I expect?
Sync performance can vary widely from one connection to another. The best way to determine performance is to test with a production Salesforce org and a production database plan. Specifically, a few factors can affect overall performance:
- Object size can have a big impact on sync performance. Mapping many columns on a particular object causes Salesforce to sync those objects in smaller batches, increasing sync time. To improve performance, try removing columns from your mapping that you don’t need in your database.
- High volumes of changes on either side (Salesforce or database) can also slow synchronization. Connect has to synchronize the entire contents of an object, even if only a small amount of data has changed. For example, if you have a single frequently toggled field on a very large object, try moving that field to a different table to reduce the volume of data that Connect needs to sync.
- Using an ORM that 'touches’ or saves nonexistent changes to records can also cause slowdowns. This will cause a large number of records in your trigger log with a
IGNORED, which slows down the sync process.
- Overall database load can contribute to slower synchronization. Make sure to select a database plan that supports your production traffic alongside Connect’s synchronization queries.
How can I preview new features?
To preview new features before they are GA please contact your CSA or file a support ticket.
Does Heroku Connect have an API?
How can I ensure my own custom Postgres triggers work well with Connect?
Detailed information about using triggers with Heroku Connect can be found in the Using Custom Trigger With Heroku Connect section of the docs.
Is the use of Amazon RDS with Heroku Connect supported?
The use of Amazon RDS with Heroku Connect is not currently supported. Heroku Connect is designed and optimized to work with Heroku Postgres so as to provide the best possible experience across the full application development lifecycle.
I can’t find certain records by their 15-character Salesforce ID
Heroku Connect uses 18 character Salesforce IDs (or
sfids). You may occasionally encounter Salesforce IDs outside of Heroku Connect that are 15 characters long. If you need to search for the corresponding record in your database, use a wildcard search, like this:
SELECT * FROM salesforce.contact WHERE sfid LIKE '0030P00001yr8Fq%';
How many Postgres connections does Heroku Connect require?
Heroku Connect uses 1 connection per mapping and another connection if any of the mappings are in read-write mode. For example, if you have a connection with 20 mappings, 5 of which are in read-write mode, then Heroku Connect will likely use 21 connections to your Postgres database at once.
Can I create more than one Heroku Connect add-on per app/database?
You can have more than one Heroku Connect add-on (a.k.a, connection) per app. Each connection, however, must use a different Postgres schema.
We typically find that Heroku Postgres performance begins degrades around the point when there 50 schema in one database.
Is Heroku Connect compatible with Shield Private Spaces and Shield Postgres?
Yes, there is a Shield Connect plan that meets HIPAA data processor requirements. To provision this plan, you must have both a Shield Private Space and Shield Postgres provisioned and a contractual entitlement in place. To enable access to Shield Connect, please first work with your account representative.
You can choose the Shield plan when adding a Heroku Connect add-on via the Heroku Dashboard. You can also provision it via the CLI:
heroku addons:create herokuconnect:shield -a your_app_name
Note that the standard Heroku Connect plan cannot be provisioned inside of a Shield Private Space.
Can I store tables in my Connect schema that aren’t synchronized with Salesforce?
No. It is not advised to store tables that are not synchronized by Connect in your Connect schema. These tables can pollute your
_trigger_log and cause Connect to slow down while trying to process and ignore those captured changes. Further, if you’re developing against a sandbox organization, you may have to recreate the connection after a sandbox refresh and lose those tables.
If you need an easy way to access tables across multiple schema, you should look into PostgreSQL’s
search_path functionality for your application.