Heroku Postgres Performance Analytics
Last updated March 27, 2024
Performance analytics is the visibility suite for Heroku Postgres. It enables you to monitor the performance of your database and diagnose potential problems. It consists of several components:
Expensive Queries
The leading cause of poor database performance is unoptimized queries. The list of your most expensive queries, available through data.heroku.com, helps to identify and understand the queries that take the most time in your database. For more information, see the Expensive Queries article.
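If you prefer the command line, the heroku-pg-extras plugin (installed separately) also provides a pg:outliers command that lists queries by total execution time using the pg_stat_statements extension. A minimal sketch, assuming the plugin is installed and using the sushi app from the examples below:
$ heroku plugins:install heroku-pg-extras
$ heroku pg:outliers --app sushi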
Logging
If your application or framework emits logs on database access, you can retrieve them through Heroku’s log-stream:
$ heroku logs -t
To see logs from the database service itself, use heroku logs with the -p postgres process-type option to show only the logs from Postgres:
$ heroku logs -p postgres -t
To read more about Heroku Postgres log statements, see Understanding Heroku Postgres Log Statements and Common Errors.
pg:diagnose
pg:diagnose performs a number of useful health and diagnostic checks that help analyze and optimize database performance. It produces a report that you can share with teammates or Heroku Support.
Carefully consider the impact to your database and application before making any changes.
pg:diagnose isn't supported on shield plans.
$ heroku pg:diagnose --app sushi
Report 1234abc… for sushi::HEROKU_POSTGRESQL_MAROON_URL
available for one month after creation on 2014-07-03 21:29:40.868968+00
GREEN: Connection Count
GREEN: Long Queries
GREEN: Long Transactions
GREEN: Idle in Transaction
GREEN: Indexes
GREEN: Bloat
GREEN: Hit Rate
GREEN: Blocking Queries
GREEN: Sequences
GREEN: Unlogged Tables
GREEN: Table Transaction ID Wraparound
GREEN: Database Transaction ID Wraparound
GREEN: Schema Count
GREEN: Load
GREEN: CPU Burst
Check: Connection Count
Each Postgres connection requires memory, and database plans have a limit on the number of connections they can accept. If you’re using too many connections, consider using a connection pooler such as PgBouncer or migrating to a larger plan with more RAM.
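To see for yourself how many connections are open and what they're doing, you can query the standard pg_stat_activity view from a psql session (heroku pg:psql); this is only a quick sketch:
SELECT state, count(*) AS connections
FROM pg_stat_activity
GROUP BY state
ORDER BY connections DESC;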
Checks: Long Running Queries, Long Transactions, Idle in Transaction
Long-running queries and transactions can prevent autovacuum from cleaning up dead rows, leading to bloat, and can cause followers to lag behind. The reporting threshold for these queries and transactions is 1 minute (60 seconds).
They also create locks on your data, which can prevent other transactions from running. Consider killing long-running queries with pg:kill.
Backends that are idle in transaction are backends waiting on the client to finish the transaction, either through COMMIT or ROLLBACK. Clients that improperly disconnect can leave backends in this state. If left open, you can terminate them with pg:kill --force.
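For example, you can list currently running queries with pg:ps and then terminate one by its pid (the pid shown here is illustrative):
$ heroku pg:ps --app sushi
$ heroku pg:kill 1234 --app sushi
$ heroku pg:kill 1234 --force --app sushi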
Check: Indexes
The Indexes check includes three classes of indexes.
Never Used Indexes haven’t been used (since the last manual database statistics refresh). These indexes are typically safe to drop, unless they are in use on a follower.
Low Scans, High Writes indexes are used, but infrequently relative to their write volume. Indexes are updated on every write, so they're especially costly on high-write tables. Weigh the cost of slower writes against the performance improvements that these indexes provide.
Seldom Used Large Indexes aren't used often, but take up significant space both on disk and in cache (RAM). These indexes can still be important to your application, for example if they're used by periodic jobs or infrequent traffic patterns.
Index usage is only tracked on the database receiving the query. If you use followers for reads, this check doesn’t account for usage made against the follower and is likely inaccurate.
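One way to inspect index usage yourself is through Postgres's statistics views from psql; this sketch lists indexes that haven't been scanned since statistics were last reset, largest first:
SELECT schemaname, relname, indexrelname
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;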
Check: Bloat
Because Postgres uses MVCC, old versions of updated or deleted rows are made invisible rather than modified in place. Under normal operation, an autovacuum process asynchronously cleans these up. Sometimes, however, it can't keep up, and some tables become bloated. High bloat can slow down queries, waste space, and even increase load as the database spends more time looking through dead rows.
You can manually vacuum a table with the VACUUM (VERBOSE, ANALYZE); command in psql. If bloat occurs frequently, you can make autovacuum more aggressive.
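One way to make autovacuum more aggressive for a specific, frequently bloating table is to lower its vacuum scale factor with a per-table storage parameter; the table name and value here are only illustrative:
ALTER TABLE products SET (autovacuum_vacuum_scale_factor = 0.01);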
The bloat estimation calculation isn't always accurate for tables with columns that don't have column statistics, such as json columns.
Check: Hit Rate
Checks the overall index hit rate, the overall cache hit rate, and the individual index hit rate per table. Databases with lower cache hit rates perform significantly worse because they have to hit disk instead of reading from memory. Consider migrating to a larger plan for low cache hit rates, and adding appropriate indexes for low index hit rates.
The overall cache hit rate is calculated as a ratio of table data blocks fetched from the Postgres buffer cache against the sum of cached blocks and uncached blocks read from disk. On larger plans, the cache hit ratio can be lower but performance remains constant, as the remainder of the data is cached in memory by the OS rather than Postgres.
The overall index hit rate is calculated as a ratio of index blocks fetched from the Postgres buffer cache against the sum of cached indexed blocks and uncached index blocks read from disk. On larger plans, the index hit ratio can be lower, but performance remains constant because the remainder of the index data is cached in memory by the OS rather than Postgres.
The individual index hit rate per table is calculated as a ratio of index scans against a table versus the sum of sequential scans and index scans against the table.
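If you want to compute the overall cache hit rate yourself, the calculation described above maps onto the standard statistics views; a sketch from psql (the index version uses pg_statio_user_indexes and its idx_blks_* columns):
SELECT sum(heap_blks_hit) / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS cache_hit_rate
FROM pg_statio_user_tables;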
Check: Blocking Queries
Some queries can take locks that block other queries from running. Normally these locks are acquired and released quickly and don't cause any issues. In pathological situations, however, some queries can take locks that cause significant problems if held too long. Consider killing the query with pg:kill.
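To see which backends are blocked and which pids are blocking them, you can combine pg_stat_activity with the pg_blocking_pids() function in psql; a sketch using only standard Postgres views and functions:
SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;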
Check: Sequences
This check looks at 32-bit integer (aka int4) columns that have associated sequences, and reports columns that are getting close to the maximum value for 32-bit ints. Migrate reported columns to 64-bit bigint (aka int8) columns to avoid overflow. An example of such a migration is alter table products alter column id type bigint;. Changing the column type can be an expensive operation, especially on large tables, so make sure you have a migration plan.
Aside from composite indexes, you don't need to prefer integer columns over bigint columns on Heroku Postgres for alignment on 64-bit systems.
This check is skipped if there are more than 100 integer (int4) columns.
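To check how much headroom a particular sequence has before the 32-bit limit, you can compare its last value with the int4 maximum; the sequence name products_id_seq is only an illustration:
SELECT last_value, 2147483647 - last_value AS remaining
FROM products_id_seq;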
Check: Unlogged Tables
This check looks for tables that have been created as UNLOGGED. Unlogged tables can be used for fast data loading, but they aren't written to the write-ahead log (WAL). Data written to these tables isn't covered by Continuous Protection, and it also isn't replicated to high-availability standby or follower databases, which leads to data loss when maintenance takes place. To prevent data loss, move the data to logged tables, in batches if the tables are large. If the tables are small, use ALTER TABLE <table> SET LOGGED to change unlogged tables to logged tables.
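You can also list unlogged tables yourself from psql; in the system catalog, relpersistence = 'u' marks unlogged relations:
SELECT relname
FROM pg_class
WHERE relpersistence = 'u' AND relkind = 'r';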
Check: Table Transaction ID Wraparound, Database Transaction ID Wraparound
These checks determine how close individual tables, or the database as a whole, are to a transaction ID wraparound. Transaction ID wraparound is a rare scenario caused by autovacuum operations being unable to keep up on frequently updated tables, putting the table or database in danger of its transaction IDs wrapping around and resulting in data loss. To prevent transaction ID wraparound, Postgres stops new writes cluster-wide until the issue is resolved, impacting availability. These checks return the table and database names if over 50% of the transaction ID space has been used.
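To see how far along each database is toward wraparound, you can check the age of its oldest unfrozen transaction ID from psql; this is a standard catalog query, and the hard limit is roughly 2 billion:
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;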
Check: Schema Count
This check counts the number of schemas in the database. It reports a yellow warning if there are more than 19 schemas, and a red warning if there are more than 50. A large number of schemas can impact Postgres performance and the ability to take successful logical backups. We recommend you maintain no more than 50 schemas.
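To count schemas yourself from psql, one sketch is to query the system catalog, excluding Postgres's built-in schemas:
SELECT count(*) AS schema_count
FROM pg_namespace
WHERE nspname NOT LIKE 'pg_%' AND nspname <> 'information_schema';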
Check: Load
There are many reasons load can be high on a database: bloat, CPU intensive queries, index building, and too much activity on the database. Review your access patterns, and consider migrating to a larger plan with a more powerful processor.
Check: CPU Burst
CPU Burst checks only apply to standard-0, premium-0, private-0, and shield-0 plans.
Heroku Postgres plans that have Burstable vCPU Performance use a “bucket” of CPU resources when utilization is beyond the baseline of ~20%. This check returns:
GREEN: there are plenty of burstable resources available; no action is needed.
YELLOW: resource utilization is above the baseline and the database is currently bursting. You might need to reduce resource utilization.
RED: burstable resources are exhausted, and performance reduction to the baseline is imminent or happening now. Action is required to prevent performance from degrading, up to and including upgrading Heroku Postgres plans.
Resetting Statistics
You can reset the internal Postgres statistics to make it easier to see the effects of changes.
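One way to do this is with Postgres's standard pg_stat_reset() function from a psql session (heroku pg:psql); note that this clears the counters used by checks such as Indexes and Hit Rate, and depending on your plan and credentials it may require additional privileges:
SELECT pg_stat_reset();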