Last updated 29 September 2015
Application-level metrics help developers investigate and diagnose issues with their applications running on Heroku.
Application metrics are only available to apps that use professional dyno types, which include performance dynos. Applications using hobby dynos do not have access to application metrics.
To view application metrics, navigate to your app in the Heroku Dashboard and click the Metrics tab. Plot views can be toggled between the default horizontal stacked layout and a compact multi-axes stacked layout.
Individual metrics are gathered per process type.
Values presented in the Metrics view are aggregated over 10 minute windows. Resource utilization metrics are reported as maxima or averages per process type.
Metrics gathered for web dynos only
The following metrics are gathered only for the web process type:
- Median: The median response time (50th percentile) of HTTP requests within the 10 minute period. This means that 50% of an application’s web requests were completed within less time than the median, and 50% were completed within more.
- 95th Percentile: The 95th percentile response time of HTTP requests within the 10 minute period. This means that 95% of an application’s web requests were completed within less time, and 5% were completed within more. This is helpful for providing an upper bound (but not maximum) for expected response times.
- OK: The number of successful (status codes < 500) requests serviced per minute. For a 10 minute block, this is the total number of successful requests divided by 10 to provide per-minute values.
- Failed: The number of failed (status codes >= 500) requests serviced per minute. For a 10 minute block, this is the total number of failed requests divided by 10 to provide per-minute values.
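As a rough illustration of the definitions above, the following Python sketch computes the four web metrics for a single 10 minute window. It is not Heroku's implementation: the function names, the `(status_code, response_time_ms)` input shape, and the nearest-rank percentile method are all assumptions made for this example.

```python
def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (pct is an integer 1..100)."""
    if not sorted_values:
        raise ValueError("no samples in window")
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def summarize_web_window(requests, window_minutes=10):
    """Summarize one window of web requests.

    requests: list of (status_code, response_time_ms) tuples (assumed shape).
    """
    times = sorted(ms for _, ms in requests)
    ok = sum(1 for status, _ in requests if status < 500)
    failed = len(requests) - ok
    return {
        "median_ms": percentile(times, 50),
        "p95_ms": percentile(times, 95),
        # Totals are divided by the window length to give per-minute values.
        "ok_per_min": ok / window_minutes,
        "failed_per_min": failed / window_minutes,
    }
```

Note that the 95th percentile is taken from the sorted response times directly, so a handful of very slow requests raises it without necessarily moving the median.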
Metrics gathered for all dynos
The following metrics are gathered for all process types and are averaged across the dynos of each process type for a given application:
Maximum overall memory usage is displayed as a single stacked plot, combining maximum rss and maximum swap memory as reported for 10 minute increments. Mean total memory (rss + swap) is shown as a dashed line.
- RSS: The amount of memory (megabytes) held in RAM across dynos of a given process type. Max RSS is reported for each 10 minute interval.
- Swap: The portion of a dyno’s memory, in megabytes, stored on disk. It’s normal for an app to use a few megabytes of swap per dyno. Higher levels of swap usage, however, may indicate that the app is using too much memory for its dyno size. This can lead to slow response times and should be avoided. Max swap is reported for each 10 minute interval.
- Total Memory: Mean total memory represents the portion of memory that users can optimize. It is shown as the sum of rss and swap, measured in 10 minute increments and averaged across all dynos.
- 1m Load Average: The mean of the 1 minute load average for each 10 minute period. The load average reflects the number of CPU tasks that are in the ready queue (i.e., waiting to be processed), expressed as an exponentially damped average over the past 30 minutes.
- 1m Load Max: Maximum value of the 1 minute load average for the 10 minute period.
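The memory and load aggregates described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Heroku's actual pipeline: the function names and the input shapes (a list of `(rss_mb, swap_mb)` readings and a list of 1 minute load average readings, pooled from all dynos of one process type within a single 10 minute window) are hypothetical.

```python
from statistics import mean


def summarize_memory(samples):
    """samples: list of (rss_mb, swap_mb) readings for one window (assumed shape)."""
    return {
        # Maxima feed the stacked plot of maximum memory usage.
        "max_rss_mb": max(rss for rss, _ in samples),
        "max_swap_mb": max(swap for _, swap in samples),
        # Mean of rss + swap corresponds to the dashed "total memory" line.
        "mean_total_mb": mean(rss + swap for rss, swap in samples),
    }


def summarize_load(load_1m_samples):
    """load_1m_samples: 1 minute load average readings for one window."""
    return {
        "load_avg": mean(load_1m_samples),
        "load_max": max(load_1m_samples),
    }
```

Because the maxima for rss and swap are taken independently, they may come from different readings, so their sum can exceed the highest total memory observed in any single reading.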
The number of Heroku Errors that occurred in the 10 minute time window, segmented by error code. Critical and warning level errors are displayed in red, with informational errors in gray.
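Segmenting errors by code amounts to a grouped count. The sketch below assumes log lines that carry a `code=H12`-style field; the function name and the regular expression are illustrative choices, and the sketch does not attempt to classify codes as critical, warning, or informational.

```python
import re
from collections import Counter

# Matches fields such as code=H12 or code=R14 (a letter followed by digits).
CODE_PATTERN = re.compile(r"\bcode=([A-Z]\d+)\b")


def count_error_codes(log_lines):
    """Count occurrences of each error code across the lines of one window."""
    counts = Counter()
    for line in log_lines:
        match = CODE_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Lines without a matching `code=` field are ignored, so ordinary request logs can be passed through the same function.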
Data aggregated for metrics reporting is retained on a best effort basis. Loss of the aggregated data may occur.
In addition to raw metrics, Heroku will provide alerts about specific conditions that might be indicative of problems with your application.
A red indicator will appear on process types that have alerts. A green indicator means that there are no alerts.
The list of alerts provided is constantly evolving as we gather more data about application behavior, but some examples include notifications when:
- There are request timeouts indicative of slow requests that could lead to application queueing.
- An application’s memory usage is growing in a way that indicates the potential for performance issues.
- Response time has degraded for the same period week-over-week.