Application Metrics

Last Updated: 19 December 2014


Application-level metrics help developers investigate and diagnose issues with their applications running on Heroku. An application with two or more running dynos in its dyno formation will have access to metrics gathered for all active process types. This data is available via the Metrics tab of the Heroku Dashboard.

This article describes the data available in the Metrics tab of the Heroku Dashboard.

Metrics

Individual metrics are gathered per process type.

Values presented in the Metrics view are aggregated over 10 minute windows. Resource utilization values are additionally averaged across the dynos of each process type.

Web only

The following metrics are gathered for only the web process type:

Response time

  • Median: The median (50th percentile) response time of HTTP requests within the 10 minute period. This means that 50% of an application’s web requests completed in less time than the median, and 50% took longer.
  • 95th Percentile: The 95th percentile response time of HTTP requests within the 10 minute period. This means that 95% of an application’s web requests completed in less time than this value, and 5% took longer. This is helpful for providing an upper bound (but not a maximum) for expected response times.
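To make the two response time values concrete, here is a minimal sketch that computes a median and a 95th percentile for one 10 minute window. The sample data and the helper name percentile_nearest_rank are hypothetical, and the nearest-rank method is an assumption: this article does not say which percentile method Heroku uses.

```python
import math
from statistics import median


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times (milliseconds) observed in one 10 minute window.
response_times_ms = [
    12, 14, 15, 16, 18, 19, 20, 21, 22, 23,
    25, 26, 28, 30, 35, 40, 55, 80, 120, 2400,
]

window_median = median(response_times_ms)
window_p95 = percentile_nearest_rank(response_times_ms, 95)

print(f"median: {window_median} ms")
print(f"95th percentile: {window_p95} ms")
print(f"maximum: {max(response_times_ms)} ms")
```

In this sample the single very slow request sets the maximum but not the 95th percentile, which is why the 95th percentile works as an upper bound that a few outliers do not distort.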

Throughput

  • OK: The number of successful (status codes < 500) requests serviced per minute. For a 10 minute block, this is the total number of successful requests divided by 10 to provide per-minute values.
  • Failed: The number of failed (status codes >= 500) requests serviced per minute. For a 10 minute block, this is the total number of failed requests divided by 10 to provide per-minute values.
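The per-minute arithmetic above can be sketched as follows. The function name throughput_per_minute and the list of status codes are hypothetical; only the 500 threshold and the division by 10 come from the definitions above.

```python
WINDOW_MINUTES = 10  # length of one aggregation window


def throughput_per_minute(status_codes, window_minutes=WINDOW_MINUTES):
    """Split requests into OK (< 500) and failed (>= 500) per-minute rates."""
    ok = sum(1 for code in status_codes if code < 500)
    failed = sum(1 for code in status_codes if code >= 500)
    return {"ok": ok / window_minutes, "failed": failed / window_minutes}


# Hypothetical status codes for every request in one 10 minute window.
status_codes = [200] * 1180 + [404] * 20 + [500] * 30 + [503] * 20

rates = throughput_per_minute(status_codes)
print(rates)
```

Note that a 404 counts as OK under this definition, because only status codes of 500 and above are treated as failures.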

All dynos

The following metrics are gathered for all process types, and are averages of the metrics of the dynos of that process type for a given application:

Memory usage

  • RSS: Average amount of memory (megabytes) held in RAM across dynos of a given process type.
  • Swap: Average amount of memory (megabytes) stored on disk (“swapped”) across dynos of a given process type. Swapping is extremely slow and should be avoided.

CPU load

  • 5m Load Average: Average value of the 5 minute rolling CPU load average over the 10 minute block.
  • 5m Load Max: Maximum value of the 5 minute rolling CPU load average over the 10 minute block.
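As a minimal sketch of the two CPU load values, assume a hypothetical series of 5 minute load average readings taken during one 10 minute block. The sampling interval and the readings are illustrative only; this article does not specify how often the load average is sampled or how readings from several dynos are combined.

```python
from statistics import mean

# Hypothetical 5 minute load average readings within one 10 minute block.
load_samples = [0.42, 0.55, 0.61, 1.30, 0.97, 0.70]

load_avg = mean(load_samples)  # corresponds to "5m Load Average"
load_max = max(load_samples)   # corresponds to "5m Load Max"

print(f"5m Load Average: {load_avg:.2f}")
print(f"5m Load Max: {load_max:.2f}")
```

Comparing the two values shows whether load was steady or spiked briefly: a maximum well above the average points to a short burst within the block.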

Errors

The number of Heroku Errors that occurred in the 10 minute time window, segmented by error code.
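Segmenting errors by code amounts to a grouped count. The sketch below assumes a hypothetical list of error codes already extracted for one 10 minute window; how those codes are collected is outside the scope of this article.

```python
from collections import Counter

# Hypothetical Heroku error codes seen in one 10 minute window.
error_events = ["H12", "H12", "R14", "H10", "H12", "R14"]

# Count occurrences of each error code.
errors_by_code = Counter(error_events)

for code, count in sorted(errors_by_code.items()):
    print(f"{code}: {count}")
```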

Alerts

In addition to raw metrics, Heroku will provide alerts about specific conditions that can be indicative of problems with your application.

A red indicator appears on process types that have alerts. A green indicator means that there are no alerts.

The list of alerts provided is constantly evolving as we gather more data about application behavior, but some examples include notifications when:

  • There are request timeouts indicative of slow requests that could lead to application queueing
  • An application’s memory usage is growing in a way that indicates the potential for performance issues
  • Response time has degraded for the same period week-over-week

Metrics retention

Data aggregated for the Metrics reporting functionality is retained on a best-effort basis. Because data production and processing volumes are high, and will increase as more applications use the feature, intermittent data loss may occur.