Heroku Labs: log-runtime-metrics

Last updated February 11, 2025

Enabling
Disabling
How it works
Log format
CPU load averages
Memory & swap
Understanding load averages

This Heroku Labs log-runtime-metrics feature adds experimental support for enabling visibility into load and memory usage for running dynos. Per-dyno stats on memory use, swap use, and load average are inserted into the app’s log stream where they can be seen via heroku logs --tail, used for graphs or alerting via an add-on which consumes app logs, or sent to a log drain. There is no cost incurred by enabling this feature.

Features added through Heroku Labs are experimental and subject to change.

This feature is not yet available for Fir-generation apps. Subscribe to our changelog to stay informed of when we add this feature to Fir.

Enabling

$ heroku labs:enable log-runtime-metrics --app example-app
Enabling log-runtime-metrics for example-app... done
$ heroku restart

Disabling

$ heroku labs:disable log-runtime-metrics --app example-app
Disabling log-runtime-metrics for example-app... done
$ heroku restart

You may need to perform a deploy to fully disable the feature. You can do so with an empty commit: git commit --allow-empty -m "Force Heroku build." followed by a git push.

How it works

The load and memory usage metrics are surfaced as system logs in the Logplex log stream. Metrics are emitted for each running dyno, at an approximate frequency of once every 20 seconds.

Log format

Runtime metrics logs have the following format:

source=web.1 dyno=heroku.2808254.d97d0ea7-cf3d-411b-b453-d2943a50b456 sample#load_avg_1m=2.46 sample#load_avg_5m=1.06 sample#load_avg_15m=0.99
source=web.1 dyno=heroku.2808254.d97d0ea7-cf3d-411b-b453-d2943a50b456 sample#memory_total=21.00MB sample#memory_rss=21.22MB sample#memory_cache=0.00MB sample#memory_swap=0.00MB sample#memory_pgpgin=348836pages sample#memory_pgpgout=343403pages

The source field identifies a dyno in your dyno formation and is intended to be used by systems draining application logs. The dyno field includes the app id and a UUID that unique identifies every distinct dyno run on the platform. Over the life of your app metrics, you will see the same source value have many different dyno values, related to when you deploy or restart dynos.

The order of the key-value pairs of the metrics emitted to your logs isn’t guaranteed.

CPU load averages

The following fields are reported for CPU load average:

Load Average 1m (load_avg_1m): The load average for the dyno in the last 1 minute. This reflects the number of CPU tasks that are in the ready queue (i.e. waiting to be processed). More details about how load averages are calculated can be found below.
Load Average 5m (load_avg_5m): The load average for the dyno in the last 5 minutes. Computed in the same manner as 1m load average.
Load Average 15m (load_avg_15m): The load average for the dyno in the last 15 minutes. Computed in the same manner as 1m load average.

Memory & swap

The following fields are reported for memory consumption and swap:

Resident Memory (memory_rss): The portion of the dyno’s memory (megabytes) held in RAM.
Disk Cache Memory (memory_cache): The portion of the dyno’s memory (megabytes) used as disk cache.
Swap Memory (memory_swap): The portion of a dyno’s memory, in megabytes, stored on disk. It’s normal for an app to use a few megabytes of swap per dyno. Higher levels of swap usage though may indicate too much memory usage when compared to the dyno size. This can lead to slow response times and should be avoided.
Total Memory (memory_total): The total memory (megabytes) being used by the dyno, equal to the sum of resident, cache, and swap memory.
Memory Quota (memory_quota): The resident memory (memory_rss) value (megabytes) at which an R14 is triggered.
Pages Written to Disk (memory_pgpgout): The cumulative total of the pages written to disk. Sudden high variations on this number can indicate short duration spikes in swap usage. The other memory related metrics are point in time snapshots and can miss short spikes.
Pages Read from Disk (memory_pgpgin): The cumulative total of the pages read from disk. As with the previous metric, watch out for sudden variations.

Understanding load averages

The dyno manager takes the count of runnable tasks for each dyno roughly every 20 seconds. A runnable task is a process or thread that is either currently running on a CPU or is waiting for a CPU to run on, but otherwise has all the resources it needs to run. An exponentially damped moving average is computed with the count of runnable tasks in the following iterative algorithm:

expterm = Math.exp(-(count_of_runnable_tasks.time - avg.time) / (period))
newavg = (1 - expterm) * count_of_runnable_tasks.value + expterm * avg.value

where period is either 1-, 5-, or 15-minutes (in seconds), the count_of_runnable_tasks is an entry of the number of tasks in the queue at a given point in time, and the avg is the previous calculated exponential load average from the last iteration.

Heroku’s dyno load calculation differs from LINUX’s load calculation as it does not include processes in uninterruptible sleep, making it similar to traditional UNIX load calculations.

Keep reading

Heroku Labs

Categories

Heroku Labs: log-runtime-metrics

Table of Contents

Enabling

Disabling

How it works

Log format

CPU load averages

Memory & swap

Understanding load averages

Keep reading

Feedback