Optimizing Dyno Usage

Last updated April 04, 2022

Table of Contents

  • Considering different dyno types
  • When to try a different dyno size
  • Basic methodology for optimizing memory
  • Concurrent web servers
  • Measuring

A fundamental aspect of optimizing any application is ensuring that it is architected appropriately. For example, it should use background jobs for computationally intensive tasks in order to keep request times short, and use a process model to ensure that separate parts of the application can be scaled independently.

Beyond this, you may reach a point where you need to scale or optimize by making more efficient use of available resources. For example, if your web requests are short and handled efficiently, you may be able to increase throughput on a dyno by letting the web server handle more requests concurrently, usually at the expense of using more RAM.

This article provides a bird’s-eye view of how to go about optimizing an application for the various dyno types. It provides some rough estimates of capabilities, and pays particular attention to memory usage and concurrency. The techniques suggested in this article are relevant to any environment that runs your application, not just a dyno.

Heroku Enterprise customers with Premier or Signature Success Plans can request in-depth guidance on this topic from the Customer Solutions Architecture (CSA) team. Learn more about Expert Coaching Sessions here or contact your Salesforce account executive.

Considering different dyno types

Heroku offers a range of dyno types. Each type has a different CPU and RAM profile.

Changing the dyno type of an application increases complexity: as a developer you have introduced a new variable, the type of the dyno, in addition to the number of dynos.

However, a well designed app will quite naturally be able to make use of different dyno types, and thinking about optimizing your application to make better use of a dyno is a worthwhile endeavor.

Even if your application doesn’t need to make use of different dyno types, consider applying these optimization techniques to your current dyno type anyway.

The different dyno types offer three important axes of optimization: CPU, RAM and the performance profile.

CPU

Most applications are not CPU-bound on the web server.

If you are processing individual requests slowly due to CPU or other shared resource constraints (such as database), then optimizing concurrency on the dyno may not help your application’s throughput at all. Put another way, if your application is slow when there is little traffic, the techniques in this article may not increase performance.

The different dyno types do offer different CPU performance characteristics, and will help a little in high-CPU situations, but ideally you should consider offloading work to a background worker as a first step in optimization, as well as optimizing the code.

A final aspect of CPU is the number of cores. The different dyno types, performance in particular, offer multiple cores. With multiple cores, you may be able to execute multiple threads in parallel. This article points out where you need to take action to make use of these cores.

The rest of this article will assume the application is not CPU-bound.

RAM

Depending on language and web framework, there is typically a direct correspondence between RAM and concurrency.

For example, web servers like Unicorn for Ruby, or Gunicorn for Python, pre-fork a number of identical copies of your application (called workers). Unicorn then has its own connection queue, and as workers finish a web request, they pull a new request off of the queue.

Having more RAM in this scenario means that you can have more workers running concurrently - and there is typically a fairly linear correlation between RAM and concurrency. Optimizing concurrency for RAM is something this article addresses.
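The roughly linear relationship between RAM and concurrency can be sketched as a back-of-the-envelope calculation. The following is a minimal illustration, not a Heroku tool: the function name estimate_workers, the 150MB per-worker figure, and the 20% headroom are all assumptions chosen for the example. Measure your own app's per-worker memory before relying on numbers like these.

```python
def estimate_workers(dyno_ram_mb: float, per_worker_mb: float, headroom: float = 0.2) -> int:
    """Estimate how many pre-forked workers fit in a dyno's RAM.

    dyno_ram_mb   -- total RAM of the dyno type
    per_worker_mb -- measured resident memory of one worker (app-specific)
    headroom      -- fraction of RAM deliberately left unused (an assumption)
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable_mb = dyno_ram_mb * (1 - headroom)
    # Always run at least one worker, even if it does not fit comfortably.
    return max(1, int(usable_mb // per_worker_mb))


if __name__ == "__main__":
    # Hypothetical app whose workers each use about 150MB.
    for ram_mb in (512, 1024):
        print(ram_mb, "MB ->", estimate_workers(ram_mb, 150), "workers")
```

Treat the result as a starting point only: forked workers may share some memory, and memory use tends to grow under load.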

Performance profile

The performance profile of each dyno type can have an impact. In particular, free, hobby, standard-1x and standard-2x dynos operate on a CPU-share basis, whereas performance dynos are single tenant.

These performance dynos therefore offer a higher level of resource isolation.

This can have a significant impact on applications, depending on the amount of traffic that they’re receiving and how well they’re optimized. In particular, a more consistent performance profile can lead to reduced tail latencies.

When to try a different dyno size

There are many factors that come into play when considering different dyno types. Some of them are inherent to your application (how much CPU it uses), some are due to optimization factors introduced by increased concurrency (due to having more RAM), and some are due to the inherent characteristics of the dyno itself.

This complexity can be difficult to navigate, but the simple techniques suggested in this article for applications that are not CPU-bound make it a lot more tractable, and make it easier to optimize for any dyno type.

Once you have optimized for a particular dyno type, say standard-1x dynos, apply the same techniques on a standard-2x, performance-m or performance-l dyno - taking into account the factors that each dyno type introduces.

Here are some rough rules of thumb:

  • For most applications that aren’t receiving tremendously high volumes of traffic, consider standard-1x dynos.
  • If the application is particularly memory-hungry, as seen in some JVM-based stacks such as Play and JRuby, consider standard-2x dynos, which double the memory.
  • For very high volume web apps, running on more than 20 standard-1x dynos, consider performance-m or performance-l dynos.

Basic methodology for optimizing memory

We suggest that you follow these steps, making use of visibility tools listed below, as well as the per-language suggestions. This will get you to a point where you can easily optimize for a single dyno type, or for moving between dyno types.

  1. Use a concurrent web server.
  2. Set up instrumentation to measure the impact of load on the app.
  3. Observe the app’s performance, and adjust the concurrency as necessary.

Optimizing is an iterative process - there is no golden path. Different languages, web frameworks and applications behave quite differently.

For example, a standard Ruby application may need to use a web server that forks multiple copies of an application to make use of all the RAM that is available. A standard Java application, on the other hand, may simply need a parameter to the JVM in order to allocate a larger heap.

Concurrent web servers

Different languages and platforms have different approaches to concurrency. Here’s a brief look at how to establish concurrency in apps running on Ruby, Java, Python and Node.js.

Ruby

To see how you can optimize your application, please refer to the comprehensive R14 - Memory Quota Exceeded in Ruby (MRI) article. It covers common causes of memory bloat in a Ruby application, as well as several diagnostic tools and techniques for finding and correcting increased memory use. Concurrency and Database Connections in Ruby with ActiveRecord is also a great resource for working out how to size database connections to maximize concurrency.

JRuby

JRuby servers like Puma make good use of concurrency without the need for multiple processes. However, you will need to tune the amount of memory allocated to the JVM, depending on the dyno type. The Ruby buildpack defines sensible defaults, which can be overridden by setting either JAVA_OPTS or JRUBY_OPTS.

Java, Scala, Clojure

Java web servers like Jetty, Tomcat and Netty make good use of concurrency out of the box. However, you will need to tune the amount of memory allocated to the JVM, depending on the dyno type.

Read Adjusting Environment for a Dyno Size for appropriate JAVA_OPTS flags to accomplish this.

Python

If you want to optimize for increased concurrency, Heroku recommends that you use Gunicorn for Python apps.

Gunicorn works by forking a configurable number of child processes, called workers. Each worker can only process a single request at a time. Concurrency comes about because the master Gunicorn process queues new web requests, and these are then delegated to workers as they become free after completing a previous request.

Increasing the concurrency is then configured by increasing the number of workers.

However, as each worker is effectively a forked version of your application, moving from a single worker to two workers will roughly double the memory requirements of your application.

It’s this trade-off, between increased concurrency and the memory available in a dyno, that you will measure and tune.

Read Deploying Python Applications with Gunicorn to learn how to set up Gunicorn for Python on Heroku. This will result in a web server with a config var, WEB_CONCURRENCY, which will let you adjust the number of workers the main Gunicorn process will fork.
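As an illustration of how WEB_CONCURRENCY can drive the worker count, here is a minimal sketch of a Gunicorn configuration file, assuming it is saved as gunicorn.conf.py and loaded with gunicorn -c gunicorn.conf.py. The workers setting is Gunicorn's own. The helper workers_from_env and the fallback of 2 workers are assumptions made for this example, not part of Gunicorn or Heroku.

```python
import os


def workers_from_env(environ=os.environ, default=2):
    """Read the worker count from WEB_CONCURRENCY.

    Falls back to `default` (an assumed value) when the variable is
    unset, not an integer, or not positive.
    """
    raw = environ.get("WEB_CONCURRENCY", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


# Gunicorn reads this module-level name as the number of worker processes.
workers = workers_from_env()
```

With a file like this in place, changing the config var (for example with heroku config:set WEB_CONCURRENCY=3) adjusts concurrency without a code change.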

While highly app-dependent, the following table lists some rough rules of thumb for how many Gunicorn workers can be run on each dyno type:

Dyno Type                   Number of Gunicorn workers
free, hobby, standard-1x    2-3
standard-2x                 4-6
performance-m               8-12
performance-l               20-30

These are just estimates, and will vary from app to app. Use something in the lower range, measure, and adjust as necessary.
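The advice to start at the lower end of the range can be written down as a small lookup. This is only a sketch: the ranges are copied from the table above, while the names WORKER_RANGES and starting_workers are hypothetical and exist only for this example.

```python
# Rough worker ranges per dyno type, taken from the table above.
WORKER_RANGES = {
    "free": (2, 3),
    "hobby": (2, 3),
    "standard-1x": (2, 3),
    "standard-2x": (4, 6),
    "performance-m": (8, 12),
    "performance-l": (20, 30),
}


def starting_workers(dyno_type: str) -> int:
    """Return the low end of the range as a conservative starting point."""
    low, _high = WORKER_RANGES[dyno_type.lower()]
    return low


if __name__ == "__main__":
    print(starting_workers("standard-2x"))
```

From that starting value, increase the worker count one step at a time while watching memory and response times.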

For Django-specific recommendations, see Concurrency and Database Connections in Django.

Node.js

Node offers a single-threaded, non-blocking process model. To take advantage of multiple cores, Node must use the Cluster API to fork multiple concurrent processes. Even if you don’t plan on using concurrency today, we recommend enabling Cluster in your app so that it can scale to a variety of containers.

Read Optimizing Node.js Concurrency to learn how to configure concurrency through Node’s Cluster API on Heroku.

PHP

Applications using the PHP or HHVM runtimes automatically adjust their number of worker processes or threads depending on the type of dyno they run on. The main factor to decide the number of processes or threads is the PHP memory limit that’s configured for an application.

Please refer to Optimizing PHP Application Concurrency for more information on tuning PHP applications for maximum throughput.

Measuring

After setting up a concurrent web server, you’ll want to tune it for a particular dyno type. Measuring memory and throughput should provide enough guidance for you to make a judgement as to the impact of a change.

Measuring memory with log-runtime-metrics

The Heroku Labs log-runtime-metrics feature provides visibility into load and memory usage for running dynos. Dynos in Private Spaces always emit runtime metrics to the log stream, so they do not need this feature turned on.

Per-dyno stats on memory use, swap use, and load average are inserted into the app’s log stream.

Here is some example output with this feature enabled:

source=web.1 dyno=heroku.2808254.d97d0ea7-cf3d-411b-b453-d2943a50b456 sample#load_avg_1m=2.46 sample#load_avg_5m=1.06 sample#load_avg_15m=0.99
source=web.1 dyno=heroku.2808254.d97d0ea7-cf3d-411b-b453-d2943a50b456 sample#memory_total=21.00MB sample#memory_rss=21.22MB sample#memory_cache=0.00MB sample#memory_swap=0.00MB sample#memory_pgpgin=348836pages sample#memory_pgpgout=343403pages

The memory_rss value is the most significant number here, providing an indication of total resident memory. Ensure that you don’t exceed the memory of your dyno type, and leave some headroom too. Likewise, keep swap usage to a minimum and make sure that swapping activity (memory_pgpgin/memory_pgpgout) is minimal. Ideally memory_pgpgin and memory_pgpgout shouldn’t change much over time (a rate of change close to zero).
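If you want to check these numbers programmatically, the sample#name=value pairs are straightforward to parse. The sketch below assumes the line format shown in the example output above. The function names parse_samples and memory_headroom_mb are hypothetical, and units (MB, pages) are simply dropped.

```python
import re

# Matches pairs such as sample#memory_rss=21.22MB or sample#load_avg_1m=2.46.
SAMPLE_RE = re.compile(r"sample#(\w+)=([0-9.]+)(MB|pages)?")


def parse_samples(line: str) -> dict:
    """Return {metric_name: numeric_value} for one log line."""
    return {name: float(value) for name, value, _unit in SAMPLE_RE.findall(line)}


def memory_headroom_mb(line: str, dyno_ram_mb: float) -> float:
    """Return how many MB remain between memory_rss and the dyno's RAM."""
    return dyno_ram_mb - parse_samples(line)["memory_rss"]


if __name__ == "__main__":
    line = (
        "source=web.1 dyno=heroku.2808254.d97d0ea7-cf3d-411b-b453-d2943a50b456 "
        "sample#memory_total=21.00MB sample#memory_rss=21.22MB "
        "sample#memory_cache=0.00MB sample#memory_swap=0.00MB "
        "sample#memory_pgpgin=348836pages sample#memory_pgpgout=343403pages"
    )
    print(parse_samples(line))
    print(memory_headroom_mb(line, 512))
```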

See log-runtime-metrics to understand how to interpret these figures.

The output of log-runtime-metrics is particularly useful as it lets you look at per-dyno memory usage. If you’re over-provisioned, you may see a single dyno peaking before any other.
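To spot a single dyno peaking ahead of the others, you could track the highest memory_rss seen for each source. The following is a rough sketch, assuming log lines in the format shown earlier. The helper names and the 450MB limit used in the demo are assumptions, and the log lines in the demo are made up for illustration.

```python
import re
from collections import defaultdict

# Captures the dyno name (source=web.1) and its memory_rss value in MB.
LINE_RE = re.compile(r"source=(\S+).*?sample#memory_rss=([0-9.]+)MB")


def peak_rss_by_dyno(lines):
    """Return {dyno_name: highest memory_rss in MB} across the given lines."""
    peaks = defaultdict(float)
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines without memory samples (e.g. load averages) are skipped
            dyno, rss = match.group(1), float(match.group(2))
            peaks[dyno] = max(peaks[dyno], rss)
    return dict(peaks)


def dynos_over(peaks, limit_mb):
    """Return the sorted names of dynos whose peak exceeds limit_mb."""
    return sorted(dyno for dyno, rss in peaks.items() if rss > limit_mb)


if __name__ == "__main__":
    # Made-up sample lines, not real log output.
    demo = [
        "source=web.1 dyno=a sample#memory_total=300.00MB sample#memory_rss=290.00MB",
        "source=web.2 dyno=b sample#memory_total=490.00MB sample#memory_rss=480.50MB",
        "source=web.1 dyno=a sample#memory_total=320.00MB sample#memory_rss=310.00MB",
    ]
    peaks = peak_rss_by_dyno(demo)
    print(peaks, dynos_over(peaks, 450))
```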

There are other ways of visualizing this memory data:

The Librato add-on, with the Nickel plan and above, provides a way to graph the various output from log-runtime-metrics, averaging the values across all the dynos.

For example, a Rails application on standard-1x dynos running 4 Unicorn workers peaked at about 359MB of memory, which fits comfortably into the 512MB of RAM on a standard-1x dyno.

Measuring throughput and response time

Throughput, the number of requests being handled per minute, as well as response times, are particularly useful indicators of how an optimization has affected the performance of a dyno.

In particular, the 95th and 99th percentile response time values provided by add-ons like Librato or New Relic should be monitored closely.
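To make the percentile figures concrete, here is a minimal sketch of a nearest-rank percentile over a list of response times. Monitoring add-ons may compute percentiles differently (for example with interpolation), so this illustrates the idea rather than how any particular add-on works. The sample timings are invented.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Invented response times in milliseconds.
    response_times_ms = [120, 80, 95, 400, 110]
    print("p50:", percentile(response_times_ms, 50))
    print("p95:", percentile(response_times_ms, 95))
```

A single slow request barely moves the median but dominates the high percentiles, which is why the 95th and 99th percentiles are the ones to watch when changing concurrency.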
