R14 - Memory Quota Exceeded in Ruby (MRI)
Last updated August 31, 2020
When your Ruby application uses more memory than is available on the dyno, an R14 - Memory quota exceeded error message will be emitted to your application’s logs. This article is intended to help you understand your application’s memory use and give you the tools to run your application without memory errors.
Why memory errors matter
If you’re getting R14 - Memory quota exceeded errors, it means your application is using swap memory. Swap uses the disk to store memory instead of RAM. Disk speed is significantly slower than RAM, so page access time is greatly increased. This leads to a significant degradation in application performance. An application that is swapping will be much slower than one that is not. No one wants a slow application, so getting rid of R14 Memory quota exceeded errors on your application is very important.
Detecting a problem
Since you’re reading this article it’s likely you already spotted a problem. If not, you can view your last 24 hours of memory use by using Application Metrics on your app’s dashboard. Alternatively, you can check your logs where you will occasionally see the error emitted:
2011-05-03T17:40:11+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
How Ruby memory works
It can help to understand how Ruby consumes memory to be able to decrease your memory usage. For more information, see How Ruby Uses Memory.
Ruby 2.0 upgrade
Upgrading from Ruby 2.0 to 2.1+ introduced generational garbage collection. This means that Ruby 2.1+ applications should run faster, but also use more memory. We always recommend you run the latest released Ruby version, since it will have the latest security, bugfix, and performance patches.
If you see a slight increase in memory, you can use the techniques below to decrease your usage to an acceptable level.
Memory leaks
A memory leak is defined as memory increasing indefinitely over time. Most applications that have memory problems are described as having a “memory leak”; however, if you let those applications run for a long enough period of time, their memory use will level out.
If you believe your application has a memory leak you can test this out. First make sure you can run dynamic benchmarks with derailed. Then you can benchmark RAM use over time to determine if your app is experiencing a memory leak.
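The distinction between a leak and memory that eventually levels out can be made concrete by comparing memory readings taken over time. The sketch below is a hypothetical helper and is not part of derailed: the name `leveled_off?`, its `window` and `tolerance` parameters, and the sample readings are all assumptions chosen for illustration.

```ruby
# Hypothetical helper (not part of derailed): decides whether memory use
# has plateaued.
# samples_mb: memory readings in MB, oldest first.
# window:     how many of the most recent readings to compare (assumed value).
# tolerance:  maximum relative growth across the window that still counts
#             as flat (assumed value).
def leveled_off?(samples_mb, window: 5, tolerance: 0.02)
  return false if samples_mb.size < window

  recent = samples_mb.last(window)
  growth = (recent.last - recent.first) / recent.first.to_f
  growth <= tolerance
end

# Made-up readings: one app that warms up and settles, one that keeps growing.
warming_up = [200, 260, 300, 320, 330, 333, 334, 334, 335, 335]
leaking    = [200, 230, 260, 290, 320, 350, 380, 410, 440, 470]

puts "warming_up leveled off? #{leveled_off?(warming_up)}"
puts "leaking leveled off?    #{leveled_off?(leaking)}"
```

A result of false only means memory was still growing during the sampled period. The longer the sampling period, the more reliable the conclusion.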
Too many workers
Modern Ruby webservers such as Puma allow you to serve requests to your users via concurrent processes. In Puma these are referred to as “worker” processes. In general increasing your workers will increase your throughput, but it will also increase your RAM use. You want to maximize the number of Puma workers that you are using without going over your RAM limit and causing your application to swap.
Too many workers at boot
If your application starts to throw R14 errors as soon as it boots, it may be due to setting too many workers. You can potentially fix this by setting your WEB_CONCURRENCY config var to a lower value.
For example, if you have this in your config/puma.rb file:
# config/puma.rb
workers Integer(ENV['WEB_CONCURRENCY'] || 2)
You can lower your worker count:
$ heroku config:set WEB_CONCURRENCY=1
For some applications, two Puma workers will use more RAM than a standard-1x dyno can provide. You can still achieve increased throughput with threads when this happens. Alternatively, you can upgrade your dyno size to run more workers.
Too many workers over time
Your application’s memory use will increase over time. If it starts out fine, but gradually increases to be above your RAM limit, there are a few things you can try. If you quickly hit the limit, you likely want to decrease your total number of Puma workers. If it takes hours before you hit the limit, there is a bandaid you can try called Puma Worker Killer.
Puma Worker Killer allows you to set up a rolling restart of your Puma workers. The idea is to figure out at what interval your application begins using too much memory, and then schedule your workers to restart at that interval. When you restart a process, its memory use goes back to its original, lower level. Even if the memory is still growing, it won’t cause problems for another few hours, by which time another restart will be scheduled.
To use this gem add it to your Gemfile:
$ bundle install and add this to an initializer such as
It’s important to note that this won’t actually fix any memory problems, but instead will cover them up. When your workers restart they cannot serve requests for a few seconds, so when the rolling restarts are triggered your end users may experience a slow down as your overall application’s throughput is decreased. Once restarts are done, throughput should go back to normal.
It is highly recommended that you only use Puma Worker Killer as a stop gap measure until you can identify and fix the memory problem. Several suggestions are covered below.
Forking behavior of Puma worker processes
Puma implements its worker processes via forking. When you fork a program, you copy a running program into a new process and then make changes, instead of starting with a blank process. Most modern operating systems allow memory to be shared between processes through a concept called “copy on write”. When Puma spins up a new worker, it requires very little memory. Only when the worker needs to modify or “write” to memory is that memory location copied from one process to another. Modern Ruby versions are optimized to be copy on write “friendly”, that is, they do not write to memory unnecessarily. This means that each additional worker Puma spins up is likely to consume less memory than the first. You can observe this behavior locally with Activity Monitor on a Mac or via ps on Linux. There will be one large process consuming a lot of memory, and then smaller processes. So if Puma with one worker was consuming 300 MB of RAM, then using two workers would likely consume less than 600 MB of RAM total.
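The 300 MB example can be turned into a rough estimate. This is only a sketch: the `shared_ratio` parameter (the fraction of each extra worker's memory that stays shared through copy on write), the value 0.3, and the 512 MB and 1024 MB limits are assumptions for illustration. Measure your own application before relying on any of these numbers.

```ruby
# Rough estimate of total RAM for a forked Puma cluster.
# first_worker_mb: measured size of the first worker.
# shared_ratio:    ASSUMED fraction (0.0...1.0) of each extra worker's memory
#                  that stays shared through copy on write.
def estimated_total_mb(workers, first_worker_mb:, shared_ratio:)
  return 0.0 if workers <= 0

  extra_worker_mb = first_worker_mb * (1.0 - shared_ratio)
  first_worker_mb + (workers - 1) * extra_worker_mb
end

# Largest worker count whose estimate fits within limit_mb.
def max_workers(limit_mb, first_worker_mb:, shared_ratio:)
  raise ArgumentError, "shared_ratio must be in 0.0...1.0" unless (0.0...1.0).cover?(shared_ratio)
  return 0 if first_worker_mb > limit_mb

  extra_worker_mb = first_worker_mb * (1.0 - shared_ratio)
  1 + ((limit_mb - first_worker_mb) / extra_worker_mb).floor
end

two_workers = estimated_total_mb(2, first_worker_mb: 300, shared_ratio: 0.3)
puts "Two workers: about #{two_workers.round} MB"
puts "Workers that fit in 512 MB:  #{max_workers(512, first_worker_mb: 300, shared_ratio: 0.3)}"
puts "Workers that fit in 1024 MB: #{max_workers(1024, first_worker_mb: 300, shared_ratio: 0.3)}"
```

The sharing ratio usually shrinks as workers run and write to memory, so treat the estimate as optimistic and confirm it against real metrics.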
Too much memory on boot
A common cause of memory use is due to libraries being required in a Gemfile but not used. You can see how much memory your gems use at boot time through the derailed benchmark gem.
First add the gem to your Gemfile:
gem 'derailed', group: :development
Then run $ bundle install and you’re ready to investigate memory use. You can run:
$ bundle exec derailed bundle:mem
This will output the memory use of each of your gems as they are required into memory:
$ derailed bundle:mem
TOP: 54.1836 MiB
  mail: 18.9688 MiB
    mime/types: 17.4453 MiB
    mail/field: 0.4023 MiB
    mail/message: 0.3906 MiB
  action_view/view_paths: 0.4453 MiB
    action_view/base: 0.4336 MiB
Remove any libraries you aren’t using. If you see a library using an unusually large amount of memory, try upgrading to the latest version to see if any issues have been fixed. If the problem persists, open an issue with the library maintainer to see if something can be done to decrease require-time memory. To help with this process you can use $ bundle exec derailed bundle:objects. See Objects created at Require time in derailed benchmarks for more information.
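To get a feel for what a require-time memory measurement involves, the sketch below compares the process's resident set size (RSS) before and after a `require`. It is a rough approximation for illustration only and may differ from how derailed takes its measurements. The helper names `rss_mib` and `memory_delta_mib` are hypothetical, and `json` is used only because it ships with Ruby.

```ruby
# Returns this process's resident set size in MiB.
# Reads /proc on Linux and falls back to `ps` elsewhere (for example macOS).
def rss_mib
  status = "/proc/self/status"
  kib =
    if File.readable?(status)
      File.read(status)[/^VmRSS:\s+(\d+)/, 1].to_i
    else
      `ps -o rss= -p #{Process.pid}`.to_i
    end
  kib / 1024.0
end

# Measures how much RSS changes while the block runs.
def memory_delta_mib
  before = rss_mib
  yield
  rss_mib - before
end

delta = memory_delta_mib { require "json" }
puts format("json: %.4f MiB", delta)
```

RSS readings are noisy, and a library that is already loaded will show a delta near zero, so run this in a fresh process for each library you want to check.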
Too much memory used at runtime
If you’ve cleaned out your unused gems, and you’re still seeing too much memory use, there may be code generating excessive amounts of Ruby objects. It is possible to use a runtime tool such as the Heroku Add-on Scout. Scout published a guide on debugging runtime memory use.
If you don’t use a tool that can track object allocations at runtime, you can try to reproduce the memory growth locally with derailed benchmarks.
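If you want a quick check on a specific piece of code before reaching for a full tool, Ruby's built-in `GC.stat` exposes a running count of allocated objects. The sketch below wraps it in a hypothetical helper named `allocated_objects`. The workload inside the block is made up for illustration and stands in for whatever code you suspect.

```ruby
# Counts objects allocated while the block runs, using the
# :total_allocated_objects counter from GC.stat.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Illustrative workload: builds 10,000 short strings.
count = allocated_objects do
  10_000.times.map { |i| "row #{i}" }
end

puts "allocated roughly #{count} objects"
```

The counter covers the whole process, so allocations from other threads during the block are included. Treat the number as an estimate.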
GC tuning
Every application behaves differently, so there is no one correct set of GC (garbage collector) values that we can recommend.
When it comes to memory utilization, you can control how fast Ruby allocates memory by setting RUBY_GC_HEAP_GROWTH_FACTOR. The default value differs between Ruby versions. To understand how it works, it is helpful to first understand how Ruby uses memory.
When Ruby runs out of memory and cannot free up any slots via the garbage collector, it has to tell the operating system it needs more memory. Asking the operating system for memory is an expensive (slow) process, so Ruby always asks for a little more than it needs. You can control how much memory it asks for by setting the RUBY_GC_HEAP_GROWTH_FACTOR config var. For example, if you wanted your application's memory to grow by 3% each time Ruby needs to request more, you could set:
$ heroku config:set RUBY_GC_HEAP_GROWTH_FACTOR=1.03
So if your application is 100 MB in size and it needs extra memory to function, with this setting it would ask the OS for an extra 3 MB of RAM. This would bring the total amount of memory that Ruby can use to 103 MB. If your memory is growing too quickly, try setting this value to a smaller number. Keep in mind that setting the value too low can cause Ruby to spend a large amount of time asking the OS for memory.
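The arithmetic can be sketched in a few lines. This follows the simplified model described here, in which total memory is multiplied by the growth factor each time Ruby requests more. It is not a simulation of Ruby's actual allocator. The helper name `heap_sizes_mb` and the comparison factor of 1.8 are assumptions for illustration.

```ruby
# Simplified model: each time Ruby needs more memory, its total size is
# multiplied by the growth factor. Returns the size after each growth step,
# starting with the initial size.
def heap_sizes_mb(start_mb, growth_factor, growths)
  sizes = [start_mb.to_f]
  growths.times { sizes << sizes.last * growth_factor }
  sizes.map { |size| size.round(2) }
end

small_factor = heap_sizes_mb(100, 1.03, 3)
large_factor = heap_sizes_mb(100, 1.8, 3)

puts "factor 1.03: #{small_factor.join(' -> ')} MB"
puts "factor 1.8:  #{large_factor.join(' -> ')} MB"
```

Comparing the two sequences shows the trade-off: a small factor keeps memory close to what is needed but requires many more requests to the OS to reach the same size.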
RUBY_GC_HEAP_GROWTH_FACTOR will only help with R14 errors if you are barely over your dyno memory limits, or if you’re seeing extremely large “stair-step” memory allocations after your application has been running for several hours. Individual apps are responsible for setting and maintaining their own GC tuning configuration variables.
Excess memory use due to malloc in a multi-threaded environment
Applications created after September of 2019 will have the environment variable MALLOC_ARENA_MAX=2 set.
To work around this malloc behavior, an alternative memory allocator such as jemalloc can be used to replace malloc. To replace malloc on Heroku with jemalloc you can use a third party jemalloc buildpack.
Alternatively, if you don’t want to use a third-party buildpack, it is possible to tune the behavior of glibc's malloc so that it consumes less memory. However, this approach may impact the performance of the application.
Dyno size and performance
Dynos come in two types: standard dynos, which run on shared infrastructure, and performance dynos, which consume an entire runtime instance. When you increase your dyno size you increase the amount of memory you can consume. If your application cannot serve two or more requests concurrently, it is subject to request queueing. Ideally, your application should be running in a dyno that allows it to run at least two Puma worker processes. As stated before, additional Puma worker processes consume less RAM than the first process. You may be able to keep your application spend the same by upgrading to a larger dyno size and doubling your worker count while using only half the total number of dynos.
While this article is primarily about memory, its focus is speed. On that topic, it is important to highlight that since performance dynos are isolated from “noisy neighbors” they will see much more consistent performance. Most applications perform significantly better when run on performance dynos.