Maximize Worker Utilization with Resque Pool

Last Updated: 11 November 2013

Heroku runs applications by executing units of work (for example, a web server or some worker code) on dynos. Dynos have fixed resources, so to get the best performance from your application you should use as much of those resources as possible without exceeding them.

Heroku recommends using a concurrent web server so that you can efficiently handle multiple requests per dyno. Similarly, we recommend a worker backend that can handle multiple jobs at a time per worker dyno.

Right now there are a few options for running multiple jobs in parallel on a worker dyno. Sidekiq is one popular option; however, it uses threads and requires your code to be thread-safe. If your application is not thread-safe, or if you don’t know whether it is, it is safer to use processes.

The resque-pool gem scales out your Resque-based workers by running multiple workers per dyno, which makes it a good fit when you want process-based concurrency. Its design is similar to that of the Unicorn web server: it boots one master process and then uses the Unix fork system call to create multiple Resque workers that each process jobs independently. These workers make up your “pool”. Using processes for concurrency is safer in Ruby but can have a higher memory overhead than threads.
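The prefork model described here can be sketched in plain Ruby. This is only an illustration of the idea, not resque-pool's actual implementation; `run_pool` is a hypothetical helper, and each child just reports its process ID instead of running Resque jobs.

```ruby
# Minimal prefork sketch: one master forks several children and collects
# a report from each. `run_pool` is hypothetical, not part of resque-pool.
def run_pool(worker_count)
  reader, writer = IO.pipe

  pids = Array.new(worker_count) do |index|
    fork do
      reader.close
      # Each child is an independent process with its own memory.
      writer.write("worker #{index} running as pid #{Process.pid}\n")
      writer.close
      exit!(0)
    end
  end

  writer.close
  # read returns once every child has closed its end of the pipe.
  reports = reader.read.lines.map(&:chomp)
  reader.close
  # Reap the children so they do not linger as zombies.
  pids.each { |pid| Process.wait(pid) }
  reports
end

puts run_pool(3)
```

Because every child is a separate process, a bug that corrupts state in one worker cannot affect the others, which is why this approach does not require thread-safe code.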

The amount of physical RAM varies depending on dyno size. You should run as many workers per dyno as possible without going over this limit. If you go over, the dyno will start swapping to disk and your worker will become very slow.

Install

Assuming you already have a project using Resque, first add the resque-pool gem to your Gemfile:

gem 'resque-pool'

Then run bundle install.

Signal configuration

You should set your TERM_CHILD environment variable to 1 to get the signal handling that you expect from Heroku. You can do this in your config:

$ heroku config:set TERM_CHILD=1

You can find more information on this setting in the resque-pool readme and in this resque blog post.

Worker configuration

Once resque-pool is installed, you need a YAML file that tells it how many workers to run for each queue. This file is located at config/resque-pool.yml. If you have a queue called send_welcome_email and want 5 workers on it, and another queue called crunch_data with 10 dedicated workers, your config/resque-pool.yml would look like this:

---
send_welcome_email: 5
crunch_data: 10

If you want a group of workers to process jobs from all queues, use an asterisk (*) as the queue name.
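As a rough illustration, the sketch below parses a config that combines the two dedicated queues with a catch-all group. The count of 2 for the catch-all group and the `POOL_YAML` constant are assumptions made for the example. Note that the asterisk has to be quoted, because a bare `*` starts an alias in YAML.

```ruby
require "yaml"

# Hypothetical config/resque-pool.yml contents: two dedicated groups plus
# a catch-all group of 2 workers that pull from every queue.
POOL_YAML = <<~YAML
  ---
  send_welcome_email: 5
  crunch_data: 10
  '*': 2
YAML

pool_config = YAML.safe_load(POOL_YAML)

pool_config.each do |queues, count|
  puts "#{count} worker(s) listening on #{queues}"
end

# Every entry adds processes to the same dyno, so the total is what matters
# for memory use.
puts "total worker processes: #{pool_config.values.sum}"
```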

If you are using resque-pool 0.4.0 (beta) or above, you may wish to use environment variables in your YAML file so that you do not have to deploy to change the number of workers Resque is using. To configure the total number of workers that process all queues, you could do it like this:

In config/initializers/resque-pool.rb add this code:

WORKER_CONCURRENCY = Integer(ENV["WORKER_CONCURRENCY"] || 5)

Now in your config/resque-pool.yml add this YAML:

---
'*': <%= WORKER_CONCURRENCY %>

Resque-pool 0.4.0 (beta) and above will evaluate the ERB and start your workers as configured in your Heroku config.
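To see why the ERB approach works, the sketch below renders a template with ERB first and only then parses the result as YAML. It is an illustration under stated assumptions, not resque-pool's own loading code: `render_pool_config` and `TEMPLATE` are hypothetical names, and a local variable stands in for the WORKER_CONCURRENCY constant from the initializer above.

```ruby
require "erb"
require "yaml"

# Hypothetical helper: evaluate ERB, then parse the rendered text as YAML.
# `env` defaults to the real environment but accepts any Hash-like object.
def render_pool_config(template, env = ENV)
  # Same fallback as the initializer: default to 5 when the variable is unset.
  worker_concurrency = Integer(env["WORKER_CONCURRENCY"] || 5)
  rendered = ERB.new(template).result(binding)
  YAML.safe_load(rendered)
end

TEMPLATE = <<~YAML
  ---
  '*': <%= worker_concurrency %>
YAML

p render_pool_config(TEMPLATE)
p render_pool_config(TEMPLATE, { "WORKER_CONCURRENCY" => "8" })
```

With this arrangement, running `heroku config:set WORKER_CONCURRENCY=8` changes the pool size on the next restart without a new deploy.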

If you are using resque-pool 0.3.0 or below, ERB is not supported. To pick up environment variables dynamically, you can write the config file programmatically every time the app boots. In config/initializers/resque-pool.rb add:

WORKER_CONCURRENCY = Integer(ENV["WORKER_CONCURRENCY"] || 5)
RESQUE_POOL_CONFIG = {"*" => WORKER_CONCURRENCY}

File.open(Rails.root.join('config/resque-pool.yml'), 'w') {|f| f.write RESQUE_POOL_CONFIG.to_yaml }

This will generate a new YAML file every time the Rails app boots. To modify the number of workers per group, change the RESQUE_POOL_CONFIG hash.

Pool setup

If your application uses any connections, such as a connection to a database, you need to be sure to disconnect and reconnect after the resque-pool worker forks. You can ensure this happens with this code in your Rakefile:

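require 'resque/pool/tasks'

# this task will get called before resque:pool:setup
# and preload the rails environment in the pool manager
task "resque:setup" => :environment do
  # generic worker setup goes here
end

task "resque:pool:setup" do
  # close any sockets or files in pool manager
  ActiveRecord::Base.connection.disconnect!
  # and re-open them in the resque worker parent
  Resque::Pool.after_prefork do |job|
    ActiveRecord::Base.establish_connection
  end
end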

Starting

Now that you have declared the number of Resque workers you wish to run, you can start them locally by running:

$ bundle exec resque-pool

In production you will need this line in your Procfile:

worker: bundle exec resque-pool

Scaling up workers

How many Resque workers can you run in your resque pool? The number is a function of the size of your app and the size of the dyno you are running on. Each new worker consumes extra memory and adds processing capacity. We recommend using the runtime metrics labs feature to determine how much memory an active worker dyno running resque-pool is consuming. If memory use is under the maximum for the dyno size you’re using, increase your WORKER_CONCURRENCY and measure again. Keep increasing it as long as your app never goes over the memory limit; once it does, the dyno will begin to swap and your workers will become very slow.
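Before tuning by measurement, a back-of-the-envelope estimate can give you a starting value. The sketch below is only that: `max_workers` is a hypothetical helper, and every figure passed to it (dyno size, master and per-worker memory, the 10% headroom) is an assumption you should replace with numbers measured from your own app.

```ruby
# Rough estimate of how many workers fit in a dyno without swapping.
# All inputs are in megabytes; `headroom` reserves a fraction of the dyno
# for memory growth while jobs run.
def max_workers(dyno_mb:, master_mb:, per_worker_mb:, headroom: 0.1)
  usable_mb = dyno_mb * (1 - headroom) - master_mb
  return 0 if usable_mb <= 0

  (usable_mb / per_worker_mb).floor
end

# Hypothetical figures: 512 MB dyno, 100 MB master, 120 MB per worker.
puts max_workers(dyno_mb: 512, master_mb: 100, per_worker_mb: 120)
```

Treat the result as a first guess for WORKER_CONCURRENCY, then adjust it based on the memory use you actually observe.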

Concerns

The primary concern with running multiple processes in a dyno is keeping them alive. Heroku can only monitor the process declared directly in your Procfile. If this process crashes, we can detect that and restart it. However, if that process spins up a child process and the child crashes, we have no way of knowing about it or restarting the child.

Because of this, it is important that any multi-process program manages its own children properly. In this case, resque-pool will make sure that any crashed processes are restarted and any additional processes are reaped. That being said, there are potential edge cases, such as zombie processes or processes that thrash (continually crash and restart), that it may not deal with.
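The kind of self-management described here can be sketched as a small supervisor loop. This is an illustration only and does not reflect how resque-pool is implemented: `supervise` and `max_restarts` are hypothetical, and the restart cap is one simple way to avoid thrashing forever.

```ruby
# Hypothetical supervisor: respawn children that exit abnormally, up to a cap.
# The block receives the attempt number and runs inside the child process.
def supervise(worker_count, max_restarts: 5)
  restarts = 0
  children = {} # pid => attempt number
  worker_count.times { children[fork { yield 1 }] = 1 }

  until children.empty?
    pid, status = Process.wait2 # blocks until any child exits, and reaps it
    attempt = children.delete(pid)
    next if status.success?            # clean exit: nothing to restart
    next if restarts >= max_restarts   # stop respawning, keep reaping

    restarts += 1
    children[fork { yield attempt + 1 }] = attempt + 1
  end
  restarts
end

# Each worker crashes on its first attempt and exits cleanly on its second.
puts supervise(2) { |attempt| exit!(attempt < 2 ? 1 : 0) }
```

Reaping with `Process.wait2` is what prevents exited children from lingering as zombies, and the restart cap shows why a worker that crashes on every boot needs attention rather than endless respawning.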

When transitioning to resque-pool, we recommend you do so locally first, and then on a staging server, so you can monitor its behavior with your application before you put it into production.