Deploying Rails Applications with Unicorn
Last updated 01 February 2016
Heroku recommends using the Puma web server instead of Unicorn. If you are using Unicorn, your application is not protected against a slow client attack.
Web applications that process requests concurrently make much more efficient use of dyno resources than web applications that only process one request at a time. Therefore it is recommended to use concurrent request processing whenever developing and running production services.
The Rails framework was originally designed to process one request at a time. The framework is gradually moving away from this design towards a thread safe implementation that allows for concurrent processing of requests in a single Ruby process. But most Ruby applications don’t support this today.
The Unicorn web server lets you run any Rails application concurrently by running multiple Ruby processes in a single dyno.
This guide will walk you through deploying a new Rails application to Heroku using the Unicorn web server. For basic Rails setup, see Getting Started with Rails.
Always test your new deployments in a staging environment before you deploy to your production application. See Managing Multiple Environments for an App for more info.
The Unicorn server
Unicorn is a Rack HTTP server that uses forked processes to handle multiple incoming requests concurrently.
Adding Unicorn to your application
First, add Unicorn to your app’s Gemfile:
# Gemfile
gem 'unicorn'
Then run bundle install to set up your bundle locally.
Create a configuration file for Unicorn at config/unicorn.rb, or at a path of your choosing. For a simple Rails application, we recommend the following basic configuration:
# config/unicorn.rb
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
timeout 15
preload_app true

before_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn master intercepting TERM and sending myself QUIT instead'
    Process.kill 'QUIT', Process.pid
  end

  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.connection.disconnect!
end

after_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT'
  end

  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
end
The above assumes a standard Rails app with ActiveRecord and New Relic for monitoring. For information on other available configuration options, see Unicorn’s documentation.
The environment variable WEB_CONCURRENCY will be set to a default value based on dyno size. To manually configure this value use heroku config:set WEB_CONCURRENCY.
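As a minimal sketch of how the worker count expression in the configuration resolves, the helper below applies the same `Integer(... || 3)` logic to a plain hash standing in for `ENV`. The name `resolve_worker_count` is hypothetical and not part of Unicorn.

```ruby
# Hypothetical helper mirroring the worker_processes line in config/unicorn.rb.
# `env` stands in for ENV so the logic can be exercised without touching the
# real environment.
def resolve_worker_count(env, default = 3)
  # Integer() raises ArgumentError on non-numeric strings such as "many",
  # so a mistyped config var fails at boot instead of silently becoming 0.
  Integer(env["WEB_CONCURRENCY"] || default)
end

resolve_worker_count({})                           # falls back to the default
resolve_worker_count({ "WEB_CONCURRENCY" => "4" }) # uses the config var
```

Using `Integer()` rather than `to_i` is a deliberate choice in the original configuration: `"many".to_i` would quietly return 0.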
Unicorn worker processes
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
Unicorn forks multiple OS processes within each dyno to allow a Rails app to support multiple concurrent requests without requiring the app to be thread-safe. In Unicorn terminology these are referred to as worker processes, not to be confused with Heroku worker processes, which run in their own dynos.
Each forked OS process consumes additional memory. This limits how many processes you can run in a single dyno. With a typical Rails memory footprint, you can expect to run 2-4 Unicorn worker processes. Your application may allow for more or fewer processes depending on your specific memory footprint, and we recommend specifying this number in a config var to allow for faster application tuning. Monitor your application logs for R14 errors (memory quota exceeded) via one of our logging addons or
preload_app true

# ...

before_fork do |server, worker|
  # ...

  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.connection.disconnect!
end

after_fork do |server, worker|
  # ...

  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
end
Preloading your application reduces the startup time of individual Unicorn worker_processes and allows you to manage the external connections of each individual worker using the before_fork and after_fork calls. In the config above, these calls are used to correctly establish Postgres connections for each worker process.
New Relic also recommends preload_app true for more accurate data collection with Unicorn apps. For information on using New Relic with preload_app true, see their documentation.
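To illustrate why each worker opens its own connection after the fork, here is a minimal sketch in plain Ruby that does not use Unicorn or Active Record. `FakeConnection` and `fork_worker_with_fresh_connection` are hypothetical names. The child process creates its connection after forking, so the connection belongs to the child rather than being shared with the parent. The sketch assumes a Unix-like platform where `fork` is available.

```ruby
# Hypothetical stand-in for a database connection. It only remembers which
# process opened it, which is enough to show who owns the connection.
class FakeConnection
  attr_reader :owner_pid

  def initialize
    @owner_pid = Process.pid
  end
end

# Forks one child, opens a fresh connection inside it, and reports back
# [child_pid, pid_that_opened_the_child_connection] through a pipe.
def fork_worker_with_fresh_connection
  reader, writer = IO.pipe

  child_pid = fork do
    reader.close
    connection = FakeConnection.new # analogous to establish_connection in after_fork
    writer.puts(connection.owner_pid)
    writer.close
    exit!(0) # skip at_exit hooks inherited from the parent
  end

  writer.close
  owner_pid = reader.gets.to_i
  reader.close
  Process.wait(child_pid)
  [child_pid, owner_pid]
end

child_pid, owner_pid = fork_worker_with_fresh_connection
puts "worker #{child_pid} opened its own connection" if child_pid == owner_pid
```

A real socket opened before the fork would instead be shared by parent and children, which is why the configuration disconnects in before_fork and reconnects in after_fork.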
before_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn master intercepting TERM and sending myself QUIT instead'
    Process.kill 'QUIT', Process.pid
  end

  # ...
end

after_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT'
  end

  # ...
end
POSIX Signals are a form of interprocess communication to indicate a certain event or state change. Traditionally, QUIT is used to signal a process to exit immediately and produce a core dump. TERM is used to tell a process to terminate, but allows the process to clean up after itself.
Unicorn uses the QUIT signal to indicate graceful shutdown. When the master process receives this signal, it sends a QUIT signal to all workers, which then gracefully shut down after completing any in-flight requests. After the worker processes have shut down, the master process will exit.
Heroku uses the TERM signal to indicate to all processes in a dyno that the dyno is being shut down. The configuration above ensures that this TERM signal is translated correctly to the Unicorn model: the workers trap and ignore the signal. The master traps and sends a QUIT signal to itself, thereby starting the graceful shutdown process.
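The TERM-to-QUIT translation can be sketched in isolation with plain Ruby, without Unicorn. In the example below, a forked child stands in for the master process: it traps TERM, sends itself QUIT, and treats QUIT as the cue for a graceful exit. The function name `term_to_quit_demo` and the messages passed over the pipe are hypothetical, and the sketch assumes a Unix-like platform.

```ruby
# Forks a child that translates TERM into QUIT, sends it TERM, and returns
# the message the child writes once its QUIT handler has run.
def term_to_quit_demo
  reader, writer = IO.pipe

  child_pid = fork do
    reader.close
    quit_received = false

    # QUIT means "shut down gracefully" in this sketch, as it does for Unicorn.
    Signal.trap('QUIT') { quit_received = true }
    # TERM is what the platform sends; translate it into QUIT for ourselves.
    Signal.trap('TERM') { Process.kill('QUIT', Process.pid) }

    writer.puts 'ready'            # tell the parent the traps are installed
    sleep 0.01 until quit_received # stand-in for finishing in-flight work
    writer.puts 'graceful shutdown'
    writer.close
    exit!(0)
  end

  writer.close
  reader.gets                      # wait for 'ready'
  Process.kill('TERM', child_pid)
  message = reader.gets.to_s.chomp
  reader.close
  Process.wait(child_pid)
  message
end

puts term_to_quit_demo
```

The handlers only set a flag or send a signal. Keeping trap blocks this small avoids doing work that is unsafe inside a signal handler.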
Heroku gives processes 10 seconds to shut down gracefully, after which a KILL signal is sent to all processes to force a shutdown. If an individual request takes longer than 10 seconds, it might be interrupted. Keep an eye out for entries in your application logs that indicate failure to shut down gracefully.
Heroku’s router enforces a 30-second window before a request times out. After the router delivers a request to a dyno, the dyno has 30 seconds to return a response or the router will return a customizable error page. This is done to prevent hanging requests from tying up resources. Although the router returns a response to the client, the Unicorn worker continues to process the request even after the client has received that response. This means the worker stays tied up, perhaps indefinitely in the case of a hung request. To ensure your application’s requests do not tie up your dyno past the request timeout, we recommend using both the Rack::Timeout gem and Unicorn’s timeout configuration setting.
Unicorn has a configurable timeout setting. The timeout countdown for Unicorn begins once the request is being processed by your application and ends when it returns a response. If the request takes longer than the specified time, the master will SIGKILL the worker working on the request.
We recommend a timeout of 15 seconds. With a 15-second timeout, the master process will send a SIGKILL to the worker process if processing a request takes longer than 15 seconds. This will generate an H13 error code and you’ll see it in your logs. Note that this will not generate any stack traces to assist in debugging.
When the Rack::Timeout limit is hit, it closes the request and generates a stack trace in the logs that can be used for future debugging of long-running code. First, the gem will need to be installed.
# Gemfile
gem 'rack-timeout'
$ bundle install
Then the timeout can be set.
# config/initializers/timeout.rb
Rack::Timeout.timeout = 10 # seconds
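To see why a timeout raised inside the Ruby process can leave a stack trace while a SIGKILL cannot, here is a minimal sketch that uses Ruby’s stdlib Timeout directly rather than Rack::Timeout. The helper name `run_with_limit` and its return symbols are hypothetical. The point is only that the timeout surfaces as an exception, and an exception carries a backtrace that can be logged.

```ruby
require 'timeout'

# Runs the given block with a time limit in seconds and reports how it ended.
def run_with_limit(seconds)
  Timeout.timeout(seconds) { yield }
  :completed
rescue Timeout::Error => e
  # The exception object carries a backtrace, which is what makes a
  # timeout handled inside the process debuggable.
  e.backtrace.is_a?(Array) ? :timed_out_with_backtrace : :timed_out
end

run_with_limit(1) { 1 + 1 }      # finishes well within the limit
run_with_limit(0.05) { sleep 1 } # exceeds the limit and raises Timeout::Error
```

A worker killed with SIGKILL never reaches a `rescue` clause, so no equivalent trace is produced.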
On Ruby 1.9/2.0, Rack::Timeout uses Ruby’s stdlib Timeout library, which can be unreliable. Heroku recommends using Rack::Timeout and setting the Unicorn timeout. If using both timeout systems, the Rack::Timeout value should be lower than the Unicorn timeout if you plan on using the stack trace produced by Rack::Timeout for debugging.
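The relationship between the three timeouts can be written down as a small sanity check. This is a sketch only: `timeout_problems` and `ROUTER_TIMEOUT` are hypothetical names. The rule that Rack::Timeout should be lower than the Unicorn timeout comes from the text above. The rule that the Unicorn timeout should sit below the router’s 30-second window is an inference from the goal of not tying up a dyno past the request timeout.

```ruby
# Router window in seconds, as described above.
ROUTER_TIMEOUT = 30

# Returns a list of human-readable problems; an empty list means the
# values are ordered Rack::Timeout < Unicorn timeout < router window.
def timeout_problems(rack_timeout:, unicorn_timeout:)
  problems = []
  if rack_timeout >= unicorn_timeout
    problems << "Rack::Timeout (#{rack_timeout}s) should be lower than the Unicorn timeout (#{unicorn_timeout}s)"
  end
  if unicorn_timeout >= ROUTER_TIMEOUT
    problems << "Unicorn timeout (#{unicorn_timeout}s) should be lower than the router window (#{ROUTER_TIMEOUT}s)"
  end
  problems
end

# The values used in this guide: 10 seconds for Rack::Timeout, 15 for Unicorn.
puts timeout_problems(rack_timeout: 10, unicorn_timeout: 15).inspect
```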
Set Unicorn as the server for the web process in your Procfile, pointing to your config file:
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
A sample Rails 4 app using Unicorn is available here:
Preloading and other external services
Take care with other external connections to make sure they work properly with Unicorn’s forking model. As you can see in the sample configuration above, the app drops its ActiveRecord connection in the before_fork block, and reconnects in the worker process in after_fork. Other services will follow a similar pattern. For example, here’s the configuration block for using Resque with a Unicorn app:
before_fork do |server, worker|
  # ...

  # If you are using Redis but not Resque, change this
  if defined?(Resque)
    Resque.redis.quit
    Rails.logger.info('Disconnected from Redis')
  end
end

after_fork do |server, worker|
  # ...

  # If you are using Redis but not Resque, change this
  if defined?(Resque)
    Resque.redis = ENV['<REDIS_URI>']
    Rails.logger.info('Connected to Redis')
  end
end
Change the REDIS_URI config var to correspond with the one from your Redis provider.
Many popular gems, such as the dalli memcache client, discuss compatibility with Unicorn’s worker process model in their documentation. If you are experiencing issues, check your gem’s documentation for more information.
For optimal performance, host your assets behind a CDN to free up your web dynos to serve only dynamic content.
Running a concurrent web server in production means that each dyno will require more than one database connection. To run a high-volume Rails app with a concurrent web server you will need to understand how Active Record creates and manages these connections in the connection pool, and the connection limit on development databases. For an in-depth look at these topics, please read the Dev Center article Concurrency and Database Connections.
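As a rough back-of-the-envelope sketch, the number of database connections an app may open can be estimated from the dyno and worker counts. The helper `required_connections` is hypothetical. It assumes each single-threaded Unicorn worker holds one Active Record connection, and the limit of 20 is an assumed figure for illustration, so check your own database plan.

```ruby
# Hypothetical helper: total database connections an app may open when each
# Unicorn worker process holds `connections_per_worker` connections.
def required_connections(dynos:, workers_per_dyno:, connections_per_worker: 1)
  dynos * workers_per_dyno * connections_per_worker
end

needed = required_connections(dynos: 4, workers_per_dyno: 3)
plan_limit = 20 # assumed connection limit; check your database plan

if needed > plan_limit
  puts "#{needed} connections needed, but the plan allows only #{plan_limit}"
else
  puts "#{needed} of #{plan_limit} connections needed"
end
```

Scaling either the dyno count or WEB_CONCURRENCY multiplies the total, so it is worth rerunning this estimate before changing either value.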
Unicorn may not be the best choice for your particular combination of application and workload. In particular, if your application receives requests with large body payloads from slow clients, you may be better off using a different web server. An example would be an app that receives images uploaded by users from mobile phones that are not on wifi, 4G or other fast networks.
The problem is caused by Unicorn workers becoming tied up receiving requests that are sent slowly by clients. If all Unicorn workers are tied up, new requests are queued and your app will likely experience greater-than-normal request-queue times or even H12 errors.
Puma, Thin or Rainbows! are alternative web servers that may work better under load generated by slow clients. To change the web server running your app on Heroku, simply specify a different command for the web process type in your