Concurrency and Database Connections in Ruby with ActiveRecord

Last Updated: 09 April 2014

When increasing concurrency by using a multi-threaded web server like Puma, or multi-process web server like Unicorn, you must be aware of the number of connections your app holds to the database and how many connections the database can accept. Each thread or process requires a different connection to the database. To accommodate this, Active Record provides a connection pool that can hold several connections at a time.

If you have questions about Ruby on Heroku, consider discussing them in the Ruby on Heroku forums.

Connection pool

By default, Rails (Active Record) creates a connection only when a new thread or process attempts to talk to the database through a SQL query. Active Record caps the number of connections a pool can open through the database setting pool; because every process has its own pool, this is the maximum number of connections a single process of your app can hold to the database. The default pool size is 5. If you try to use more connections than are available, Active Record will block and wait for a connection from the pool. If it cannot get one before the timeout expires, it raises an error that looks something like this:

ActiveRecord::ConnectionTimeoutError - could not obtain a database connection within 5 seconds. The max pool size is currently 5; consider increasing it

To avoid this error, you can change the size of your connection pool by customizing your connection settings. While the means are similar, the location of your connection setup can vary for threaded vs. multi-process web servers.
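The block-then-time-out behavior described above can be modeled with a small toy pool. This is only a sketch for building intuition: ToyPool, its method names, and the 0.1 second timeout are all made up for illustration, and it is not how Active Record implements its pool.

```ruby
# Toy model of a fixed-size pool; NOT Active Record's implementation.
class ToyPool
  class TimeoutError < StandardError; end

  def initialize(size:, checkout_timeout:)
    @available = size
    @checkout_timeout = checkout_timeout
    @lock = Mutex.new
    @released = ConditionVariable.new
  end

  # Blocks until a slot is free, or raises once the timeout has elapsed.
  def checkout
    deadline = now + @checkout_timeout
    @lock.synchronize do
      while @available.zero?
        remaining = deadline - now
        if remaining <= 0
          raise TimeoutError,
                "could not obtain a connection within #{@checkout_timeout} seconds"
        end
        @released.wait(@lock, remaining)
      end
      @available -= 1
    end
    true
  end

  # Returns a slot to the pool and wakes one waiting caller, if any.
  def checkin
    @lock.synchronize do
      @available += 1
      @released.signal
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

pool = ToyPool.new(size: 1, checkout_timeout: 0.1)
pool.checkout            # takes the only slot
begin
  pool.checkout          # nothing is free, so this waits and then raises
rescue ToyPool::TimeoutError => e
  puts e.message
end
pool.checkin             # returning the slot lets the next checkout succeed
pool.checkout
```

The second checkout fails only because the first slot was never returned, which is the situation a too-small pool creates when more threads need the database than the pool allows.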

Threaded servers

For servers that achieve concurrency via threads, we recommend using an initializer to configure your database pool. When your Rails application boots, it will execute the code in your initializer and establish the connection with your customizations:

# config/initializers/database_connection.rb
Rails.application.config.after_initialize do
  ActiveRecord::Base.connection_pool.disconnect!

  ActiveSupport.on_load(:active_record) do
    config = ActiveRecord::Base.configurations[Rails.env] ||
                Rails.application.config.database_configuration[Rails.env]
    config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
    config['pool']              = ENV['DB_POOL']      || ENV['MAX_THREADS'] || 5
    ActiveRecord::Base.establish_connection(config)
  end
end

If you are using the Puma web server, we recommend setting the pool value equal to ENV['MAX_THREADS']. When using multiple processes, each process has its own pool, so as long as no worker process runs more than ENV['MAX_THREADS'] threads, this setting should be adequate.
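The fallback chain used for the pool value in the initializer (DB_POOL, then MAX_THREADS, then 5) can be sketched on its own. The helper name resolve_pool_size is hypothetical, and the explicit integer conversion is an addition for illustration, since environment variables always arrive as strings:

```ruby
# Hypothetical helper: resolve the pool size from environment variables,
# mirroring the DB_POOL -> MAX_THREADS -> 5 fallback used in the initializer.
def resolve_pool_size(env = ENV, default: 5)
  raw = env['DB_POOL'] || env['MAX_THREADS'] || default
  Integer(raw) # ENV values are strings; raises ArgumentError on non-numeric input
end

resolve_pool_size({ 'MAX_THREADS' => '16' })                    # => 16
resolve_pool_size({ 'DB_POOL' => '10', 'MAX_THREADS' => '16' }) # => 10
resolve_pool_size({})                                           # => 5
```

Passing a plain hash in place of ENV keeps the sketch easy to try outside a running app.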

Multi-process servers

For a forking server such as Unicorn, the master process boots your Rails application (and executes any initializers) and then forks workers. For this reason, it’s necessary to disconnect in the master process in a before_fork block and then re-establish the connection in an after_fork block:

# config/unicorn.rb
before_fork do |server, worker|
  # other settings
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
  end
end

after_fork do |server, worker|
  # other settings
  if defined?(ActiveRecord::Base)
    config = ActiveRecord::Base.configurations[Rails.env] ||
                Rails.application.config.database_configuration[Rails.env]
    config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
    config['pool']              = ENV['DB_POOL'] || 2
    ActiveRecord::Base.establish_connection(config)
  end
end

For Unicorn, this connection setup should be in addition to the normal recommended configuration as described in the Deploying Rails Applications With Unicorn guide.

Here we set the pool to the value of the DB_POOL env var, or to 2 connections if it is not set. We also set reaping_frequency, which is available in Rails 4 and covered in depth in the “Bad connections” section. You can now change the connection pool size by setting a config var on Heroku. For instance, to set it to 10 you could run:

$ heroku config:set DB_POOL=10

This doesn’t mean that each dyno will now have 10 open connections, only that new connections will be created as needed until a maximum of 10 are in use per Rails process.

Even if you have enough connections in your pool, your database may have a maximum number of connections that it will allow.

Maximum database connections

Heroku provides managed Postgres databases. Different tiered databases have different connection limits. The Starter Tier “Dev” and “Basic” databases are limited to 20 connections. Production Tier databases (plans Crane and up) have higher limits. Once your database has the maximum number of active connections, it will no longer accept new connections. This will result in connection timeouts from your application and will likely cause exceptions.

When scaling out, it is important to keep in mind how many active connections your application needs. If each dyno allows 5 database connections and your database is limited to 20, you can only scale out to four dynos before you need to provision a more robust database.
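The scaling limit above is a simple division, sketched here with a hypothetical helper named max_dynos. The numbers are the ones used in this article; substitute the limit of your own database plan:

```ruby
# Hypothetical helper: how many dynos fit within a database connection limit.
def max_dynos(connection_limit:, connections_per_dyno:)
  unless connections_per_dyno.positive?
    raise ArgumentError, 'connections_per_dyno must be positive'
  end
  connection_limit / connections_per_dyno # integer division rounds down
end

max_dynos(connection_limit: 20, connections_per_dyno: 5) # => 4
max_dynos(connection_limit: 20, connections_per_dyno: 3) # => 6
```

Rounding down matters: with 3 connections per dyno, a seventh dyno would need 21 connections and exceed a limit of 20.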

Now that you know how to configure your connection pool and how to determine how many connections your database can handle, you need to calculate the number of connections each dyno will need.

Calculating required connections

Assuming that you are not manually creating threads in your application code, you can use your web server settings to guide the number of connections that you need. The Unicorn web server scales out using multiple processes; if you aren’t opening any new threads in your application, each process will take up 1 connection. So if your Unicorn config file has worker_processes set to 3 like this:

worker_processes 3

Then your app will use 3 connections for workers, which means each dyno will require 3 connections. If you’re on a “Dev” plan, you can scale out to 6 dynos, which will mean 18 active database connections out of a maximum of 20. However, it is possible for a connection to get into a bad or unknown state. Because of this, we recommend setting the pool of your application to either 1 or 2 to keep zombie connections from saturating your database. See the “Bad connections” section below.

Another web server, Puma, gets concurrency using threads (16 by default). This means it would require 16 connections in the pool to operate without exceptions. It’s likely that your dyno isn’t taking full advantage of all 16 threads, so with tuning you could find an optimal number and specify it in your Procfile. If you wanted Puma to use at most 5 threads, and therefore at most 5 connections, you can pass -t 0:5 (minimum 0, maximum 5 threads) like this:

web: bundle exec puma -t 0:5 -p $PORT -e ${RACK_ENV:-development}

Every application will have different performance characteristics and different requirements. To properly tune the number of threads for your app you will need to load test your app in a production-like or staging environment.

Number of active connections

In development you can see the number of connections taken up by your application by checking the database.

$ bundle exec rails dbconsole

This will open a connection to your development database. On Postgres 9.1 or earlier, you can then see the number of connections to your database by running:

select count(*) from pg_stat_activity where procpid <> pg_backend_pid() and usename = current_user;

On Postgres 9.2 and later the command is:

select count(*) from pg_stat_activity where pid <> pg_backend_pid() and usename = current_user;

This returns the number of connections on that database:

 count
-------
   5
(1 row)

Since connections are opened lazily, you’ll need to hit your running application at localhost several times until the count stops going up. To get an accurate count, you should run that query against a production database, since your development setup may not allow you to generate the load required for your app to create new connections.

Bad connections

It is possible for connections to hang or be placed in a “bad” state. This means that the connection will be unusable but remain open. If you are running a multi-process web server such as Unicorn, this could mean that over time a 3-worker dyno, which normally consumes 3 database connections, could be holding as many as 15 connections (5 default connections per pool times 3 workers). To limit this threat, lower the connection pool to 1 or 2 and enable connection reaping, which is available in Rails 4.

$ heroku config:set DB_POOL=2

Make sure the connection setup code shown earlier (the initializer or the Unicorn after_fork block) is in your app.

The Active Record reaper is only available in Rails 4 and above.

config = ActiveRecord::Base.configurations[Rails.env] ||
                Rails.application.config.database_configuration[Rails.env]
config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
config['pool']              = ENV['DB_POOL'] || 5
ActiveRecord::Base.establish_connection(config)

Here 'reaping_frequency' tells Active Record to check for hung or dead connections every 10 seconds. While it is likely that your application will accumulate a few hung connections over time, if something in your code is causing them, the reaper will not be a permanent fix to the problem.
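The 3 versus 15 figure quoted earlier in this section follows from multiplying workers by pool size. A minimal sketch, assuming one connection per worker in normal operation and a hypothetical helper name connections_per_dyno:

```ruby
# Hypothetical helper: typical and worst-case connections held by one dyno
# running a multi-process server, assuming each worker normally uses one
# connection but may hold up to pool_size if connections go bad.
def connections_per_dyno(workers:, pool_size:)
  { typical: workers, worst_case: workers * pool_size }
end

default_pool = connections_per_dyno(workers: 3, pool_size: 5)
small_pool   = connections_per_dyno(workers: 3, pool_size: 2)

puts "pool of 5: up to #{default_pool[:worst_case]} connections"
puts "pool of 2: up to #{small_pool[:worst_case]} connections"
```

Lowering the pool from 5 to 2 cuts the worst case for a 3-worker dyno from 15 connections to 6, which is why a small pool is recommended for multi-process servers.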

Limit connections with PgBouncer

You can continue to scale out your applications with additional dynos until you have reached your database connection limits. Before you reach this point it is recommended to limit the number of connections required by each dyno by using the PgBouncer buildpack.

PgBouncer maintains a pool of connections that your database transactions share. This keeps otherwise open and idle connections to Postgres to a minimum. However, transaction pooling prevents you from using named prepared statements, session advisory locks, listen/notify, or other features that operate on a session level. See the PgBouncer buildpack FAQ for the full list of limitations.

For many frameworks, you must disable prepared statements in order to use PgBouncer. Then set your app to use a custom buildpack that will call other buildpacks.

Do not continue before disabling prepared statements, or verifying that your framework is not using them. Rails 3+ uses prepared statements.
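One possible way to disable prepared statements in a Rails app is to add a setting to the same connection settings hash used for the pool. This is a sketch under the assumption that your Rails version's PostgreSQL adapter honors a prepared_statements connection setting, which you should verify for your version. The helper name without_prepared_statements and the sample settings are hypothetical:

```ruby
# Hypothetical helper: return a copy of the connection settings with
# prepared statements turned off, leaving the original hash untouched.
# Assumes the adapter reads a 'prepared_statements' key; check your version.
def without_prepared_statements(config)
  config.merge('prepared_statements' => false)
end

settings = { 'adapter' => 'postgresql', 'pool' => 5 } # sample values only
pgbouncer_settings = without_prepared_statements(settings)

# In a Rails app, the result would then be passed to
# ActiveRecord::Base.establish_connection(pgbouncer_settings).
```

Returning a copy rather than mutating the hash keeps the original settings available if you need to connect without PgBouncer.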

$ heroku config:add BUILDPACK_URL=https://github.com/ddollar/heroku-buildpack-multi.git

This buildpack will run other buildpacks by looking in the .buildpacks file, and running each buildpack listed in order. So first we will add the PgBouncer buildpack:

$ echo "https://github.com/gregburek/heroku-buildpack-pgbouncer.git#v0.2.2" >> .buildpacks

Next we need to ensure your application can run so you need to add your language specific buildpack. If you are using Ruby it would be:

$ echo "https://github.com/heroku/heroku-buildpack-ruby.git" >> .buildpacks

The final file should look like this:

$ cat .buildpacks
https://github.com/gregburek/heroku-buildpack-pgbouncer.git#v0.2.2
https://github.com/heroku/heroku-buildpack-ruby.git

Now you must modify your Procfile to start PgBouncer. In your Procfile, add the command bin/start-pgbouncer-stunnel to the beginning of your web entry. So if your Procfile was:

web: bundle exec puma -C config/puma.rb

It will now be:

web: bin/start-pgbouncer-stunnel bundle exec puma -C config/puma.rb

Commit the results to git, test on a staging app, and then deploy to production.

When deploying you should see this in the output:

=====> Detected Framework: pgbouncer-stunnel