Using Celery on Heroku

Last Updated: 07 May 2015

Celery is a framework for performing asynchronous tasks in your application. Celery is written in Python and makes it very easy to offload work out of the synchronous request lifecycle of a web app onto a pool of task workers to perform jobs asynchronously.

Celery is fully supported on Heroku and just requires using one of our add-on providers to implement the message broker and result store.

Architecture

You deploy Celery by running one or more worker processes. These processes connect to the message broker and listen for job requests. The message broker distributes job requests at random to the listening workers. To originate a job request, you invoke code in the Celery library, which takes care of marshaling the job's arguments and publishing the request to the broker. Once a worker has completed a task, it can optionally record the outcome in the result store.

You can flexibly scale your worker pool by simply running more worker processes which connect to the broker. Each worker can execute jobs in parallel with every other worker.

Choosing a broker

Heroku supports lots of great choices for your Celery broker via add-ons provided by our partner companies. Generally speaking, the broker engines with the best support within Celery are Redis and RabbitMQ. Others, including Amazon SQS, IronMQ, MongoDB, and CouchDB, are also supported, though some features may be missing when using these brokers. The way Celery abstracts away the specifics of broker implementations makes changing brokers relatively easy, so if at some later point you decide a different broker better suits your needs, feel free to switch.

Once you’ve chosen a broker, create your Heroku app and attach the add-on to it:

$ heroku apps:create
$ heroku addons:create redisgreen

Celery ships with a library to talk to RabbitMQ, but for any other broker, you’ll need to install its library. For example, when using Redis:

$ pip install redis

Choosing a result store

If you need Celery to be able to store the results of tasks, you’ll need to choose a result store. If not, skip to the next section. Characteristics that make a good message broker do not necessarily make a good result store! For instance, while RabbitMQ is the best supported message broker, it should never be used as a result store since it will drop results after being asked for them once. Both Redis and Memcache are good candidates for result stores.

If you choose the same service for your message broker and your result store, you only need to attach one add-on. If not, make sure the result store add-on is attached as well.

Creating a Celery app

First, make sure Celery itself is installed:

$ pip install celery

Then create a Python module for your Celery app. For small apps, it’s common to put everything together in a module named tasks.py:

from celery import Celery

app = Celery('example')

Defining tasks

Now that you have a Celery app, you need to tell the app what it can do. The basic unit of code in Celery is the task. This is just a Python function that you register with Celery so that it can be invoked asynchronously. Celery has various decorators which define tasks, and a couple of method calls for invoking those tasks. Add a simple task to your tasks.py module:

@app.task
def add(x, y):
    return x + y

You’ve now created your first Celery task! But before it can be run by a worker, a bit of configuration must be done.

Configuring a Celery app

Celery has many configuration options, but to get up and running you only need to worry about a couple:

  • BROKER_URL: The URL that tells Celery how to connect to the message broker. (This will commonly be supplied by the add-on chosen to be the broker.)
  • CELERY_RESULT_BACKEND: A URL in the same format as BROKER_URL that tells Celery how to connect to the result store. (Ignore this setting if you choose not to store results.)

Heroku add-ons provide your application with environment variables which can be passed to your Celery app. For example:

import os
app.conf.update(BROKER_URL=os.environ['REDISGREEN_URL'],
                CELERY_RESULT_BACKEND=os.environ['REDISGREEN_URL'])

Your Celery app now knows to use your chosen broker and result store for all of the tasks you define in it.

Running locally

Before running Celery workers locally, you’ll need to install the services you’ve chosen for your message broker and result store. Once installed, ensure both are up and running. Then create a Procfile which Foreman can use to launch a worker process. Your Procfile should look something like:

worker: celery worker --app=tasks.app

Now add a file named .env to tell Foreman which environment variables to set. For example, if using Redis with its default configuration for both the message broker and result store, simply add an environment variable with the same name as the one provided by the add-on:

REDISGREEN_URL=redis://

To start your worker process, run Foreman:

$ foreman start

Then, in a Python shell, run:

import tasks
tasks.add.delay(1, 2)

delay tells Celery that you want to run the task asynchronously rather than in the current process.

You should see the worker process log that it has received and executed the task.

Deploying on Heroku

If you already created a Procfile above and attached the appropriate add-ons for the message broker and result store, all that’s left to do is push and scale your app:

$ git push heroku master
$ heroku ps:scale worker=1

Of course, at any time you can scale to any number of worker dynos. Now run a task just like you did locally:

$ heroku run python

>>> import tasks
>>> tasks.add.delay(1, 2)

You should see the task running in the application logs:

$ heroku logs -t -p worker

Celery and Django

The Celery documentation provides a good overview of how to integrate Celery with Django. Though the concepts are the same, Django’s reusable app architecture lends itself well to a module of tasks per Django app. You may want to leverage the INSTALLED_APPS setting in concert with Celery’s ability to autodiscover tasks:

from django.conf import settings
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

Celery will then look for a tasks.py file in each of your Django apps and add all tasks found to its registry.

Celery best practices

Managing asynchronous work

Celery tasks run asynchronously, which means that the Celery function call in the calling process returns immediately after the message request to perform the task is sent to the broker. There are two ways to get results back from your tasks. One way is just to write the results of the task into some persistent storage like your database.

The other way to get results back from a Celery task is to use the result store. The Celery result store is optional. Oftentimes it is sufficient to issue tasks and let them run asynchronously and store their results to the database. In that case you can safely ignore the result as saved by Celery.

Choosing a serializer

Everything that gets passed into a Celery task and everything that comes out as a result must be serialized and deserialized. Serialization brings with it a set of problems that are good to keep in mind as you design your Celery tasks.

By default, Celery uses pickle to serialize messages. Pickle has the benefit of “just working” out of the box, but it can cause many problems down the road: when code changes, in-flight objects serialized in their old form can break when deserialized.

Changing Celery to use JSON not only forces developers into good practices for task arguments; it also removes the security risk that pickle’s arbitrary code execution can pose. To use JSON as the default task serializer, set it in your app’s configuration:

app.conf.update(CELERY_TASK_SERIALIZER='json')

Small, short-lived tasks

In a Celery worker pool, multiple workers will be working on any number of tasks concurrently. Because of this, it makes sense to think about task design much like that of multithreaded applications. Each task should do the smallest useful amount of work possible so that the work can be distributed as efficiently as possible.

Because the overhead of serializing a task, sending it over the network to the message broker, delivering it to a worker, and deserializing it there is much higher than passing a message on an in-process queue between threads, the bar for “useful” is of course higher as well. Issuing a task that does nothing but write a single row to a database is probably not the best use of resources. Making one API call rather than several in the same task, though, can make a big difference.

Short-lived tasks make deploys and restarts easier. When stopping a dyno, Heroku will kill processes a short amount of time after asking them to stop. This could happen as a result of a restart, scale down, or deploy. As a result, tasks which finish quickly will fail less often than long running ones.

Idempotent tasks

Celery tasks can fail or be interrupted for a variety of reasons. Rather than trying to anticipate everything that could possibly go wrong in a distributed system, you should strive for idempotence when designing tasks. Try never to assume the current state of the system when a task begins. Change as little external state as possible.

This is more of an art than a science, but the more tasks are re-runable without negative side-effects, the easier distributed systems can self-heal. For example, when an unexpected error occurs, idempotent tasks can simply tell Celery to retry the task. If the error was transient, idempotence allows the system to self-heal without human intervention.

Using acks_late

When Celery workers receive a task from the message broker, they send an acknowledgement back. Brokers usually react to an ack by removing the task from their queue. However, if a worker dies in the middle of a task and has already acknowledged it, the task may not get run again. Celery tries to mitigate this risk by sending unfinished tasks back to the broker on a soft shutdown, but sometimes it’s unable to do so, due to a network failure, total hardware failure, or any number of other scenarios.

Celery can be configured to only ack tasks after they have completed (succeeded or failed). This feature is extremely useful when losing the occasional task is not tolerable. However, it requires the task to be idempotent (the previous attempt may have progressed part of the way through) and short-lived (brokers will generally “reserve” a task for a fixed period of time before reintroducing it to the queue). The acks_late option can be enabled per task or globally for the Celery app.

Custom task classes

Even though a task may look like a function, Celery’s task decorator actually returns a class which implements __call__. (This is why tasks can be bound, giving them access to self, and why methods like delay and apply_async are available on them.) The decorator is a great shortcut, but sometimes a group of tasks shares common concerns which can’t be easily expressed purely functionally.

By creating an abstract subclass of celery.Task, you can build a suite of tools and behaviors needed by other tasks using inheritance. Common uses for task subclasses are rate limiting and retry behavior, setup or teardown work, or even a common group of configuration settings shared by only a subset of an app’s tasks.

Canvas primitives

Celery’s documentation outlines a set of primitives which can be used to combine tasks into larger workflows. Familiarize yourself with these primitives. They provide ways to accomplish significant and complex work without sacrificing the design principles outlined above. Particularly useful is the chord, in which a group of tasks are executed in parallel and their results passed to another task upon completion. These, of course, require a result store.