Scheduled Jobs and Custom Clock Processes
Last updated 13 August 2015
Many apps need to run jobs at scheduled times, for example polling a remote API every 5 minutes or sending email reports every night at midnight. Cron has historically filled this role, but it is ill-suited to horizontally scalable environments such as Heroku. A more powerful, flexible solution is a clock process.
See the simple job scheduling section for less demanding use-cases.
A clock process uses the process model to run a lightweight singleton process that wakes up at a specified interval and schedules the work to be performed. When paired with background workers it forms a very clean and extensible approach to job scheduling.
Scheduling a job and executing a job are two related but independent tasks. Separating a job’s execution from its scheduling ensures the responsibilities of each component are clearly defined and results in a more structured and manageable system.
Use a job scheduler only to queue background work and not to perform it. Background workers then receive the work to be executed out of process from the scheduler.
Appropriately decoupled scheduling and execution components will be able to scale independently. To avoid duplicate work being scheduled, or the need for complicated locking logic, job scheduling should be performed by a single component. The background workers, however, can be scaled to meet demand.
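The split between scheduling and execution can be sketched in a few lines of Node.js. This is a minimal illustration, not a production design: `JobQueue`, `scheduleJob` and `drainQueue` are hypothetical names, and the in-memory array stands in for a shared queue service that the clock and worker processes would both connect to.

```javascript
// A minimal in-memory stand-in for a real job queue (hypothetical; separate
// processes would need a shared queue service to communicate).
class JobQueue {
  constructor() {
    this.jobs = [];
  }
  enqueue(job) {
    this.jobs.push(job);
  }
  dequeue() {
    return this.jobs.shift();
  }
  get length() {
    return this.jobs.length;
  }
}

// Scheduler side: decides what should run and queues it. It performs no work.
function scheduleJob(queue, name, payload) {
  queue.enqueue({ name, payload, queuedAt: Date.now() });
}

// Worker side: pulls queued jobs and executes them with the matching handler.
function drainQueue(queue, handlers) {
  const results = [];
  let job;
  while ((job = queue.dequeue()) !== undefined) {
    const handler = handlers[job.name];
    if (!handler) throw new Error(`No handler for job "${job.name}"`);
    results.push(handler(job.payload));
  }
  return results;
}
```

Because the scheduler only ever calls `scheduleJob`, adding more workers changes how fast the queue drains without changing what gets scheduled.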
Simple job scheduling
Apps that have very basic scheduling needs such as executing a task on a very coarse-grained interval (daily, hourly or every 10 minutes) can use the Scheduler add-on. It is a free tool that suffices for simple recurring jobs.
However, for applications that demand the ability to define a much more specific execution interval (three times a day, every two hours or even every 5 seconds) it is necessary to implement a custom clock process to schedule jobs.
Custom clock processes
Beyond the ability to specify a custom schedule, a clock process has the additional benefit of being defined as part of the process model, consistent with other logical application components. This simplifies testing, reduces external dependencies and increases environment parity between development and production.
Defining custom clock processes
Custom clock implementations vary greatly by language. However, as part of an application’s process model, defining a clock process is very simple. Here is the Procfile for a typical Node.js application.
web: node web.js
worker: node worker.js
clock: node clock.js
Conceptually, the contents of clock.js are immaterial. What is important is that the clock.js process is responsible only for determining what jobs to run at what interval and for scheduling those jobs to be run in the background.
The background worker defined in worker.js is then responsible for receiving and immediately executing the scheduled work.
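As one possible shape for clock.js, the sketch below keeps the timing decision in a pure function and leaves the actual queueing to a caller-supplied `enqueue` callback. The schedule entries, the function names (`dueJobs`, `tick`, `startClock`) and the one-second tick are all assumptions made for illustration; a real clock would normally use a scheduling library and a queue client.

```javascript
// Hypothetical schedule: job names and intervals are illustrative only.
const SCHEDULE = [
  { name: 'poll-remote-api', everyMs: 5 * 60 * 1000 },
  { name: 'send-report', everyMs: 2 * 60 * 60 * 1000 },
];

// Return the names of jobs whose interval has elapsed since they were last
// queued. lastRun maps job name -> timestamp (ms); unseen jobs are always due.
function dueJobs(schedule, lastRun, now) {
  return schedule
    .filter((job) => now - (lastRun[job.name] ?? -Infinity) >= job.everyMs)
    .map((job) => job.name);
}

// One clock tick: queue every due job and record when it was queued.
// The clock never executes the job itself.
function tick(schedule, lastRun, now, enqueue) {
  for (const name of dueJobs(schedule, lastRun, now)) {
    enqueue(name);
    lastRun[name] = now;
  }
}

// Start the clock loop. Not invoked here, so loading this file has no
// side effects; call startClock(enqueue) from the process entry point.
function startClock(enqueue, tickMs = 1000) {
  const lastRun = {};
  return setInterval(() => tick(SCHEDULE, lastRun, Date.now(), enqueue), tickMs);
}
```

Keeping `dueJobs` free of timers makes the schedule easy to test with fixed timestamps, which fits the testing benefit described above.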
Clock processes on Heroku
As previously mentioned, the clock component should be a singleton process to avoid scheduling duplicate jobs and the need for complicated locking logic. Once deployed to Heroku, simply scale the clock process to a single dyno.
$ heroku ps:scale clock=1
Scaling 'clock' processes... done, now running 1
The worker process may need additional dynos if the scheduled jobs represent a material increase in processing. At the very least, one worker dyno will need to be running to receive and execute scheduled jobs.
Since dynos are restarted at least once a day, some logic will need to exist on startup of the clock process to ensure that a job interval wasn’t skipped during the dyno restart.
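One way to approach this startup check is to compare the current time against a stored timestamp of the last scheduled run. The sketch below assumes that timestamp is persisted somewhere that survives a restart (a database or cache, for instance); `missedRuns` and `catchUp` are hypothetical helpers, and queueing the job once rather than once per missed interval is a design choice that will not suit every job.

```javascript
// Return the scheduled run times (ms timestamps) that fell after lastRunAt
// and at or before now, for a job that runs every intervalMs.
function missedRuns(lastRunAt, intervalMs, now) {
  if (intervalMs <= 0) throw new RangeError('intervalMs must be positive');
  const missed = [];
  for (let t = lastRunAt + intervalMs; t <= now; t += intervalMs) {
    missed.push(t);
  }
  return missed;
}

// On clock startup: if at least one run was missed while the process was
// down, queue the job once. Returns how many intervals were missed.
function catchUp(lastRunAt, intervalMs, now, enqueue) {
  const missed = missedRuns(lastRunAt, intervalMs, now);
  if (missed.length > 0) enqueue();
  return missed.length;
}
```

For example, a job that runs every 10 minutes and last ran 35 minutes before startup has missed three intervals, and `catchUp` would queue it a single time.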
There are many libraries and services that allow you to implement scheduled jobs in your applications. Some concrete examples of scheduled job implementations in various languages include: