Scheduled Jobs with Custom Clock Processes in Python with APScheduler
Last updated 12 February 2016
The ability to schedule background jobs is a requirement for most modern web apps. These jobs might be user-oriented, like sending emails; administrative, like taking backups or synchronizing data; or even a more integral part of the app itself.
On a single-server deployment, a system-level tool like cron is the obvious choice for this kind of scheduling. However, when deploying to a cloud platform like Heroku, something higher-level is required, because instances of the application run in a distributed environment where machine-local tools are not useful.
The Heroku Scheduler add-on is a fantastic solution for simple tasks that need to run at 10-minute, hourly, or daily intervals (or multiples of those intervals). But what about tasks that need to run every 5 minutes or 37 minutes, or those that need to run at a very specific time? For these more specialized use cases, running your own scheduling process can be very useful.
There are a few Python scheduling libraries to choose from. Celery is an extremely robust asynchronous task queue and message system that supports scheduled tasks.
For this example, we’re going to use APScheduler, a lightweight, in-process task scheduler. It provides a clean, easy-to-use scheduling API, has few dependencies, and is not tied to any specific job queuing system.
Install APScheduler easily with pip:
$ pip install apscheduler
And make sure to add it to your app’s declared dependencies so that it is installed on deploy.
Next you’ll need to author the file to define your schedule. The APScheduler Documentation has a lot of great examples that show the flexibility of the library.
Here’s a simple clock.py example file:
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

@sched.scheduled_job('interval', minutes=3)
def timed_job():
    print('This job is run every three minutes.')

@sched.scheduled_job('cron', day_of_week='mon-fri', hour=17)
def scheduled_job():
    print('This job is run every weekday at 5pm.')

sched.start()
Here we’ve configured APScheduler to queue background jobs in two different ways. The first decorator schedules an interval job every 3 minutes, counted from the time the clock process is launched. The second schedules a cron-style job that runs once per weekday, at 5pm.
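To make the two schedules concrete, here is a rough standard-library sketch of when each job would fire. It is not APScheduler's own code: the helper names interval_fire_times and next_weekday_at are hypothetical, time zones are ignored, and it assumes an interval job first fires one full interval after start-up (worth confirming against the APScheduler documentation for your version).

```python
from datetime import datetime, timedelta


def interval_fire_times(started_at, minutes, count):
    """Return the first `count` fire times of an interval job.

    Assumes the first run happens one full interval after start-up.
    """
    step = timedelta(minutes=minutes)
    return [started_at + step * n for n in range(1, count + 1)]


def next_weekday_at(now, hour):
    """Return the next Monday-to-Friday occurrence of `hour`:00 after `now`."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Step forward one day at a time until the candidate is in the future
    # and falls on a weekday (Monday is 0, Friday is 4).
    while candidate <= now or candidate.weekday() > 4:
        candidate += timedelta(days=1)
    return candidate


# Hypothetical start-up time: Friday 12 February 2016, 16:58.
started = datetime(2016, 2, 12, 16, 58)
print(interval_fire_times(started, minutes=3, count=3))
print(next_weekday_at(started, hour=17))
# Started after 5pm on a Friday: the weekday job waits until Monday.
print(next_weekday_at(datetime(2016, 2, 12, 17, 30), hour=17))
```

The last call shows the practical consequence of the weekday rule: a clock process launched after 5pm on a Friday would not run the second job until the following Monday.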
While this is a trivial example, it’s important to note that no work should be done in the clock process itself for reasons already covered in the clock processes article. Instead schedule a background job that will perform the actual work invoked from the clock process.
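The separation between scheduling and doing the work can be sketched as follows. This is only an illustration under stated assumptions: the standard-library queue and the worker thread stand in for an external job queue and a separate worker process, and the names send_report, enqueue_report, and the ops@example.com address are hypothetical.

```python
import queue
import threading

# Stand-in for an external job queue. In a real deployment this would be a
# separate system consumed by worker processes, not an in-process object.
jobs = queue.Queue()
results = []


def send_report(recipient):
    """Hypothetical slow task; this is the real work, run by the worker."""
    results.append(f"report sent to {recipient}")


def enqueue_report():
    """What the clock process would schedule: it only enqueues and returns."""
    jobs.put((send_report, ("ops@example.com",)))


def worker():
    """Pull jobs off the queue and run them until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:
            break
        func, args = item
        func(*args)


worker_thread = threading.Thread(target=worker)
worker_thread.start()

enqueue_report()  # the scheduler would call this on each tick
jobs.put(None)    # tell the worker to stop so the example terminates
worker_thread.join()

print(results)
```

In clock.py, a function like enqueue_report would be the one registered with the scheduler, so each tick returns almost immediately and the clock process stays free to fire the next job on time.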
Clock process type
Finally, you’ll need to define a process type in the Procfile. In this example we’ll call the process clock, so the Procfile should look something like this:
clock: python clock.py
Commit the Procfile and clock.py changes, then redeploy your application with git push heroku master.
The final step is to scale up the clock process. This is a singleton process, meaning you should never run more than one of these processes. If you run two, the work will be duplicated.
$ heroku ps:scale clock=1
You should see output similar to the following in your Heroku logs:
2012-05-30T20:59:38+00:00 heroku[clock.1]: State changed from created to starting
2012-05-30T20:59:38+00:00 heroku[api]: Scale to clock=1, web=3 by firstname.lastname@example.org
2012-05-30T20:59:40+00:00 heroku[clock.1]: Starting process with command `python clock.py`
2012-05-30T20:59:41+00:00 heroku[clock.1]: State changed from starting to up
2012-05-30T20:59:48+00:00 app[clock.1]: Starting clock for 1 events: [ Queueing interval job ]
2012-05-30T20:59:48+00:00 app[clock.1]: Queuing scheduled jobs
Now you have a custom clock process up and running. Check out the APScheduler Documentation for more info.