Scheduled Jobs with Custom Clock Processes in Python with APScheduler
Last updated 07 August 2014
The ability to schedule background jobs is a requirement for most modern web apps. These jobs might be user-oriented, like sending emails; administrative, like taking backups or synchronizing data; or even a more integral part of the app itself.
On a single-server deployment, a system-level tool like cron is the obvious choice to accomplish this kind of scheduling. However, when deploying to a cloud platform like Heroku, something higher level is required, since instances of the application will be running in a distributed environment where machine-local tools are not useful.
The Heroku Scheduler add-on is a fantastic solution for simple tasks that need to run at 10-minute, hourly, or daily intervals (or multiples of those intervals). But what about tasks that need to run every 5 minutes or every 37 minutes, or those that need to run at a very specific time? For these more specialized use cases, running your own scheduling process can be very useful.
If you have questions about Python on Heroku, consider discussing it in the Python on Heroku forums. Both Heroku and community-based Python experts are available.
There are a few Python scheduling libraries to choose from. Celery is an extremely robust asynchronous task queue and message system that supports scheduled tasks.
For this example, we’re going to use APScheduler, a lightweight, in-process task scheduler. It provides a clean, easy-to-use scheduling API, has only a few small dependencies, and is not tied to any specific job queuing system.
Install APScheduler easily with pip:
$ pip install apscheduler
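And make sure to add it to your app’s dependency file (requirements.txt in a standard pip setup) so that it is installed on deploy.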
Next you’ll need to author the file to define your schedule. The APScheduler Documentation has a lot of great examples that show the flexibility of the library.
Here’s a simple clock.py example file:
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

@sched.scheduled_job('interval', minutes=3)
def timed_job():
    print('This job is run every three minutes.')

@sched.scheduled_job('cron', day_of_week='mon-fri', hour=17)
def scheduled_job():
    print('This job is run every weekday at 5pm.')

sched.start()
Here we’ve configured APScheduler to run jobs in two different ways. The first directive schedules an interval job every 3 minutes, counted from the time the clock process is launched. The second schedules a job once per weekday, at 5pm.
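To make the two schedules concrete, the sketch below computes upcoming run times using only the standard library. It illustrates the timing rules and is not APScheduler's implementation: the function names interval_runs and next_weekday_run are hypothetical, times are naive datetimes, and it assumes the first interval run falls one full interval after launch (which we understand to be APScheduler's default when no start date is given).

```python
from datetime import datetime, timedelta


def interval_runs(launched_at, minutes, count):
    """Return the first `count` run times of an interval job.

    Assumption: the first run happens one full interval after launch.
    """
    step = timedelta(minutes=minutes)
    return [launched_at + step * (i + 1) for i in range(count)]


def next_weekday_run(now, hour):
    """Return the next Monday-to-Friday occurrence of `hour`:00 strictly after `now`."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Step forward one day at a time until the candidate is in the future
    # and falls on a weekday (Monday is 0, Friday is 4).
    while candidate <= now or candidate.weekday() > 4:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    launched = datetime(2014, 8, 8, 16, 58)  # a Friday afternoon
    print(interval_runs(launched, minutes=3, count=2))
    print(next_weekday_run(launched, hour=17))
```

Under these assumptions, a clock process launched just before 5pm on a Friday fires the weekday job that same day, while one launched at or after 5pm waits until Monday.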
While the clock.py example is trivial, it's important to note that no work should be done in the clock process itself, for reasons already covered in the clock processes article. Instead, have the clock process schedule a background job, and let that job perform the actual work.
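One way to keep the clock process free of real work is to have each scheduled function do nothing but enqueue a job. The sketch below is illustrative only: an in-memory queue.Queue stands in for a real job queue shared with worker processes, and the names enqueue_nightly_backup, run_worker_once, and register_jobs are hypothetical. The only APScheduler call it relies on is the scheduler's add_job method.

```python
import queue

# Assumption: an in-memory queue stands in for a real, broker-backed job
# queue that a separate worker process would consume from.
jobs = queue.Queue()


def enqueue_nightly_backup():
    """Runs in the clock process: only records that work is due."""
    jobs.put("nightly_backup")  # cheap, returns immediately


def run_worker_once(handlers):
    """Runs in a worker: takes one job name off the queue and executes it."""
    name = jobs.get_nowait()
    return handlers[name]()


def register_jobs(sched):
    """Attach the enqueue function to a scheduler such as BlockingScheduler."""
    # The 3am schedule is an arbitrary example value.
    sched.add_job(enqueue_nightly_backup, 'cron', hour=3)
```

In clock.py you would call register_jobs(sched) before sched.start(). With a real queue in place of the stand-in, the slow backup itself runs in a worker process, so the clock process stays responsive and never misses a tick.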
Clock process type
Finally, you’ll need to define a process type in the Procfile. In this example we’ll call the process clock, so the Procfile should look something like this:
clock: python clock.py
Commit the Procfile and clock.py changes and redeploy your application with a git push heroku master.
The final step is to scale up the clock process. This is a singleton process, meaning you should never run more than one of these processes. If you run two, the work will be duplicated.
$ heroku ps:scale clock=1
You should see similar output to the following in your Heroku logs.
2012-05-30T20:59:38+00:00 heroku[clock.1]: State changed from created to starting
2012-05-30T20:59:38+00:00 heroku[api]: Scale to clock=1, web=3 by firstname.lastname@example.org
2012-05-30T20:59:40+00:00 heroku[clock.1]: Starting process with command `python clock.py`
2012-05-30T20:59:41+00:00 heroku[clock.1]: State changed from starting to up
2012-05-30T20:59:48+00:00 app[clock.1]: Starting clock for 1 events: [ Queueing interval job ]
2012-05-30T20:59:48+00:00 app[clock.1]: Queuing scheduled jobs
Now you have a custom clock process up and running. Check out the APScheduler Documentation for more info.