This add-on is operated by Adamlogic, LLC
Queue-based auto scaling for Rails, Sidekiq, Django, Celery, Node, and more.
Judoscale Universal Autoscaler
Last updated June 07, 2024
Table of Contents
- What is autoscaling?
- How does autoscaling work?
- How is Judoscale different from Heroku’s native autoscaling?
- What languages and frameworks does Judoscale support?
- Provisioning Judoscale
- Using the Judoscale UI
- Installing the adapter library for Ruby, Python, and Node applications
- Using Judoscale without an adapter library (for PHP, Java, etc.)
- Enabling and configuring your autoscaling
- Autoscaling worker dynos (Sidekiq, Celery, etc.)
- Changing plans and uninstalling
- Support
Judoscale is an add-on for Heroku, designed to automate the scaling of your app’s servers.
What is autoscaling?
Autoscaling dynamically adjusts the number or size of servers based on certain metrics. Engineering teams utilize this feature to maintain application performance while optimizing costs.
In Heroku, the unit of scaling is a dyno. Judoscale offers horizontal autoscaling, automatically fine-tuning the number of web and worker dynos for your Heroku apps.
How does autoscaling work?
An autoscaler monitors specific metrics and triggers autoscaling events when these metrics cross predefined thresholds. Judoscale can collect metrics through framework-specific libraries or directly from your application’s log stream.
How is Judoscale different from Heroku’s native autoscaling?
While Heroku provides native autoscaling for performance dynos, Judoscale addresses its limitations. Here’s a comparison:
Feature | Judoscale | Heroku |
---|---|---|
Supports standard dynos | ✅ | ❌ |
Web metric | request queue time or response time | response time only |
Worker metric | queue latency or queue depth | ❌ (no worker autoscaling) |
Autoscale response | 30 seconds | 3 minutes |
Custom schedule | ✅ | ❌ |
Autoscale controls | dyno jumps, autoscale frequency, & more | ❌ |
Cost | varies based on scale | free |
What languages and frameworks does Judoscale support?
Any Heroku web app can use our response time autoscaling without a code change. For queue-based autoscaling, you’ll need to install an adapter library. We support multiple languages and frameworks.
- Ruby: Rails, Rack, Sidekiq, Solid Queue, Resque, Delayed Job, Good Job, Que, Shoryuken
- Python: Django, Flask, Celery, RQ
- Node.js: Express, Fastify
- PHP, Java, .NET, Go: No adapter yet; response time autoscaling only.
Provisioning Judoscale
Reference the Judoscale Elements Page for a list of available plans.
Install Judoscale using the Heroku CLI.
$ heroku addons:create judoscale:white
Creating judoscale on sharp-mountain-4005... free
Your add-on has been provisioned successfully
Autoscaling is turned off by default. You can turn it on using the Judoscale UI.
Using the Judoscale UI
Access the UI via the CLI or Heroku Dashboard.
$ heroku addons:open judoscale
Opening judoscale for sharp-mountain-4005
The Team Dashboard provides an overview of your apps, their processes, and autoscaling status.
Installing the adapter library for Ruby, Python, and Node applications
From the Team Dashboard, click “Set up autoscaling”. In the adapter installation modal, select your language, web framework, and job backend if you’re using one.
Based on these selections, Judoscale will walk you through how to install the adapter library for your stack.
Using Judoscale without an adapter library (for PHP, Java, etc.)
If we haven’t built an adapter library for your stack yet, you can still use Judoscale for your web dynos! After you’ve provided your stack info, the adapter installation modal will prompt you to enable response time monitoring.
This adds a log drain to your app so that Judoscale can parse the response time from your router logs. Once your response time is monitored, you’ll see it show up in the scaling charts in Judoscale, and you can turn on autoscaling when you’re ready.
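To illustrate the idea, the sketch below pulls the response time out of a single router log line. It assumes the standard Heroku router log format, where `service=` carries the time the dyno spent handling the request. The sample line and the `response_time_ms` helper are hypothetical, and this is not necessarily how Judoscale parses your logs.

```python
import re
from typing import Optional

# Heroku router log lines are made of key=value pairs such as service=18ms.
SERVICE_RE = re.compile(r"\bservice=(\d+)ms\b")


def response_time_ms(router_line: str) -> Optional[int]:
    """Return the service time in milliseconds, or None if the line has none."""
    match = SERVICE_RE.search(router_line)
    return int(match.group(1)) if match else None


# Hypothetical router log line, for illustration only.
sample = (
    'at=info method=GET path="/" host=example.herokuapp.com '
    'request_id=abc123 fwd="203.0.113.7" dyno=web.1 '
    'connect=1ms service=18ms status=200 bytes=1548 protocol=https'
)

print(response_time_ms(sample))
```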
Enabling and configuring your autoscaling
You can roll with the default settings and turn on Autoscaling using the toggle, but it’s a good idea to scroll down the page and familiarize yourself with the autoscale settings.
Queue time range (or response time range)
Your queue time range (or response time range when you’re using our response time autoscaling) is how Judoscale determines when to scale your application.
Read our deep dive into request queue time to learn more.
With adequate dyno capacity, most apps should be running with request queue times well under 25ms. Our default queue time range of 25-50ms means that any spike over 50ms will cause an upscale (more dynos), and we’ll scale you back down when request queue time has settled below 25ms.
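The default range can be read as a simple three-way rule. The sketch below is a simplified illustration with a hypothetical `scaling_decision` function. It looks at a single measurement, whereas a real autoscaler would also consider how long the value has stayed outside the range.

```python
def scaling_decision(queue_time_ms: float,
                     low_ms: float = 25,
                     high_ms: float = 50) -> str:
    """Map a request queue time onto the default 25-50ms range."""
    if queue_time_ms > high_ms:
        return "upscale"    # above the range: add dynos
    if queue_time_ms < low_ms:
        return "downscale"  # below the range: remove dynos
    return "hold"           # inside the range: leave the dyno count alone


for value in (10, 30, 80):
    print(value, scaling_decision(value))
```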
Dyno Range
Dyno range determines how far Judoscale scales your application. For example, a dyno range of 2–9 dynos could autoscale up to 9 dynos, but no further. Likewise, Judoscale would never autoscale this app below 2 dynos.
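In other words, the dyno range acts as a clamp on whatever dyno count autoscaling would otherwise choose. A minimal sketch of that idea, using a hypothetical `clamp_to_dyno_range` helper and the 2 to 9 range from the example:

```python
def clamp_to_dyno_range(desired: int, min_dynos: int = 2, max_dynos: int = 9) -> int:
    """Keep an autoscaled dyno count inside the configured range."""
    return max(min_dynos, min(desired, max_dynos))


print(clamp_to_dyno_range(12))  # capped at the top of the range
print(clamp_to_dyno_range(1))   # raised to the bottom of the range
print(clamp_to_dyno_range(5))   # already inside the range
```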
Regardless of your dyno range setting, you can always manually scale your app outside that range using the Heroku CLI or the Resources page in your Heroku dashboard.
Applications with multiple running dynos will be more redundant against failure, but autoscaling with Judoscale allows your app to safely run a single dyno. Judoscale will detect performance problems with a single dyno and immediately trigger autoscaling to add more dynos.
Autoscale sensitivity
Use the sensitivity sliders to control the “curve” of your autoscaling:
- Upscale Jumps allows you to scale up by more than one dyno at a time. This is useful if your app experiences sudden increases in traffic that can’t be mitigated by scaling up by a single dyno.
- Upscale Frequency controls how frequently Judoscale will continue scaling up if queue time remains above your specified threshold. Note that the first upscale event is always immediate when the threshold is breached.
- Downscale Delay controls how frequently Judoscale will downscale your app. This also controls how long Judoscale waits before the first downscale event.
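To see how the three settings interact, here is a small simulation. It is only a sketch under stated assumptions: the `Sensitivity` class, the `simulate` function, the time units, and the choice to downscale one dyno at a time are all hypothetical and are not taken from Judoscale's implementation.

```python
from dataclasses import dataclass


@dataclass
class Sensitivity:
    upscale_jump: int = 1          # dynos added per upscale event
    upscale_frequency_s: int = 60  # wait between repeated upscales
    downscale_delay_s: int = 180   # wait before each downscale


def simulate(samples, dynos, settings, low_ms=25, high_ms=50,
             min_dynos=1, max_dynos=10):
    """Replay (time_s, queue_time_ms) samples and return (time_s, dynos) pairs."""
    last_upscale = None  # time of the latest upscale in the current breach
    calm_since = None    # start of the current wait before a downscale
    history = []
    for t, queue_ms in samples:
        if queue_ms > high_ms:
            calm_since = None
            # The first upscale of a breach is immediate; later ones are spaced out.
            if last_upscale is None or t - last_upscale >= settings.upscale_frequency_s:
                dynos = min(max_dynos, dynos + settings.upscale_jump)
                last_upscale = t
        else:
            last_upscale = None  # breach over: the next one upscales immediately
            if queue_ms < low_ms:
                if calm_since is None:
                    calm_since = t
                elif t - calm_since >= settings.downscale_delay_s:
                    # Assumption: remove one dyno, then wait the delay again.
                    dynos = max(min_dynos, dynos - 1)
                    calm_since = t
            else:
                calm_since = None
        history.append((t, dynos))
    return history


settings = Sensitivity(upscale_jump=2, upscale_frequency_s=60, downscale_delay_s=120)
samples = [(0, 80), (30, 80), (60, 80), (90, 10), (150, 10), (210, 10), (330, 10)]
for t, count in simulate(samples, dynos=1, settings=settings):
    print(t, count)
```

With these sample values, the app jumps by two dynos as soon as the threshold is breached, jumps again once the upscale frequency has elapsed, and then steps back down one dyno per downscale delay.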
Autoscaling worker dynos (Sidekiq, Celery, etc.)
Unlike Heroku’s native autoscaling, Judoscale can autoscale your worker dynos (your background job queues). Once you’ve installed the adapter following the instructions above, you’ll see your “job queue time” on the scaling page.
Job queue time (also called queue latency) tells you how long jobs are waiting in the queue before getting picked up for processing.
Autoscaling worker dynos works exactly the same as web dynos, but the default queue time range is different. Our default request queue time of 25–50ms wouldn’t make much sense for a job queue, so we default job queue time to 1–5 seconds.
This means we’ll scale you up when your jobs are waiting more than 5 seconds to be picked up for processing. This may be more or less aggressive than you need, so tweak it accordingly.
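As a rough illustration, queue latency can be thought of as the age of the oldest job still waiting. The sketch below assumes you have each waiting job's enqueue timestamp. The `queue_latency_s` and `worker_decision` helpers are hypothetical, and the adapters may measure this differently.

```python
from datetime import datetime, timedelta, timezone


def queue_latency_s(enqueued_at_times, now):
    """Seconds the oldest waiting job has spent in the queue (0.0 if empty)."""
    if not enqueued_at_times:
        return 0.0
    return (now - min(enqueued_at_times)).total_seconds()


def worker_decision(latency_s, low_s=1.0, high_s=5.0):
    """Apply the default 1-5 second job queue time range."""
    if latency_s > high_s:
        return "upscale"
    if latency_s < low_s:
        return "downscale"
    return "hold"


now = datetime(2024, 6, 7, 12, 0, 0, tzinfo=timezone.utc)
waiting = [now - timedelta(seconds=8), now - timedelta(seconds=2)]

latency = queue_latency_s(waiting, now)
print(latency, worker_decision(latency))
```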
For worker dynos, you also need to specify which queues to monitor for autoscaling.
Only the selected queues are reflected in the queue time chart, and only those metrics will trigger autoscaling. If you have multiple worker processes handling dedicated queues, these checkboxes are how you identify which queues are being handled by each process.
Some adapters may not support collecting job queue time, in which case Judoscale will fall back to collecting job queue depth: the number of jobs waiting to be processed on each queue. In that case, you can configure a target queue depth range, i.e. the minimum and maximum number of waiting jobs that Judoscale should use as its autoscaling thresholds.
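A depth-based range works the same way as a time-based one, only the unit changes. The sketch below is an illustration with hypothetical names and numbers: the queue names, the 10 to 100 job range, and the choice to use the deepest monitored queue are assumptions, not Judoscale defaults.

```python
def monitored_depth(depth_by_queue, monitored):
    """Depth of the deepest monitored queue, ignoring unselected queues."""
    return max(
        (depth for name, depth in depth_by_queue.items() if name in monitored),
        default=0,
    )


def depth_decision(depth, low=10, high=100):
    """Apply a target queue depth range (hypothetical 10-100 jobs)."""
    if depth > high:
        return "upscale"
    if depth < low:
        return "downscale"
    return "hold"


depths = {"default": 120, "mailers": 3, "low": 500}
selected = {"default", "mailers"}  # "low" is not monitored, so it is ignored

depth = monitored_depth(depths, selected)
print(depth, depth_decision(depth))
```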
Changing plans and uninstalling
To change plans, use heroku addons:upgrade. To remove Judoscale, use heroku addons:destroy.
Support
Email us at help@judoscale.com for the fastest support. You can also reach out to Heroku support.