Platform Updates, Maintenance and Notifications
Last updated 09 September 2015
Occasionally, Heroku needs to conduct maintenance on the platform that might cause customer-visible changes or require certain features to be disabled temporarily.
The expected impact of the work governs the way we provide notifications. This document describes the types of platform work that we carry out and how each type affects the notifications displayed on the status site.
Types of work
Work falls into one of these categories:
- Routine updates
- Service updates
- Maintenance windows
- Urgent updates
Routine updates are normal deployments that won’t cause any impact to stable production apps or to the development tools. They happen with no impact to the functionality of the platform or customer applications because of the redundancy that is designed into the platform. The same features of the platform that manage uptime and reliability for customer apps allow us to make most changes and improvements to the system without interrupting day-to-day operations. Because of this, we don’t provide notice of this work.
Service updates are work that will interrupt the functionality of deployment workflows and tools. These changes affect the availability of the deployment workflow or tools, but they won’t affect apps that are already running. They may require the API to be in maintenance mode or may interrupt builds in progress. In some cases, they may also prevent unidling of single-dyno apps.
When we need to perform a service update, we will put a notice on the status site at least 3 business days before the work takes place. This announcement will include the scheduled time of the work and the expected impact. We will update the status site again to indicate when the work has begun and when it has ended. If any changes to the work are required, we will update the status site accordingly.
Maintenance windows are changes to the platform that will affect stable production applications. Work of this type is rare, and we do everything we can to avoid it. When we do need to do it, we take care to schedule it outside peak hours for the region in which it will be performed.
When we need a maintenance window, we will put a notice on the status site at least 5 business days in advance. The announcement will include the scheduled time and expected impact. We will update the status site again to indicate when the work has begun and when it has ended. If any changes to the work are required, we will update the status site accordingly.
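The notice periods above (at least 3 business days for service updates, at least 5 for maintenance windows) can be illustrated with a small date calculation. This is only a sketch: it assumes a business day means Monday through Friday and ignores holidays, which the policy does not define, and the names `MIN_NOTICE_BUSINESS_DAYS` and `latest_notice_date` are hypothetical, not part of any Heroku tool or API.

```python
from datetime import date, timedelta

# Minimum advance notice per category, in business days (from the policy text).
# Routine updates get no notice; urgent updates get notice only when possible.
MIN_NOTICE_BUSINESS_DAYS = {
    "service_update": 3,
    "maintenance_window": 5,
}


def latest_notice_date(work_date: date, category: str) -> date:
    """Return the latest date on which a notice can be posted and still
    satisfy the minimum lead time for the given category.

    Assumption: business days are Monday to Friday; holidays are ignored.
    """
    remaining = MIN_NOTICE_BUSINESS_DAYS[category]
    current = work_date
    while remaining > 0:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


if __name__ == "__main__":
    work = date(2015, 9, 9)  # a Wednesday
    print(latest_notice_date(work, "service_update"))
    print(latest_notice_date(work, "maintenance_window"))
```

Because the policy says "at least", a notice may be posted earlier than the date this sketch returns; the function only marks the latest date that still meets the minimum.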
Urgent updates are changes to the platform that may fall under any of the above categories but must happen quickly in order to respond to a problem that could affect the health of the platform or the integrity of customer data. Urgent updates should be rare and are, by their nature, difficult to categorize. One example is a response to a security issue such as Heartbleed.
When we need to perform an urgent update that might cause development or production impact, we will take into account the possible impact, the time of day, and the risks associated with delays when we select a time to do the work. If possible, we will provide advance warning on the status site. In all cases, we will use the status site to communicate what we’re doing, what impact is possible, and when we’re finished.
Notifications
When we need to perform work that will have a development or production impact, we will use the status site to notify you in advance. This will create an alert message across the top of the status site, generate a tweet, and send an email to subscribers.
When we begin work, we will update the status site again. This will create an incident in the timeline portion of the status site (which will be colored blue), and it will generate a tweet and notify subscribers via email and SMS.
If the work is going to take longer than we planned or if we need to provide updates on our progress, we’ll update the status site the way we do during an incident. When we’re finished, we’ll resolve the incident. Both of these actions will create the same notifications that you usually see during an issue (tweet and email).
If something goes wrong during the work, we will change the status site to show a regular issue. This will generate the usual notifications (tweet, email, and SMS). At that point, we will handle the notifications the way we do for any other issue with the platform.
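The notification behavior described above can be summarized as a lookup from each stage of the work to the channels it triggers. The following is a minimal sketch of that summary, not a description of how the status site is implemented; the `Stage` enum, the `CHANNELS` table, and the channel labels are all hypothetical names chosen for illustration.

```python
from enum import Enum


class Stage(Enum):
    """Stages of planned work, as described in the text (names are illustrative)."""
    ADVANCE_NOTICE = "advance_notice"
    WORK_BEGINS = "work_begins"
    PROGRESS_UPDATE = "progress_update"
    WORK_RESOLVED = "work_resolved"
    PROBLEM = "problem"


# Channels triggered at each stage, taken from the paragraphs above.
CHANNELS = {
    Stage.ADVANCE_NOTICE: {"status_banner", "tweet", "email"},
    Stage.WORK_BEGINS: {"timeline_incident", "tweet", "email", "sms"},
    Stage.PROGRESS_UPDATE: {"tweet", "email"},
    Stage.WORK_RESOLVED: {"tweet", "email"},
    Stage.PROBLEM: {"tweet", "email", "sms"},
}


def stages_using(channel: str) -> list[str]:
    """Return the names of the stages that notify through the given channel."""
    return sorted(s.value for s, used in CHANNELS.items() if channel in used)


if __name__ == "__main__":
    print(stages_using("sms"))
    print(stages_using("email"))
```

Read this way, SMS is reserved for the moment work begins and for problems during the work, while tweets and email accompany every stage.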