This add-on is operated by Librato
Heroku Monitoring | Application Performance Monitoring | Root Cause Analysis.
Last updated 30 September 2015
Librato is a complete solution for monitoring and analyzing the metrics that impact your business at all levels of the stack. It provides everything you need to visualize, analyze, and actively alert on the metrics that matter to you. Librato integrates seamlessly with Heroku to provide detailed, out-of-the-box information about your application’s performance, per-dyno system resources, and Postgres database. But that’s just the beginning!
With drop-in support for Rails 3.x/Rack, JVM-based applications, and other languages you’ll additionally have custom application metrics streaming into Librato in minutes. From there you can build tailored charts/dashboards, annotate them with one-time events, and set threshold-based alerts. Collaboration is supported through multi-user access, private dashboard links, PNG chart snapshots, and seamless integration with popular third-party services like PagerDuty, Slack, and HipChat. Additionally Librato is first-and-foremost a platform whose complete capabilities are programmatically accessible via a RESTful API with bindings available for a growing host of languages including Ruby, Python, Java, Go, Clojure, Node.js, etc.
Provisioning the add-on
Librato can be attached to a Heroku application via the CLI:
$ heroku addons:create librato
-----> Adding librato to sharp-mountain-4005... done, v18 (free)
A list of all plans available can be found here.
After successful provisioning, the native log-based metrics described below will automatically begin flowing into your Librato account and into a Librato Spaces dashboard we’ve pre-configured to provide a basic overview of your application and database performance. It may take 2-3 minutes for data to begin showing up. You can go to the Librato interface now to check things out or read on to learn more!
Native log-based metrics
After provisioning the add-on, Librato immediately begins collecting native Heroku metrics directly from your application’s log stream.
Router metrics are present in every Heroku application log stream and provide detailed information (including timing) about every request serviced by your application. As these details are sourced directly from the Heroku routing layer itself, they are the only true measure of performance as experienced by your customers, accounting for any delays introduced by Heroku in addition to your application’s processing. Librato calculates a rich set of summary statistics (described below in custom distributions) including median, 95th percentile, and 99th percentile latency that provide unprecedented insight into your application’s performance. The metrics (described in the linked Heroku router docs) available in Librato currently are
router.status metrics capture all 200, 300, 400 and 500-level status codes from the Heroku Router for your applications. Paid plans have access to an HTTP Status Codes dashboard containing all of the status code metrics stacked in a single chart, and also broken out into individual charts for easier visibility. Development plans include the single stacked chart on their Overview space.
Postgres metrics are present in the log stream of any Heroku application with a provisioned Heroku Postgres database. These metrics are summary statistics gathered by the Heroku Postgres service and are reported directly into Librato under their given names.
Per-dyno runtime metrics are available to any Heroku application, but must be enabled by the user, as they are disabled by default. They provide insight into both memory usage and CPU load of each dyno and are reported under the dyno’s logical role, e.g. worker.2. You can enable them with the following commands:
$ heroku labs:enable log-runtime-metrics
Enabling log-runtime-metrics for ... done
$ heroku restart
Once enabled, your per-dyno runtime metrics will automatically begin showing up in your Librato account. These metrics are summary statistics gathered by the Heroku Runtime layer and are reported directly into Librato under their given names.
Custom log-based metrics
In addition to seamlessly extracting native metrics provided by Heroku, Librato’s log integration enables you to publish your own custom metrics directly to Librato through your Heroku log stream. Simply write your measurements to standard output (stdout) using one of the formats specified below and Librato will automatically detect and collect your custom metrics. The examples below are in Ruby syntax but can be replicated in any language that permits writing to standard output.
Count log lines are used to submit increments to Librato. You can submit increments as frequently as desired, and every minute the current total will be flushed to Librato and reset to zero. For example, the following combination would result in a value of 5 being recorded for the enclosing minute in a Librato metric named user.clicks:
$stdout.puts("count#user.clicks=1")
$stdout.puts("count#user.clicks=1")
$stdout.puts("count#user.clicks=3")
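To see how the three lines above add up, here is a minimal sketch that wraps the count format in a helper and totals the emitted values for one minute. The helper names count_line and minute_total are hypothetical, and the local summing only illustrates the behavior described in the text; the real aggregation happens inside Librato.

```ruby
# Hypothetical helper: formats one count line in the "count#name=value" form shown above.
def count_line(name, by = 1)
  "count##{name}=#{by}"
end

# Illustration only: sums the values of matching count lines, mimicking the
# per-minute total described in the text.
def minute_total(lines, name)
  prefix = "count##{name}="
  lines.select { |line| line.start_with?(prefix) }
       .sum { |line| line.delete_prefix(prefix).to_i }
end

lines = [
  count_line("user.clicks"),
  count_line("user.clicks"),
  count_line("user.clicks", 3)
]
lines.each { |line| $stdout.puts(line) }

total = minute_total(lines, "user.clicks")
$stdout.puts("total for the minute: #{total}")
```

Keeping the format in one helper avoids typos in the prefix, which would otherwise silently create a differently named metric.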
Measure log lines are used to submit individual measurements that comprise a statistical distribution. The most common use case is timings, i.e. latency measurements, but measures can also represent non-temporal distributions such as counts. You can submit as many measures as you’d like (typically they are submitted per-request) and every minute Librato will calculate and record a complete set of summary statistics over the measures submitted in that interval:
- min - The smallest value.
- max - The largest value.
- average - The average of all values.
- median - The value that falls precisely in the middle of all values. Considered a good indicator of typical application performance.
- 95th percentile - The value that is larger than 95% of all values.
- 99th percentile - The value that is larger than 99% of all values.
- sum - The total of all values.
- count - The total number of values submitted.
Measures are submitted in a similar fashion to counts:
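A minimal sketch of a measure line, assuming the same key=value shape as the count format with the measure# prefix. The helper names measure_line and time_ms, the stand-in database call, and the choice to report elapsed milliseconds as a bare number are illustrative assumptions.

```ruby
# Hypothetical helper: formats one measure line, assuming the "measure#name=value" form.
def measure_line(name, value)
  "measure##{name}=#{value}"
end

# Times a block and returns [block result, elapsed milliseconds].
def time_ms
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  [result, elapsed.round(1)]
end

# Stand-in for a real database call.
_rows, ms = time_ms do
  sleep(0.01)
  []
end

$stdout.puts(measure_line("database.query", ms))
```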
Note that measures create 4 separate metrics in Librato based on the submitted measure name. Using the example above, you would find:
- database.query - A complex metric that contains the min/mean/max/sum/count values. You can switch between them using Librato’s summary statistics support.
- database.query.median - The median value.
- database.query.perc95 - The 95th percentile value.
- database.query.perc99 - The 99th percentile value.
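To make the statistics behind these metrics concrete, this sketch computes them locally for a small batch of values. The summarize helper and the nearest-rank percentile method are assumptions chosen for illustration; this document does not state which percentile method Librato uses.

```ruby
# Nearest-rank percentile over a sorted array (an assumed method, for illustration).
def percentile(sorted, pct)
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank, 1].max - 1]
end

# Computes the summary statistics named in the text for one batch of values.
def summarize(values)
  sorted = values.sort
  {
    min: sorted.first,
    max: sorted.last,
    average: values.sum.to_f / values.size,
    median: percentile(sorted, 50),
    perc95: percentile(sorted, 95),
    perc99: percentile(sorted, 99),
    sum: values.sum,
    count: values.size
  }
end

# Ten request timings with one slow outlier.
stats = summarize([12, 15, 11, 240, 14, 13, 16, 12, 18, 17])
stats.each { |name, value| $stdout.puts("#{name}: #{value}") }
```

In this batch the single outlier pulls the average well above the median, which is why the median is described as the better indicator of typical performance and the high percentiles as the place where slow requests show up.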
Custom sample metrics
Sample metrics are used to convey simple key/numerical value pairs when you are already calculating some kind of summary statistic in your app and merely need a simple transport mechanism to Librato. Typically you would submit sample metrics on some periodic tick and set said period on the metric in Librato. This example would create a metric in Librato with the name
Librato supports an additional dimension to the metric name called the source. This enables you to report the same metric associated with multiple different entities and intelligently share a single set of visualizations, alerts, etc. For example our support for Heroku’s native runtime metrics uses the dyno’s logical role as the source to differentiate between dynos. You associate a measurement with a source by adding a source key=value pair to the log line:
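A minimal sketch of a sample line with a source attached, assuming the sample# prefix mirrors the count format and that source is appended as a space-separated key=value pair. The metric name jobs.queue_depth, the source value, and the helper sample_line are hypothetical.

```ruby
# Hypothetical helper: builds "sample#name=value" and optionally appends "source=...".
def sample_line(name, value, source: nil)
  line = "sample##{name}=#{value}"
  line += " source=#{source}" if source
  line
end

# Without a source the measurement belongs to the metric as a whole.
$stdout.puts(sample_line("jobs.queue_depth", 42))

# With a source the same metric name is tracked as a separate stream per entity.
$stdout.puts(sample_line("jobs.queue_depth", 42, source: "worker.2"))
```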
Note that each source value used in a metric creates a unique data stream defined by (metric, source) that counts against your current plan’s capacity for custom metrics as described below.
Being able to correlate regressions with code deployments is immensely helpful, so we’re also tracking your Heroku app releases as Librato annotations. Release annotations have been added to all charts on the Overview dashboard. Paid accounts can add them to any chart using the releases annotation name.
Heroku customers on our Bronze plan and higher can send an unlimited number of custom annotations using the same log format as our custom metrics, but with the event keyword. For example, to record an annotation when your code raises an exception:
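A minimal sketch of such an annotation, assuming the event#name="title" shape used in the fuller example further down. The event name exception, the helper event_line, and the simulated error are hypothetical.

```ruby
# Hypothetical helper: formats an annotation line as event#<name>="<title>".
# Double quotes in the title are replaced so the quoted value stays well-formed.
def event_line(name, title)
  safe_title = title.tr('"', "'")
  "event##{name}=\"#{safe_title}\""
end

begin
  # Stand-in for real application code that fails.
  raise ArgumentError, "invalid plan id"
rescue StandardError => e
  $stdout.puts(event_line("exception", "#{e.class}: #{e.message}"))
end
```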
We also support optional fields such as description, link, start_time, end_time, and source, all of which appear in the example below. Links are in the format of link#<rel>=<url>, where the rel describes the relationship of the link. Custom annotations support multiple links per event; however, each link’s relationship must be unique within a single annotation event.
$stdout.puts('event#signup="New User" description="John Doe signed up via web" link#metrics="https://metrics.librato.com/dashboards/foo" start_time=1234567890 end_time=9876543210 source=us-east')
Configuring additional integrations
Once Librato has been added, settings for LIBRATO_USER and LIBRATO_TOKEN will be available in the app configuration and will contain the credentials needed to authenticate any integrations to the Librato API. This can be confirmed using the heroku config:get command.
$ heroku config:get LIBRATO_USER
email@example.com
After installing Librato you will need to explicitly set a value for LIBRATO_SOURCE in the app configuration. LIBRATO_SOURCE informs the Librato service that the metrics coming from each of your dynos belong to the same application.
$ heroku config:set LIBRATO_SOURCE=myappname
The value of LIBRATO_SOURCE must be composed of characters in the set A-Za-z0-9.:-_ and be no more than 255 characters long. You should use a permanent name, as changing it in the future will cause your historical metrics to become disjoint.
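These constraints can be checked before deploying. The following is a minimal sketch: valid_librato_source? and missing_librato_settings are hypothetical helpers, not part of any Librato library, and the check encodes only the character set and length limit stated above.

```ruby
# Allowed per the text: A-Z, a-z, 0-9, '.', ':', '-', '_'; at most 255 characters.
def valid_librato_source?(value)
  pattern = /\A[A-Za-z0-9.:\-_]{1,255}\z/
  !value.nil? && pattern.match?(value)
end

# Returns the names of the settings that are absent or empty in the given environment.
def missing_librato_settings(env = ENV)
  %w[LIBRATO_USER LIBRATO_TOKEN LIBRATO_SOURCE].select { |key| env[key].to_s.empty? }
end

$stdout.puts(valid_librato_source?("myappname")) # accepted
$stdout.puts(valid_librato_source?("my app"))    # rejected: contains a space

missing = missing_librato_settings
$stdout.puts("missing settings: #{missing.join(', ')}") unless missing.empty?
```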
Using with Ruby
Ruby is currently supported as either Rails 3 or Rack applications. For other Ruby environments please contact us through one of the methods described below in the Support section.
Rails 3 Installation
Verify that the LIBRATO_USER, LIBRATO_TOKEN, and LIBRATO_SOURCE variables are set. Ruby-on-Rails applications need to add the following entry into their Gemfile specifying the Librato client library.
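As a sketch, the entry is presumably a single line naming the librato-rails gem discussed in this section. Leaving out a version constraint is an assumption; pin a version if your project requires one.

```ruby
# Gemfile fragment (assumed form): adds the librato-rails client library.
gem 'librato-rails'
```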
Then update application dependencies with bundler.
$ bundle install
Finally re-deploy your application.
$ git commit -a -m "add librato-rails instrumentation"
$ git push heroku master
The source code and detailed documentation for librato-rails are available on GitHub.
Automatic Rails Instrumentation
After installing the librato-rails gem and deploying your app you will see a number of metrics appear automatically in your Librato account. These are powered by ActiveSupport::Notifications and track request performance, SQL queries, mail handling, etc.
Built-in performance metric names will start with either rack or rails, depending on the level they are being sampled from. For example:
rails.request.total is the total number of requests Rails has received each minute.
Support for optionally disabling automatic instrumentation in librato-rails is currently under development. In the interim as a workaround you can instead use librato-rack (installation described below) to access the same custom instrumentation primitives without any automatic Rails metrics.
Rack Installation
Verify that the LIBRATO_USER, LIBRATO_TOKEN, and LIBRATO_SOURCE variables are set. Rack applications need to add the following entry into their Gemfile specifying the Librato client library.
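As a sketch, the entry is presumably a single line naming the librato-rack gem that the rackup file below requires. Leaving out a version constraint is an assumption.

```ruby
# Gemfile fragment (assumed form): adds the librato-rack client library.
gem 'librato-rack'
```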
Then update application dependencies with bundler.
$ bundle install
Then in your rackup file (or equivalent), require and add the middleware:
require 'librato-rack'
use Librato::Rack
Finally re-deploy your application.
$ git commit -a -m "add librato-rack instrumentation"
$ git push heroku master
The source code and detailed documentation for librato-rack are available on GitHub.
Once you’ve installed Librato in either your Rails 3 or Rack application, you can immediately and easily start adding your own custom instrumentation. There are four simple instrumentation primitives available:
Use increment for tracking a running total of something across requests. Examples:
# increment the 'sales_completed' metric by one
Librato.increment 'sales_completed'
# increment by five
Librato.increment 'items_purchased', :by => 5
# increment with a custom source
Librato.increment 'user.purchases', :source => user.id
Other things you might track this way: user signups, requests of a certain type or to a certain route, total jobs queued or processed, emails sent or received.
increment is primarily used for tracking the rate of occurrence of some event. Given this, increment metrics are continuous by default, i.e. after being called on a metric once they will report on every interval, reporting zeros for any interval when increment was not called on the metric.
Especially with custom sources you may want the opposite behavior, i.e. reporting a measurement only during intervals where increment was called on the metric:
# report a value for 'user.uploaded_file' only during non-zero intervals
Librato.increment 'user.uploaded_file', :source => user.id, :sporadic => true
Use measure when you want to track an average value per-request. Examples:
Librato.measure 'user.social_graph.nodes', 212
Like Librato.measure, timing is per-request, but specialized for timing information:
Librato.timing 'twitter.lookup.time', 21.2
The block form auto-submits the time it took for its contents to execute as the measurement value:
Librato.timing 'twitter.lookup.time' do
  @twitter = Twitter.lookup(user)
end
There is also a grouping helper, to make managing nested metrics easier. So this:
Librato.measure 'memcached.gets', 20
Librato.measure 'memcached.sets', 2
Librato.measure 'memcached.hits', 18
Can also be written as:
Librato.group 'memcached' do |g|
  g.measure 'gets', 20
  g.measure 'sets', 2
  g.measure 'hits', 18
end
Symbols can be used interchangeably with strings for metric names.
Troubleshooting with Ruby
Check the logs for messages such as this
[librato-rails] halting: source must be provided in configuration.
The librato-rails and librato-rack gems support multiple logging levels that are useful in diagnosing any issues with reporting metrics to Librato. These are controlled by the LIBRATO_LOG_LEVEL config:
$ heroku config:set LIBRATO_LOG_LEVEL=debug
Set your log level to debug to log detailed information about the settings the gem is seeing at startup and when it is submitting metrics back to the Librato service.
If you are having an issue with a specific metric, setting a log level of trace additionally logs the exact measurements being sent along with lots of other information about instrumentation execution.
Neither of these modes is recommended long-term in production, as they will add quite a bit of volume to your log stream and will slow operation somewhat. Note that submission I/O is non-blocking and the submission times logged are total time; your process will continue to handle requests during submissions.
For more information on the features available within the Librato interface please see the Librato knowledgebase.
The Librato interface allows you to build custom dashboards, set threshold-based alerts, rapidly detect and diagnose performance regressions in your production infrastructure, gain a deeper, shared understanding of your business across your team, and much more.
The interface can be accessed via the CLI:
$ heroku addons:open librato
Opening librato for sharp-mountain-4005...
or by visiting the Heroku apps web interface and selecting the application in question. Select Librato from the Add-ons menu.
With Librato alerts, we’ll notify you when one of your key metrics exceeds a threshold or stops reporting. You can be notified via email, PagerDuty, Slack, or a number of other possible services.
To get you started, your Add-On account comes preconfigured with two alerts (which are disabled by default), designed to help you stay on top of problematic code deploys:
- Your application returns errors (HTTP 5xx response codes) for more than 1% of requests
- Your application’s response time averages over 1 second
These alerts are configured to email the Heroku account owner. To customize and enable these alerts, first visit the alerts page. Then, to enable an alert, click on the alert name or on the edit button. On the page that opens, click on enabled, then save.
To learn more about Librato alerts, visit our knowledge base article. Note that alerts are only available to users on a paid Librato Add-On plan.
It may take 2-3 minutes for the first results to show up in your Librato account after you have deployed your app and the first request has been received.
Note that if Heroku idles your application, measurements will not be sent until it receives another request and is restarted. If you see intermittent gaps in your measurements during periods of low traffic this is the most likely cause.
For troubleshooting instructions more specific to your particular platform, please see our polyglot documentation provided above. The documentation for each supported platform ends with a troubleshooting subsection titled in the form Troubleshooting with ….
Picking a plan
The development plan is free and intended to provide you with some basic, but extremely useful information about your application running on Heroku. It includes all the native router/postgres metrics and presents them to you in a single dashboard that displays the last hour of data. To access native runtime metrics, record custom metrics, configure alerting, access more than the last hour’s worth of data, etc., you’ll need to upgrade to a paid plan.
Depending on your intended usage level there are several plans to choose from. Each plan includes the native router and postgres metrics and offers an increasingly larger capacity for other metrics. This capacity is consumed by both native runtime metrics (which scale with the number of dynos) and custom metrics. So when choosing a plan you need to account for the number of custom metrics you intend to track and (if you’ve enabled runtime metrics) the number of dynos.
Per-dyno runtime metrics are available to any Heroku application and are currently disabled by default. If you enable runtime metrics (which we highly recommend), each dyno adds 9 metrics.
To pick the right plan, multiply the number of dynos by 9, add the number of custom metrics and compare the result with the plan’s metric allocation. For example if you are running an app with 25 dynos and you plan on tracking 50 custom metrics your total metric count is:
25 x 9 = 225 (dyno Runtime metrics) + 50 Custom metrics = 275 metrics. In this case you would pick the Gold plan that has an allocation of 300 metrics.
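The same arithmetic can be written as a small helper for trying other dyno counts. This is a minimal sketch; required_metrics is a hypothetical name, and the 9 metrics per dyno and the 300-metric Gold allocation are the figures quoted above.

```ruby
# Total metric streams: runtime metrics scale with dynos, custom metrics are added on top.
# per_dyno defaults to the 9 runtime metrics per dyno stated in the text.
def required_metrics(dynos:, custom_metrics:, runtime_metrics_enabled: true, per_dyno: 9)
  runtime = runtime_metrics_enabled ? dynos * per_dyno : 0
  runtime + custom_metrics
end

total = required_metrics(dynos: 25, custom_metrics: 50)
$stdout.puts("metrics needed: #{total}")
$stdout.puts("fits a 300-metric allocation: #{total <= 300}")
```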
Custom metrics are typically aggregated across dynos in an application and can be directly decremented from the plan’s capacity when estimating. Do keep in mind that some primitives (e.g. the measure# supported in our logging integration) may report different parts of a distribution as multiple metrics.
The only other component that some advanced users may need to consider is submitting metrics at a resolution higher than the default of 60s. All metrics submitted through our logging integration are at 60s resolution, so this section only applies if you are directly submitting metrics to Librato’s API from your application code at an interval of less than 60s. Note that the nickel plan doesn’t support sub-60s reporting, so you must be on bronze or higher. If these conditions apply, we normalize your resolution to 60s to calculate the capacity consumed. For example, 10s resolution metrics consume metric capacity at a rate of 2.5x that of 60s resolution metrics, so the Bronze plan, which includes 50 metrics at 60s resolution, accommodates 20 metrics at 10s resolution.
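The normalization can be sketched as a weighted sum. The helper capacity_used is hypothetical; the 2.5 weight for 10s metrics and the 50-metric Bronze allocation are the figures quoted above, and weights for other resolutions are not given in this document.

```ruby
# Capacity consumed, normalized to 60s-resolution metrics.
# weight_10s defaults to the 2.5x rate stated in the text for 10s-resolution metrics.
def capacity_used(metrics_60s:, metrics_10s:, weight_10s: 2.5)
  metrics_60s + metrics_10s * weight_10s
end

# Twenty 10s-resolution metrics alone use the whole Bronze allocation of 50.
$stdout.puts(capacity_used(metrics_60s: 0, metrics_10s: 20))

# A mixed workload: 25 metrics at 60s plus 10 metrics at 10s also totals 50.
$stdout.puts(capacity_used(metrics_60s: 25, metrics_10s: 10))
```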
Migrating between plans
As long as the plan you are migrating to includes enough allocated measurements for your usage, you can migrate between plans at any time without any interruption to your metrics.
Use the heroku addons:upgrade command to migrate to a new plan.
$ heroku addons:upgrade librato:gold-10
-----> Upgrading librato:gold-10 to sharp-mountain-4005... done, v18 ($49/mo)
Your plan has been updated to: librato:gold-10
Removing the add-on
Librato can be removed via the CLI.
This will destroy all associated data and cannot be undone!
$ heroku addons:destroy librato
-----> Removing librato from sharp-mountain-4005... done, v20 (free)
Before removing Librato, data can be exported through the Librato API.
Terms of Service
By installing the Librato Add-on, you agree to Librato’s Software Service Agreement.
Support
All Librato support and runtime issues should be submitted via one of the Heroku Support channels. Any non-support related issues or product feedback for Librato is welcome via email, live chat, or the support forum.