FlyData

This add-on is operated by Hapyrus

Automatically backup and analyze your Heroku logs on Amazon S3 and Redshift!

Last Updated: 18 January 2014

FlyData is the log backup solution for everybody, offering a way to back up your logs indefinitely! FlyData provides a hassle-free way to keep all your logs safe without fear of deletion.

Starting 12/15/2013, each of our plans includes a “Log Transfer Bandwidth Monthly Limit to Amazon S3”.

Installation

FlyData can be installed through the add-on page or from the command line as follows:

$ heroku addons:add flydata

Afterwards, your logs will be automatically backed up to our service storage. By default, you are assigned storage space sized according to the selected plan. However, if you provide your own S3 bucket, there is no size limit regardless of plan. You can set this up on our settings page, which you reach via the Heroku add-ons page. Using your own bucket helps save on storage costs and keeps you in control of your data.

Dashboard

From our dashboard, you can download any of your logs. These logs are based on the three types of log data from the Heroku platform.

The data entries on the dashboard are statistics for your S3 uploads only. They do not reflect the statistics for uploads to Amazon Redshift.

From the dashboard, you can also access a settings page for FlyData. This is where you set up your S3 bucket integration, Amazon Redshift integration, and email alerts.

S3 integration

By default, your account uses storage that we create for you. You may continue to use this storage, but you can access it only through our interface. If you prefer, you can use your own S3 bucket instead. Note that each plan limits the monthly log transfer bandwidth to Amazon S3.

AWS configuration

To use your own bucket, first go to your AWS Console and select the Properties button on the top right.

S3Buttons

Next, select the bucket you want to use and the properties should appear on the right. Under the Permissions tab of the Properties page, you will need to click the “Add more permissions” button and type in aws@flydata.co next to the Grantee field. Also, make sure to check List, Upload/Delete, and View Permissions next to the Grantee field.

S3Permissions

Click the save button.

Heroku Add-on configuration

Open the FlyData Heroku Add-ons dashboard to specify the bucket to store logs.

$ heroku addons:open flydata

Go to the settings page located at the top of the page. Enter your bucket name where prompted and select the region in which your bucket is located.

FlydataS3Settings

Finally, click the submit button. Your logs will now be stored in your own S3 bucket.

Amazon Redshift integration

Along with backing up to an Amazon S3 bucket, we also allow storing these Heroku logs on an Amazon Redshift cluster for analysis of all of the logs.

How often we upload to your Redshift cluster depends on your plan. All plans below our Owl plan upload to Redshift once per hour, while our plans above Owl upload every 5 minutes.

For more details on Amazon Redshift, check out the Amazon Redshift site. This feature must be enabled before it can be used.

Heroku Add-on configuration

Go to our Settings page and scroll down to “Amazon Redshift setting”.

FlydataRedshiftSetting

Here you can choose to use our Amazon Redshift sandbox or your own cluster.

First, check the box to store logs to Amazon Redshift Cluster. Next, select the types of logs you want to store from those we support. See our next section for more details on that. Finally, select which cluster you want to use. If you want to use our sandbox cluster, select that option and click save.

Otherwise, click the “Use your own cluster” option and enter your information.

FlydataRedshiftSetting2

You must enter your endpoint, port number, database name, username, and password for your cluster. Once you have confirmed these details, click save.

After saving, we will automatically create three tables: heroku_raw_log, heroku_access_log, and heroku_runtime_metrics.

FlyData Redshift Sandbox

Because of Amazon Redshift’s high entry cost, we are providing free space on our own Amazon Redshift cluster. This will allow you to test out Amazon Redshift with our client. All of your logs will still be uploaded to Amazon Redshift and will be queryable. The amount of free space will be limited by your plan.

Log Support

We provide a way to upload a few types of logs to an Amazon Redshift cluster. After enabling the Amazon Redshift feature in your settings page, you will see three immediate options for the type of log you want to store.

- Raw Logs: This will enable all your log entries to be stored in a table called ‘heroku_raw_log’.
- Access Logs: This will store all of the access logs in a table called ‘heroku_access_log’.
- Runtime Metrics: This will enable runtime metrics to be stored in the ‘heroku_runtime_metrics’ table. The entries will only start showing up after you have enabled runtime metrics.

Additionally, we automatically support JSON formatted logs that are output to STDOUT. To make this simple, we have created a gem that produces a compatible JSON entry every time. First, install our gem on your Heroku application. To do this, add this to your Gemfile:

gem 'flydata'

Next, simply call this method when you want to print the JSON:

Flydata.send_to(string, hash)

or

Flydata.send_to(string, array)

For this method, string is the string containing the name of the table you want to upload to. The hash contains the entry you want to upload to the Redshift cluster table. You can also use an array of hashes to upload multiple entries at once.

Alternatively, you can use an ActiveRecord object as an entry itself like here:

user = User.find(id)
user.send_to_flydata  # Sends user data to the 'user' table on Redshift

This will print the correct JSON string, which is then automatically uploaded to your table on your Amazon Redshift cluster. Along with supporting this JSON format, we will also automatically create the table if we see that it does not already exist on the Redshift cluster. The columns will be created from the data we see in the entry, so for best results, include all your columns in your hash. Of course, you can also create the table yourself first, which lets you optimize it before you start to output these JSON entries.

A few notes:

- For timestamp, your data will need to be in “YY-MM-DD HH:mm:SS” string format
- For date, your data will need to be in “YY-MM-DD” string format
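To make the two call shapes concrete, here is a minimal sketch. It uses a hypothetical stand-in module, `FlydataSketch`, so that it runs without the gem installed. The JSON layout it prints is an assumption for illustration only and is not the gem's documented output. The table name `page_views` and its columns are also made up.

```ruby
require 'json'

# Hypothetical stand-in for the flydata gem, so this sketch runs on its own.
# Assumption: one JSON object per entry on STDOUT, keyed by the table name.
# The real gem may format its output differently.
module FlydataSketch
  # Accepts a table name plus either one hash or an array of hashes.
  # Returns the number of entries written.
  def self.send_to(table, entries)
    rows = entries.is_a?(Array) ? entries : [entries]
    rows.each { |row| $stdout.puts JSON.generate(table => row) }
    rows.size
  end
end

# A single entry: every column we want in the table is present in the hash.
FlydataSketch.send_to('page_views',
                      { 'user_id' => 42, 'path' => '/home', 'duration_ms' => 81 })

# Several entries at once, all with the same set of columns.
FlydataSketch.send_to('page_views', [
  { 'user_id' => 42, 'path' => '/about', 'duration_ms' => 12 },
  { 'user_id' => 7,  'path' => '/home',  'duration_ms' => 30 }
])
```

In a real application you would call `Flydata.send_to` from the gem with the same arguments instead of the stand-in.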

If you plan on creating your own table, you can create the table on our ‘Access Redshift’ menu page using SQL. This tool will allow you to run SQL on your Amazon Redshift cluster straight from your web browser.

Accessing Amazon Redshift

There are many tools to access your Amazon Redshift cluster. We have made one that works straight from your browser! To access this, go to the ‘Access Redshift’ page from our top menu. This option will only appear after you have enabled the Amazon Redshift Setting from the Settings page.

Here, you can enter any SQL that is compatible with Amazon Redshift (documentation). It is a great way to run SQL queries as well as create and modify tables on your Redshift cluster.

AWS Configuration

If you decide to use your own Amazon Redshift cluster, you may create a cluster from your AWS console, similar to how you created an S3 bucket. Currently, we only support Redshift clusters created in the us-east-1 region.

First go to the Amazon AWS Console and select the Properties button on the top right.

Select Redshift and, before you click “Launch Cluster”, go to security groups on the side.

FlydataRedshiftSecurity

We suggest creating a new security group but you can use “default” also. On the security group page, click on the group you want to use for your Amazon Redshift cluster. Add a new EC2 Security Group connection type from the drop down menu and enter these credentials:

AWS Account ID: 481004789880
Security group: flydata

FlydataRedshiftAddSecurity

Click the Add button. Please note that if you wish to access your Amazon Redshift cluster from a local client such as a PostgreSQL client, you will need to add an additional connection type with your own IP address via the CIDR/IP option.

Go back to the main Redshift page and click the “Create Redshift Cluster” button.

FlydataRedshiftCreate

Enter your desired master username, password, port number, and database name, then click continue.

FlydataRedshiftCredentials

Select the size you want for your cluster. Each node supports up to 2TB of storage. We recommend multiple nodes for better query performance. Once you have made your choices, click next.

FlydataRedshiftSize

Here, you will be asked for additional details, including security groups, encryption, and availability zone. We only support Redshift clusters in the us-east-1 region for now, so please make sure you are setting up the cluster there.

FlydataRedshiftAdditional

Next, review all your information and click “Launch cluster”.

Your cluster may take some time to start up (~20 minutes for a single node).

Once launched, click on the cluster to access its details. Take particular note of the endpoint, which you will need to connect the cluster to our add-on.

FlydataRedshiftShow
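Once the cluster is running and your own IP address has been allowed via the CIDR/IP option, the same details you enter in the add-on (endpoint, port, database name, username, password) can also be used from a local PostgreSQL client. The sketch below assumes the `pg` gem is installed. The helper names `redshift_params` and `query_redshift` are made up for this example, and the endpoint shown in the usage comment is a placeholder, not a real cluster.

```ruby
# Redshift listens on port 5439 by default; use whatever port you chose at launch.
REDSHIFT_DEFAULT_PORT = 5439

# Collects the cluster details into the option names the pg gem expects.
def redshift_params(endpoint:, dbname:, user:, password:, port: REDSHIFT_DEFAULT_PORT)
  { host: endpoint, port: Integer(port), dbname: dbname, user: user, password: password }
end

# Runs one SQL statement and returns the rows as an array of hashes.
def query_redshift(params, sql)
  require 'pg' # loaded lazily so the helpers above work without the gem
  conn = PG.connect(**params)
  conn.exec(sql).to_a
ensure
  conn&.close
end

# Usage (placeholder values; needs a reachable cluster):
#   params = redshift_params(
#     endpoint: 'your-cluster-endpoint.example.com',
#     dbname:   'mydb',
#     user:     'masteruser',
#     password: ENV.fetch('REDSHIFT_PASSWORD')
#   )
#   query_redshift(params, 'SELECT COUNT(*) FROM heroku_raw_log')
```

Reading the password from an environment variable, as in the usage comment, keeps it out of your source code.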

Email alerts

Email alerts are enabled automatically. You will receive an error notification email when Heroku platform errors or your application errors (Exceptions and Errors) occur.

EmailSetting

To disable these alerts, go to your FlyData page through our dashboard. Once there, go to the Settings page located at the top of the page and then un-check Email Alert under Email Setting and click save.

To enable this feature again, go to the Settings page again and make sure the box next to Email Alert is checked before saving.

Email Address Change

If you want to change the email address that these alert emails go to, you can do so here.

Alert Frequency

You can also set how frequently you receive alert emails in this menu. If email alerts are enabled, you can set how many alert emails you want to receive per hour or per day. This can reduce the number of emails received from repeat errors. By default, the setting is one email alert per day. We support up to 10 emails per hour or 24 emails per day. Emails are sent until this limit is hit.
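As a rough illustration of how such a limit behaves, the sketch below caps alerts with a simple fixed-window counter. This is an assumption made for the example. FlyData does not describe how it counts emails internally, and the `AlertLimiter` class is hypothetical.

```ruby
# Fixed-window limiter: allows at most `limit` alerts per `window_seconds`.
# Assumption: the window starts with the first alert and resets once it has elapsed.
class AlertLimiter
  def initialize(limit:, window_seconds:)
    @limit = limit
    @window_seconds = window_seconds
    @window_start = nil
    @sent = 0
  end

  # Returns true (and counts the alert) if an alert at time `now` may be sent.
  def allow?(now = Time.now)
    if @window_start.nil? || now - @window_start >= @window_seconds
      @window_start = now
      @sent = 0
    end
    return false if @sent >= @limit

    @sent += 1
    true
  end
end

# Twelve errors, one per minute, against a limit of 10 emails per hour.
hourly = AlertLimiter.new(limit: 10, window_seconds: 3600)
start = Time.at(0)
sent = (0...12).count { |i| hourly.allow?(start + i * 60) }
puts sent                          # 10
puts hourly.allow?(start + 3600)   # true (a new hour has begun)
```

With the default setting of one email per day, the equivalent would be `AlertLimiter.new(limit: 1, window_seconds: 86_400)`.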

Regular Expression Alert

FlydataEmailRegex

Alert emails will be sent out according to what is input here under the ‘Alert On’ column. The “Unless” column is for the terms that you don’t want to get email alerts for. By default, we look specifically for the strings “error”, “exception”, “critical”, and “fatal”. If you want to add other strings for us to specifically look for, click the “Add” button. The “Delete” button will delete the last entry on the list.

For example, with the default settings, we will send an alert email when we see this message:

ERROR SignalException: SIGTERM

But, you may keep getting an alert message for a message you want to ignore like this:

Rendered home/error.html within layouts/application (81.9ms)

If you want to exclude this false positive, add “error.html” to the “Unless” column, and messages containing “error.html” will no longer trigger an alert email.
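The interaction between the two columns can be sketched as follows. This assumes that each entry is treated as a case-insensitive regular expression and that an “Unless” match always wins. The document does not spell out the exact matching rules, and the `alert?` helper is hypothetical.

```ruby
# Default terms from the 'Alert On' column.
DEFAULT_ALERT_ON = %w[error exception critical fatal].freeze

# Returns true when a log line should trigger an alert email.
# Assumption: terms are case-insensitive regular expressions, and any
# match in the 'Unless' list suppresses the alert.
def alert?(line, alert_on: DEFAULT_ALERT_ON, unless_terms: [])
  matches = ->(term) { Regexp.new(term, Regexp::IGNORECASE).match?(line) }
  return false if unless_terms.any?(&matches)

  alert_on.any?(&matches)
end

rendered = 'Rendered home/error.html within layouts/application (81.9ms)'

puts alert?('ERROR SignalException: SIGTERM')           # true
puts alert?(rendered)                                   # true (false positive)
puts alert?(rendered, unless_terms: ['error.html'])     # false
```

Case-insensitive matching is what allows the default term “error” to catch the upper-case “ERROR” in the first message.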

Support

All FlyData support and runtime issues should be submitted via one of the Heroku Support channels. If you have any other questions or inquiries, feel free to email us at support@flydata.co and we will get back to you as soon as possible.