This add-on is operated by Tinkomatic, LLC
Analytics and Data Visualisation based on Apache Superset
Last updated 05 December 2017
The Duperset add-on is currently in beta.
Duperset allows you to get started with Superset in minutes and take advantage of a rich set of features focused on data exploration and visualization with charts and dashboards.
Provisioning the add-on
Duperset can be attached to a Heroku application via the CLI:
$ heroku addons:create duperset
After you provision Duperset, the DUPERSET_URL config var is available in your app’s configuration. It contains the URL of your Superset instance. You can confirm this via the heroku config:get command.
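As a minimal sketch, that check could be wrapped in a small shell function. The function name `duperset_url` and the app name passed to it are illustrative assumptions, not part of the add-on; the only CLI call used is `heroku config:get` with the `--app` flag.

```shell
# Hypothetical helper: print the Duperset URL for an app, or fail if the
# config var is empty (for example, if the add-on is not attached).
duperset_url() {
  app="$1"
  # The assignment keeps the exit status of the heroku command.
  url="$(heroku config:get DUPERSET_URL --app "$app")" || return 1
  if [ -z "$url" ]; then
    echo "DUPERSET_URL is not set for $app" >&2
    return 1
  fi
  printf '%s\n' "$url"
}
```

For example, `duperset_url sharp-mountain-4005` would print the URL stored in that app's config, or return a non-zero status if the var is missing.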
You can access Duperset via the CLI:
$ heroku addons:open duperset
Opening duperset for sharp-mountain-4005
or by visiting the Heroku Dashboard and selecting the application in question. Select Duperset from the Add-ons menu.
First principles of Superset
There are a few key concepts you need to familiarize yourself with in order to get started with Superset. After you’ve done that, Superset’s UI is largely self-explanatory.
Superset provides a tight level of access control to your data. Unless you explicitly make them available as datasources, tables and their fields in your database are not available for visualization.
Datasources can be created either by defining available tables and their associated fields, or by writing SQL queries that act as datasources.
In either case, you first need to configure the target databases you want to explore. The steps for configuring your databases are described in Connect to a target database.
After you configure your database, you can define a combination of:
- Tables you want to enable access to.
- SQL queries you can save and make available as datasources.
After you configure Superset access, you can create and edit both slices and dashboards.
- Slices are visualizations on your tables and related fields.
- Dashboards are sets of slices that can be shared.
The steps for creating slices and dashboards are described in Creating slices and dashboards.
Everything is a record
In the Superset UI (and underneath its hood), every component of your configuration is represented as a record. This includes the configuration of your databases, tables, and slices.
This unified model allows for fine-grained access control, but it can also be confusing at first. Language in the UI that says “show record” in fact means “show the details of this database configuration.”
The core metrics you define and measure as a Superset admin are also represented as records to view, edit, or delete:
Note that these are actual records that are stored in a separate configuration database that Duperset automatically provisions and maintains for your add-on instance.
Change the admin password
Your provisioned Superset instance comes with a default admin account. After you provision Duperset, your first step should be to log in and change the admin account password:
Log in with username
Navigate to your Profile:
Click Reset my password:
Reset your password and click Save:
Connect to a target database
After you set your admin password, you can connect to the databases you want to explore. Whenever possible, connect to a follower database rather than to your app’s primary production database.
Navigate to Menu > Sources > Databases:
Click the + button on the far right to configure a new database source.
The available fields are mostly self-explanatory:
One critical field to note: the value of the SQLAlchemy URI field should be the database URL specified in your Heroku config. It should have the following format:
Replace 5432 with your port for Postgres if you aren’t using the default port.
Make sure that the URL starts with postgresql and not just
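The scheme check above can be sketched in shell. This assumes the URL in your Heroku config starts with `postgres://`, which is the usual form for Heroku Postgres config vars, and that only the scheme needs to change. `to_sqlalchemy_uri` is a hypothetical helper name used for illustration.

```shell
# Hypothetical helper: rewrite a postgres:// URL to the postgresql://
# scheme, and pass any other URL through unchanged.
to_sqlalchemy_uri() {
  case "$1" in
    postgres://*) printf 'postgresql://%s\n' "${1#postgres://}" ;;
    *)            printf '%s\n' "$1" ;;
  esac
}
```

You could feed it the output of `heroku config:get` for whichever config var holds your follower's URL, then paste the result into the SQLAlchemy URI field.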
Create datasources using tables
After you’ve configured the database(s) you want to connect to, you can create datasources for your visualizations either by configuring tables or by defining queries using SQL Lab.
When using tables, you need to explicitly define which tables in your database you want to give access to, and for each table, which fields you want to give access to.
The steps for adding a new table are similar to those for adding a new database:
Navigate to Menu > Sources > Tables:
Click the + button on the far right to configure a new table source.
After creating the record, edit it to specify which fields will be available.
On the Edit Table screen, navigate to the List Columns tab:
Here, you define which fields are available to your users, along with underlying semantics to apply to them (which fields can be grouped, counted, summed, and so on).
These fields become available to your users as metrics:
At this point, you and your users have the basic blocks in place to start creating visualizations (“slices”) and dashboards.
Create datasources using SQL Lab
The alternative to defining tables as datasources is building SQL queries using SQL Lab:
Navigate to Menu > SQL Lab. The SQL editor allows you to write and run SQL queries.
Click the Visualize button when you want to make a query available as a datasource.
Here, you can select the type of visualization you want to build, change the name of the datasource, as well as define dimensions you’re making available in your visualization.
Creating slices and dashboards
To create a new visualization (“slice”):
Navigate to the Slices page and add a “new record.”
Choose a datasource and a visualization type:
Proceed to the Exploration screen to view your created slice:
A couple of things to note:
As you modify your slice, you need to click Query to see your changes reflected (unless the configuration you’re modifying is labeled with a flash sign, in which case the change takes effect immediately).
Every new slice’s title is [undefined] - untitled. When you hover over the title, a tooltip will tell you that you don’t have the rights to alter it, but you do: simply click Save and give the slice a name:
Superset’s documentation is sparse and its functionality is extensive, so we recommend experimenting with the different types of visualizations, Advanced Analytics and SQL features, and Filters.
You create dashboards from the Dashboard tab, with steps similar to those for creating slices.
For additional documentation on how to explore your data and create slices and dashboards, see the official Superset documentation.
Managing user access
Create new users by navigating to Menu > Security > List Users. Click the + button to add a new record.
Enter all required fields and specify the roles this user should have (described in the next section).
A few things to note:
- Ensure that the Active checkbox is checked if the new user should be able to access the instance.
- Every username must be unique.
- An email address is required but is currently not being used by Duperset. Duperset does not send any form of notification to a new user that their account has been created. You need to notify the user manually.
- You need to specify (and confirm) a password for your new user. You should instruct your users to immediately change their password when they log in.
Superset ships with 5 default base roles:
Although it’s possible to modify the details of Superset’s default roles, this is not recommended.
Admin - Admins have all possible rights, including granting or revoking rights from other users and altering other people’s slices and dashboards.
Alpha - These are your power-users. They have access to all data sources, but they can’t grant or revoke access from other users. Alpha users can only alter the objects that they own. They can add and alter data sources.
Gamma - Gamma users can create slices and dashboards only for data sources they have been granted access to by a user with permission to do so. They cannot alter or add data sources.
Also note that when Gamma users view existing dashboards and slices, they can see only objects that they have access to.
sql_lab - The sql_lab role grants access to SQL Lab. Note that while Admin users have access to all databases by default, both Alpha and Gamma users need to be given access on a per-database basis.
A Public role, which allows anonymous users to view dashboards, is disabled on Duperset.
To keep your Duperset instance secure, customize user access by creating new roles alongside Superset’s default roles.
For instance, you could create a Financial Analyst role for users that require access to a specific set of databases and/or tables. These users would be granted the Gamma role (and probably sql_lab as well) in addition to Financial Analyst.
To provide Gamma users access only to specific datasets:
First, make sure the users with limited access have only the Gamma role assigned to them.
Second, create a new role (Menu > Security > List Roles) and click the + button on the far right.
This new window allows you to give this new role a name, attribute it to users, and select applicable tables in the Permissions dropdown.
To select the data sources you want to associate with this role, simply click in the dropdown and use the typeahead to search for your table names.
You can then confirm with your Gamma users that they see the objects (dashboards and slices) associated with the tables related to their roles.
Removing the add-on
You can remove Duperset via the CLI:
This will destroy all associated data and cannot be undone!
$ heroku addons:destroy duperset
-----> Removing duperset from sharp-mountain-4005... done, v20 (free)
All Duperset support and runtime issues should be submitted via one of the Heroku Support channels.