This add-on is operated by Citus Data
Horizontally scalable Postgres
Last updated 30 September 2016
The Citus extension horizontally scales PostgreSQL across multiple machines using sharding and replication. Its query engine parallelizes incoming SQL queries across these servers so that queries on large datasets return in human real-time (less than a second).
It is an open-source extension on top of PostgreSQL rather than an entire fork, which gives developers and enterprises the power and familiarity of a traditional relational database. As an extension, Citus supports new PostgreSQL releases, allowing users to benefit from new features while maintaining compatibility with existing PostgreSQL tools.
When to Use Citus
Citus provides users real-time responsiveness over large datasets, most commonly seen in rapidly growing event systems or with time series data. Example use cases include:
- Analytic dashboards with sub-second response times
- Exploratory queries on unfolding events
- Large dataset archival and reporting
- Analyzing sessions with funnel, segmentation, and cohort queries
For concrete examples, check out our customer use cases. Typical Citus workloads involve ingesting large volumes of data and running analytic queries on that data in real time.
Provisioning the add-on
A list of all available plans can be found here.
Citus can be attached to a Heroku application via the CLI:
$ heroku addons:create citus
-----> Adding citus to sharp-mountain-4005... done, v18 (free)
Once Citus has been added, a CITUS_URL setting will be available in the app configuration and will contain the canonical URL used to access the newly provisioned Citus formation. This can be confirmed using the heroku config:get command.
$ heroku config:get CITUS_URL
postgres://user:firstname.lastname@example.org/citus
After installing Citus, the application should be configured to fully integrate with the add-on.
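As one possible way to wire the add-on into an application, the sketch below reads CITUS_URL from the environment and splits it into connection parameters. It uses only the Python standard library. The function name citus_connection_params and the example URL are illustrative assumptions, not part of the add-on, and the URL is assumed to follow the usual postgres://user:password@host:port/dbname shape.

```python
import os
from urllib.parse import urlparse, unquote


def citus_connection_params(url=None):
    """Split a postgres:// URL into a dict of connection parameters.

    Falls back to the CITUS_URL environment variable when no URL is given.
    """
    url = url or os.environ["CITUS_URL"]
    parts = urlparse(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError("unexpected URL scheme: %r" % parts.scheme)
    return {
        "host": parts.hostname,
        # Assumption: fall back to the standard Postgres port if none is given.
        "port": parts.port or 5432,
        # Credentials may be percent-encoded inside a URL, so decode them.
        "user": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "dbname": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    example = "postgres://user:secret@db.example.com:5432/citus"
    print(citus_connection_params(example))
```

The resulting dictionary uses key names that Postgres drivers such as psycopg2 commonly accept as keyword arguments, assuming such a driver is installed in the application.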
Creating distributed tables
Because Citus is an extension on top of Postgres, creating, modifying, updating, and deleting data works exactly as it does in vanilla Postgres. To take advantage of the distributed features of Citus, you must mark specific tables as distributed tables.
To create a distributed table on your Citus cluster, first run:
SELECT master_create_distributed_table('table_name', 'shard_key', 'hash');
Then create the shards (here, 16 shards with a replication factor of 1):
SELECT master_create_worker_shards('table_name', 16, 1);
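To make the two steps concrete, here is a minimal Python sketch that builds both statements for a given table. The table name events, the column user_id, and the helper name distribution_statements are hypothetical. The sketch only assembles SQL strings and parameter tuples, using the %s placeholder style of drivers such as psycopg2; it does not connect to a database.

```python
def distribution_statements(table, column, shard_count=16, replication_factor=1):
    """Return (sql, params) pairs that hash-distribute a table.

    The first statement marks the table as distributed on `column`;
    the second creates `shard_count` shards, each replicated
    `replication_factor` times.
    """
    if shard_count < 1 or replication_factor < 1:
        raise ValueError("shard_count and replication_factor must be >= 1")
    return [
        (
            "SELECT master_create_distributed_table(%s, %s, 'hash');",
            (table, column),
        ),
        (
            "SELECT master_create_worker_shards(%s, %s, %s);",
            (table, shard_count, replication_factor),
        ),
    ]


if __name__ == "__main__":
    # Hypothetical table and shard key, for illustration only.
    for sql, params in distribution_statements("events", "user_id"):
        print(sql, params)
```

With a DB-API driver, each pair could be passed to cursor.execute(sql, params) in order, after the table itself has been created with an ordinary CREATE TABLE statement.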
For more detailed usage and setup instructions, see the Citus Docs.
Local development setup
In many cases, a normal Postgres database can be used locally, and instructions for setting up a local Postgres server can be found here.
In cases where higher fidelity is necessary, you can either install a local, single-machine cluster or run against a remote Citus formation by provisioning the add-on and locally replicating the config vars so your development environment can operate against the service.
Remote service setup
The Citus add-on can be used remotely in lieu of setting up a local single-machine cluster. First provision the add-on using the above guide. You can then use the CITUS_URL for local development. Use the Heroku Local command-line tool to configure, run and manage process types specified in your app’s Procfile. Heroku Local reads configuration variables from a .env file. To view all of your app’s config vars, type heroku config. Use the following command for each value that you want to add to your .env file:
$ heroku config:get CITUS_URL -s >> .env
Credentials and other sensitive configuration values should not be committed to source control. In Git, exclude the .env file with:
$ echo .env >> .gitignore
For more information, see the Heroku Local article.
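Heroku Local loads the .env file itself, but scripts run outside of it may need the same values. The following is a simplified sketch of reading KEY=value lines, such as the one written by the config:get command above, into a dictionary. The function name parse_env and the sample contents are hypothetical, and this is not the parser Heroku Local uses; it only handles plain assignments with optional surrounding quotes.

```python
def parse_env(text):
    """Parse KEY=value lines into a dict, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    # Hypothetical .env contents, for illustration only.
    sample = "CITUS_URL='postgres://user:secret@db.example.com:5432/citus'\n"
    print(parse_env(sample))
```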
Single-machine local cluster setup
Citus can be installed for use in a local development environment as a single-machine cluster. Typically this entails installing the database and pointing the CITUS_URL to this local cluster.
Instructions for setting up a local, single-machine Citus cluster can be found here with instructions for Mac OS, Fedora/CentOS/Red Hat, Ubuntu/Debian, and Docker.
Removing the add-on
Citus can be removed via the CLI.
This will destroy all associated data and cannot be undone!
$ heroku addons:destroy citus
-----> Removing citus from sharp-mountain-4005... done, v20 (free)
Before removing Citus, you can export your data with pg_dump.
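A rough sketch of scripting that export follows. It assumes pg_dump is installed locally and that CITUS_URL is set in the environment (for example, copied from heroku config:get). The helper name pg_dump_command and the output file name citus_backup.sql are hypothetical. The sketch only covers invoking pg_dump; what the dump contains for distributed tables is not covered here.

```python
import os
import subprocess


def pg_dump_command(url, outfile="citus_backup.sql"):
    """Build the pg_dump argument list for a connection URL.

    pg_dump accepts a connection URI through --dbname and writes
    its output to the path given with --file.
    """
    return ["pg_dump", "--dbname=" + url, "--file=" + outfile]


if __name__ == "__main__":
    url = os.environ.get("CITUS_URL")
    if url:
        # check=True raises CalledProcessError if pg_dump exits non-zero.
        subprocess.run(pg_dump_command(url), check=True)
    else:
        print("CITUS_URL is not set; nothing to export")
```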