Flying Sphinx

This add-on is operated by Flying Sphinx

Fast and simple Sphinx-driven full-text search

Last Updated: 30 March 2014

Flying Sphinx is essentially a wrapper around Sphinx (and, if you’re using Ruby and ActiveRecord, Thinking Sphinx) that automates the connection to a managed Sphinx server for both searching and indexing. For the most part, the workflow is the same as with a normal Sphinx setup.

If you’re not familiar with Sphinx, then it’s highly recommended you read through the Sphinx documentation and have it set up and working in your local development environment.

Please note: at this point in time, Flying Sphinx does not yet support Sphinx’s realtime indices. If you’d like this feature, please contact Flying Sphinx support.

Regions

Flying Sphinx is available in both US and European regions.

Languages

Flying Sphinx currently has client libraries for Ruby/Rails, Python and Node.js. If you wish to use a different language, you’ll need to talk to the Flying Sphinx API directly.

If you’re using Ruby and Thinking Sphinx and aren’t already familiar with that library, it’s highly recommended that you read through its documentation.

Installation

First up, let Heroku know you want to use the Flying Sphinx add-on:

$ heroku addons:add flying_sphinx:wooden

You can find a list of available plans on Heroku’s website.

Ruby and ActiveRecord/Rails 3

You’ll need the Flying Sphinx gem as part of your Rails application. If you’re using Bundler, it’s as simple as adding it to your Gemfile and bundling as per normal:

gem 'thinking-sphinx', '3.1.0'
gem 'flying-sphinx',   '1.2.0'

If you’re using MRI 1.8, you’ll also need the openssl-nonblock gem:

gem 'openssl-nonblock', '0.2.1'

Ruby and ActiveRecord/Rails 2

A few things to note here:

  • Only Rails 2.3.6 or later is supported.
  • Instead of the command line tool mentioned throughout this documentation, use the legacy rake tasks (fs:index, fs:start, fs:stop, fs:restart and fs:rebuild), just like you would with Thinking Sphinx.
  • The openssl-nonblock gem is required if you’re using MRI 1.8.

And if you’re not using Bundler, add the gems to both your config/environment.rb file:

config.gem 'thinking-sphinx',
  :version => '1.5.0'
config.gem 'flying-sphinx',
  :version => '1.2.0'
config.gem 'openssl-nonblock',
  :version => '0.2.1'

… and your .gems file:

thinking-sphinx --version 1.5.0
flying-sphinx --version 1.2.0
openssl-nonblock --version 0.2.1

Lastly, if you’re using Rails 2 you need to add this line to the end of your Rakefile:

require 'flying_sphinx/tasks'

In my examples here, you’ll note that I’m always referencing Flying Sphinx after Thinking Sphinx. I recommend you do the same, otherwise you may end up with some dependency confusion.

You must use Thinking Sphinx 2.1.0 or later (for Rails 3), or 1.5.0 or later (for Rails 2.3). Older versions of Thinking Sphinx will not work with Flying Sphinx.

Python

The flyingsphinx package is available via pip:

$ pip install flyingsphinx

Node/Javascript

The flying-sphinx package is available via npm:

$ npm install flying-sphinx

Advanced Heroku Database Configurations with Ruby

Thinking Sphinx (and thus Flying Sphinx) uses the database connection attributes that are in place when the index is defined. Unless you’re doing something particularly creative, this happens when the model is loaded, and those attributes are the default database credentials (inserted into config/database.yml from DATABASE_URL).

If you want to do something not quite so standard, please contact Flying Sphinx support to talk through what approaches could work better for your setup.

Amazon RDS

If you’d like to use Flying Sphinx with Amazon RDS (via Heroku’s add-on for that service), then there’s one more step you’ll need to take care of: giving Flying Sphinx permission to access your MySQL database.

As part of the Amazon RDS setup, you will have given Heroku permission to access your database, and so just repeat that step once more - this time, using the owner id 092495821309 and group name Scalarium-Default-Server.

Configuration

Unless you’re using Ruby (see below), you’ll need to write a Sphinx configuration by hand, and then upload it to Flying Sphinx. That last step can be done easily enough using the command line tool:

$ heroku run flying-sphinx configure /path/to/sphinx.conf

If you want to run this command on your own machine instead, please note that you’ll need the FLYING_SPHINX_IDENTIFIER and FLYING_SPHINX_API_KEY environment variables set with the values your Heroku application has set.
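
Before running the command locally, it can help to confirm that both variables are actually present. The following is a minimal Ruby sketch; the helper name missing_flying_sphinx_vars is hypothetical and not part of the flying-sphinx gem.

```ruby
# Hypothetical helper (not part of the flying-sphinx gem): lists which of the
# two credentials named above are missing or blank in the given environment.
REQUIRED_VARS = %w[FLYING_SPHINX_IDENTIFIER FLYING_SPHINX_API_KEY].freeze

def missing_flying_sphinx_vars(env = ENV)
  REQUIRED_VARS.select { |name| env[name].to_s.strip.empty? }
end

missing = missing_flying_sphinx_vars
if missing.empty?
  puts 'Flying Sphinx credentials are set.'
else
  puts "Missing: #{missing.join(', ')}"
end
```

The check only reports what is missing; it doesn’t verify that the values match those set on your Heroku application.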

Additional Configuration Files

If you’re using additional configuration files for wordforms, stopwords, or exceptions, please refer to the extended documentation for either Python or Node.js.

Configuration with Rails

The flying-sphinx gem can generate the configuration via Thinking Sphinx, so there’s no need to provide a file path:

$ heroku run bundle exec flying-sphinx configure

Additional configuration files are handled by the gem automatically.

Configuration with other Ruby frameworks

While the flying-sphinx executable will load a Rails app when required, it can’t predict the correct loading approach for non-Rails apps. You’ll need to use the rake task instead, which both configures and indexes your data:

$ heroku run rake fs:index

Like the executable, any additional configuration files are handled by the gem automatically.

Sphinx Versions

You can choose which version of Sphinx you’d like to use through the version setting in your config/sphinx.yml file. If nothing is specified, Sphinx 2.0.4 is used by default and you’ll see this warning (everything will still work fine):

Sphinx cannot be found on your system. You may need to configure the
following settings in your config/sphinx.yml file:
  * bin_path
  * searchd_binary_name
  * indexer_binary_name

For more information, read the documentation:
http://freelancing-god.github.com/ts/en/advanced_config.html

This happens because Heroku doesn’t know which version of Sphinx you’re using. Telling Thinking Sphinx (and Flying Sphinx) the version hides the message. To do so, add (or edit) your config/sphinx.yml file to include the version setting for your production environment:

production:
  version: '2.0.6'
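
To make the per-environment lookup concrete, here is a small Ruby sketch that reads a version setting from YAML shaped like the file above and falls back to 2.0.4 when none is given. It is illustrative only: the sphinx_version helper is hypothetical and is not how Thinking Sphinx itself loads the file.

```ruby
require 'yaml'

# Same shape as the config/sphinx.yml example above.
SPHINX_YML = <<~YAML
  production:
    version: '2.0.6'
YAML

# Hypothetical helper: returns the version for an environment, or the
# default mentioned in this section when nothing is specified.
def sphinx_version(yaml, environment, default = '2.0.4')
  settings = YAML.safe_load(yaml) || {}
  (settings[environment] || {}).fetch('version', default)
end

puts sphinx_version(SPHINX_YML, 'production')
puts sphinx_version(SPHINX_YML, 'staging')
```

An environment with no entry in the file falls through to the default, which is the situation where the warning above appears.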

Processing Sphinx Indices

To tell Flying Sphinx to process your Sphinx indices, it’s just a single call to the command line:

$ heroku run bundle exec flying-sphinx index

If you just want to process specific indices, add them as additional arguments:

$ heroku run bundle exec flying-sphinx index articles users

Omit the bundle exec if you’re not using Ruby.

Indexing is something you’ll want to do at a regular interval. Heroku’s Scheduler Add-on will do the trick nicely - just use flying-sphinx index as the task.

Controlling the Daemon

Starting and stopping the Sphinx daemon is done through the command line with two simple commands:

$ heroku run bundle exec flying-sphinx start
$ heroku run bundle exec flying-sphinx stop

Omit the bundle exec if you’re not using Ruby.

You can also use the restart command to stop and then start the daemon, and the rebuild command to stop Sphinx, process the indices, and start Sphinx up again:

$ heroku run bundle exec flying-sphinx restart
$ heroku run bundle exec flying-sphinx rebuild

(Again, omit the bundle exec if you’re not using Ruby.)
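
The command variations in the last two sections follow one pattern: an optional bundle exec prefix, the action, and (for index) optional index names. A Ruby sketch of that pattern follows. The flying_sphinx_command helper is hypothetical and only builds the command string; it doesn’t run anything.

```ruby
# Hypothetical helper: assembles the heroku command line described above.
# Pass ruby: false to drop the "bundle exec" prefix for non-Ruby apps.
def flying_sphinx_command(action, *indices, ruby: true)
  parts = %w[heroku run]
  parts += %w[bundle exec] if ruby
  parts += ['flying-sphinx', action, *indices]
  parts.join(' ')
end

puts flying_sphinx_command('index', 'articles', 'users')
puts flying_sphinx_command('rebuild')
puts flying_sphinx_command('start', ruby: false)
```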

Searching

Once you’ve got some indexed data and have started the daemon, you can then send search queries to Sphinx. The Sphinx server and port are available through the environment variables FLYING_SPHINX_HOST and FLYING_SPHINX_PORT. Use them through whichever Sphinx client you prefer in your language.
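
As a rough sketch of the first step for any client, the following Ruby snippet reads the two variables and returns a host and integer port. The sphinx_address helper and its error handling are assumptions for illustration; how you pass the result on depends on the Sphinx client you choose.

```ruby
# Hypothetical helper: reads the Sphinx daemon address from the environment
# variables named above and converts the port to an integer.
def sphinx_address(env = ENV)
  host = env['FLYING_SPHINX_HOST']
  port = env['FLYING_SPHINX_PORT']
  raise ArgumentError, 'FLYING_SPHINX_HOST is not set' if host.to_s.empty?
  raise ArgumentError, 'FLYING_SPHINX_PORT is not set' if port.to_s.empty?

  [host, Integer(port)]
end

begin
  host, port = sphinx_address
  puts "Sphinx daemon: #{host}:#{port}"
rescue ArgumentError => e
  puts "Not configured: #{e.message}"
end
```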

If you’re using Ruby and Thinking Sphinx, then the flying-sphinx gem manages all of that for you in the background, so just run your search calls as you normally would.

Delta Indexing (Ruby Only)

The only delta indexing approach that’s currently viable in combination with Heroku is a variation on Thinking Sphinx’s delayed deltas. You’ll need to use Delayed Job or Resque, and that means having at least one Heroku worker running. Yes, that means spending a bit more money per month, but them’s the breaks, I’m afraid.

Make sure you’re using ts-delayed-delta 2.0.0 or newer. If you’re using Thinking Sphinx v1 or v2, the delta setup in your define_index block should look something like this:

define_index do
  # Fields, Attributes, etc...

  set_property :delta => :delayed
end

If you’re using Thinking Sphinx v3, then deltas are set at the top of the index definition:

ThinkingSphinx::Index.define :article, :with => :active_record, :delta => ThinkingSphinx::Deltas::DelayedDelta do
  # Fields, Attributes, etc
end

If you’re using Resque instead of Delayed Job, then you’ll need the ts-resque-delta gem, and you should set :delta to FlyingSphinx::ResqueDelta (this is currently only supported with Thinking Sphinx v1/v2).

Don’t forget to add a column called delta to your model as well, just like with the standard delta approaches.

Once you’ve got this deployed, then rebuild your Sphinx setup:

$ heroku run rake fs:rebuild

And from that point, the rest is taken care of, provided you have a worker managing your Delayed Job or Resque queue.

Limitations (Ruby Only)

The only known limitation across the standard Thinking Sphinx features is that you can’t currently use facets built upon string or text columns. This is because Sphinx doesn’t understand string attributes (and each facet is an attribute), so Thinking Sphinx actually stores the corresponding CRC32 value for each string.

However, PostgreSQL doesn’t have a native CRC32 function, so Thinking Sphinx usually adds one. This limitation also applies if you’re using any SQL snippets that reference the CRC32 function. We’re currently investigating some ways of working around this, and the documentation will be updated accordingly.

If you’re using Amazon RDS, this won’t be a problem at all, since MySQL has a native CRC32 function.
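
To illustrate why string facets depend on CRC32: each string is reduced to a 32-bit integer, which can be stored as a numeric attribute. The sketch below uses Ruby’s standard Zlib library to show the idea only. The facet_value helper is hypothetical, and this is not the SQL that Thinking Sphinx generates.

```ruby
require 'zlib'

# Hypothetical helper: maps a facet string to an unsigned 32-bit integer,
# the kind of value that can stand in for the string as an attribute.
def facet_value(string)
  Zlib.crc32(string)
end

%w[draft published archived].each do |status|
  puts format('%-10s => %d', status, facet_value(status))
end
```

The same string always maps to the same integer, which is what lets a facet be matched by its CRC32 value instead of the original text.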

Upgrading and Downgrading

You can upgrade and downgrade plans just by informing Heroku you want to use a different plan:

$ heroku addons:upgrade flying_sphinx:granite
$ heroku addons:downgrade flying_sphinx:ceramic

Flying Sphinx will migrate your data between plans and update your app accordingly - depending on how much data you have, this could take several minutes. The owner of the app will be emailed once the plan change is complete.