Typeahead.js, Elasticsearch and Rails!

Recently, for a project, I was asked to take a list of keywords and build an auto-suggest feature. Typically, this can be achieved with some simple but crude SQL.

Since the project already required Elasticsearch, all the records (minus the keywords) were already neatly indexed for me. Additionally, the project was developed in Rails, so I already had the benefit of the elasticsearch-rails project. With that in mind I got started…

Creating an Analyzer

The first step was to create an analyzer in Elasticsearch for our typeahead column. Creating new analyzers is pretty straightforward with the elasticsearch-rails gem. The “keyword” data we were getting was not checked for errors, so in our case we wanted the analyzer to lowercase everything to cut down on duplicate suggestions later. For example, if there were searches for “Cats” and “cats” in our keywords, we didn’t want to return two suggestions.
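As a rough sketch, an analyzer like this might be defined with settings along the following lines. The analyzer name `typeahead` comes from this post, but the choice of the `keyword` tokenizer is an assumption, and the hash is written as plain Ruby so it runs on its own; with elasticsearch-rails a hash like this would typically be handed to the model's `settings` call. The small helper only illustrates the effect we were after.

```ruby
# Hypothetical analysis settings for a "typeahead" analyzer.
# Assumption: each keyword is kept as a single token (keyword tokenizer)
# and only lowercased. The exact settings used in the project may differ.
TYPEAHEAD_SETTINGS = {
  analysis: {
    analyzer: {
      typeahead: {
        type: "custom",
        tokenizer: "keyword",
        filter: ["lowercase"]
      }
    }
  }
}.freeze

# Plain-Ruby illustration of the goal: "Cats" and "cats" should
# collapse into one suggestion instead of two.
def normalize_keywords(keywords)
  keywords.map { |keyword| keyword.strip.downcase }.uniq
end

p normalize_keywords(["Cats", "cats", "Dogs"]) # prints ["cats", "dogs"]
```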

Creating a mapping

Now that the analyzer is set up, a mapping can be created. Within the mapping, a field is created that uses the keyword info, but this is where it gets kind of goofy. Elasticsearch wants this data to be structured as a hash with an “input” key, which in turn contains an array of the different keywords. See the example below for more details. The final thing to note is that the field is assigned the “typeahead” analyzer that was created above for both index analysis and search analysis.
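To make that structure concrete, here is a minimal sketch of the mapping and of the document fragment that gets indexed. It assumes the field is a `completion` field named `keyword_suggest`, and it uses the `index_analyzer` key from the Elasticsearch 1.x era (later versions spell it `analyzer`). The helper name `keyword_suggest_document` is made up for illustration.

```ruby
# Hypothetical mapping for the suggestion field, as a plain hash.
# Assumption: a completion field; index_analyzer is the ES 1.x key name.
KEYWORD_SUGGEST_MAPPING = {
  properties: {
    keyword_suggest: {
      type: "completion",
      index_analyzer: "typeahead",
      search_analyzer: "typeahead"
    }
  }
}.freeze

# Builds the fragment Elasticsearch expects for that field:
# a hash whose "input" key holds an array of keywords.
def keyword_suggest_document(keywords)
  { keyword_suggest: { input: Array(keywords).compact } }
end

p keyword_suggest_document(["cats", "cat videos"])
```

In a Rails model, a fragment like this would usually be merged into whatever the model returns from its indexing serializer, so every record carries its own list of keywords.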

Querying suggestions

Now that both the analyzer and mapping are set up, the data can be re-indexed and a query can be run. The following will run a “suggestion” query within the elasticsearch-rails gem. Note: the “text” key is the search term, while the “field” being searched is the keyword_suggest field that was defined in the mapping above.
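A sketch of the request body for such a query might look like this. The outer key `suggestions` is an arbitrary name chosen for illustration, and the helper `suggest_body` is hypothetical; only the “text” and “field” keys come from the description above. The body would then be sent through the Elasticsearch client that elasticsearch-rails exposes on the model.

```ruby
# Builds a suggest request body for a completion field.
# "suggestions" is a hypothetical name for the suggestion block;
# the response will echo the same name back.
def suggest_body(term, field: "keyword_suggest")
  {
    suggestions: {
      text: term.to_s,                 # what the user has typed so far
      completion: { field: field }     # the field defined in the mapping
    }
  }
end

p suggest_body("ca")
```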

If all goes according to plan, our model should be returning keyword results in the format below…
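For reference, a suggest response of that era generally had the shape sketched here. The values are made up for illustration, and `suggestion_options` is a hypothetical helper that pulls out just the option hashes, each of which carries the “text” key used later on.

```ruby
# Illustrative response shape with made-up values.
SAMPLE_RESPONSE = {
  "suggestions" => [
    {
      "text" => "ca", "offset" => 0, "length" => 2,
      "options" => [
        { "text" => "cats", "score" => 1.0 },
        { "text" => "cat videos", "score" => 1.0 }
      ]
    }
  ]
}.freeze

# Collects the option hashes from every entry of the named suggestion.
# Returns an empty array when the suggestion is missing.
def suggestion_options(response, name = "suggestions")
  Array(response[name]).flat_map { |entry| entry.fetch("options", []) }
end

p suggestion_options(SAMPLE_RESPONSE).map { |option| option["text"] }
# prints ["cats", "cat videos"]
```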

Controller time

The controller to return JSON from the model is dead simple.
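Outside of Rails, the work the action does can be sketched in a few lines. This is a stand-in, not the actual controller: `suggestions_json` is a hypothetical helper, and in a real controller the same array would simply be passed to `render json:`.

```ruby
require "json"

# Stand-in for the controller action body: takes the suggestion options
# and returns the JSON string the front end would receive.
def suggestions_json(options)
  JSON.generate(options.map { |option| { text: option["text"] } })
end

puts suggestions_json([{ "text" => "cats", "score" => 1.0 }])
# prints [{"text":"cats"}]
```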

Displaying the results

Now that the controller is returning suggestions as JSON, it’s time to wire that up to a user interface. The project was using the popular Bootstrap framework. With version 2.3.2 there was a JavaScript plugin for typeahead functionality. However, this plugin has been removed as of 3.x in favor of using the typeahead.js library from Twitter. It turns out that, with a little effort, typeahead.js isn’t that much more complicated to set up than the old Bootstrap plugin.

First, an input tag is defined. Make sure that the autocomplete attribute is set to “off” so the browser’s native autocomplete doesn’t kick in with its own suggestions.

Typeahead.js uses an engine called Bloodhound to make AJAX calls as you type. It has some intelligent caching features and makes consuming that data pretty hands off. Once a new Bloodhound object is created, the initialize() method must be called to finish the process.

Now that the Bloodhound engine is set up, the input tag can be selected and typeahead will wire everything together. One thing to note: the “displayKey” setting tells typeahead which key to use from the hashes within our array. In this case “text” is the key that should be used.

Going out with style

The last thing that I did was add a bit of custom styling to the typeahead box.

Overall I was pretty impressed by the solution. It took less than a couple of hours to set up and was pretty fast and responsive. When considering a typeahead option for your next project, give Elasticsearch a spin!

COPY millions of rows to PostgreSQL with Rails

ActiveRecord is great when you need to quickly access and manipulate a few rows within a database. Loading records into a database is just as easy… Seed files and custom rake tasks make inserting records a breeze. But what happens when you need to import lots of rows? To clarify, when I say lots of rows, I’m talking about millions of rows from a delimited text file.

So many rows, so little time

My first instinct was to read in the text file and do a Model.create() on each row. This took a long, long time. (I actually gave up on it.)

Next, I tried batching the rows into an array in an effort to limit the number of database calls. I then imported each batch using the activerecord-import gem. Batching helps if your recordset is a couple thousand rows, but doesn’t scale efficiently to a million+ rows.
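The batching idea can be sketched in plain Ruby as follows. The tab delimiter, the batch size, and the helper name `each_batch` are all assumptions for illustration; the block is where a bulk insert (for example activerecord-import’s `Model.import columns, rows`) would go.

```ruby
# Splits delimited lines into batches of parsed rows and yields each
# batch to the block, so the caller makes one database call per batch.
# Assumptions: tab-delimited input and a batch size of 1,000.
def each_batch(lines, batch_size: 1_000, delimiter: "\t")
  lines.each_slice(batch_size) do |batch|
    yield batch.map { |line| line.chomp.split(delimiter) }
  end
end

# Usage sketch with in-memory lines; File.foreach(path) would work too.
sample = ["1\tcats\n", "2\tdogs\n", "3\tbirds\n"]
each_batch(sample, batch_size: 2) do |rows|
  # A bulk insert of `rows` would happen here.
  p rows
end
```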

COPY to the rescue

In order to get into the millions, I needed to take Rails out of the equation as much as possible. This meant using an ActiveRecord raw_connection and the COPY command from PostgreSQL.

Due to permission issues, you most likely will not be able to use the filename option of the COPY command, because it makes the database server read the file itself and needs elevated privileges. This actually turns out to not be a big deal, since you can still use STDIN to pass data to the command. Confused? Let’s look at the example.

It’s actually pretty simple. I execute the COPY command on our large_table with data from STDIN. Then I read in the file and put the data into STDIN for each line. When the file is finished, I issue an end copy instruction.
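The steps just described can be sketched like this. It assumes the connection behaves like a `PG::Connection` from the pg gem (which is what `ActiveRecord::Base.connection.raw_connection` returns with the PostgreSQL adapter), that the input is already in COPY’s default tab-delimited text format, and that the column names are trusted. The helper name `copy_from_io` and the column list are made up for illustration.

```ruby
# Streams every line of `io` into `table` via COPY ... FROM STDIN.
# `conn` is expected to respond to exec, put_copy_data, put_copy_end
# and get_result, as a PG::Connection does.
# Table and column names are interpolated, so pass trusted values only.
def copy_from_io(conn, io, table:, columns:)
  conn.exec("COPY #{table} (#{columns.join(', ')}) FROM STDIN")

  count = 0
  io.each_line do |line|
    conn.put_copy_data(line) # one row per line, newline included
    count += 1
  end
  conn.put_copy_end          # the "end copy" instruction

  # Drain pending results so errors surface and the connection is
  # ready for the next statement.
  while (result = conn.get_result)
    result.check if result.respond_to?(:check)
  end
  count
end

# Usage sketch (needs a real database, so it is left as a comment):
#   conn = ActiveRecord::Base.connection.raw_connection
#   File.open("data.tsv") do |file|
#     copy_from_io(conn, file, table: "large_table", columns: %w[id name])
#   end
```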

Conclusion

With the COPY technique above, I was able to import over 2.4 million rows in less than 4 minutes. Not too shabby. I would be interested in hearing what strategies you all have used for large imports.

Cutting circles in your HTML with CSS3

Masking images on the web has typically been a job reserved for Adobe Flash, or done ahead of time in a photo manipulation tool and exported as a static image. While Flash was prevalent on the web, it was really easy to mask an image with a vector shape. Now that Flash content has been largely replaced with a combination of HTML, CSS3 and JavaScript, it is time to dig back into the toolbox and find new solutions for old problems.

Masking Techniques

Making a mask for an image requires some sort of shape to use as the “mask”. This determines which portions of your image will “show through” and which will be excluded. Typically a vector shape is used to make a mask. On the web, SVG (Scalable Vector Graphics) is used to create these vector graphics. Covering all the different ways to mask an image is outside the scope of this article; however, one technique that is illustrated in the next section does not require SVG at all. For more detailed information on the different ways to mask content with SVG, check out this article.

Masking with Radial Gradients

After a slight detour, it is time to get back to the task at hand: cutting a transparent circle in a DOM element. As we have seen above, we can cut out all sorts of shapes using SVG. By layering a radial gradient over the top of a particular element, we can achieve a similar effect using a circle. Check out the fiddle below for more details.

http://jsfiddle.net/HAKuS/1/

To create the illusion of a cutout circle we use the color stops of a radial gradient. In the background-image style property the first color stop is a 100px transparent color. This creates our “hole” in the white overlay. The next color stop creates a 5px inner shadow of semi-transparent black. The final color stop creates the background color for the rest of the DOM element. And voilà, there is a masked hole in your DOM element with no SVG!
