Semantic search with Ruby on Rails

Learn how to implement semantic search in Ruby on Rails using the Neighbor gem, Anthropic's Claude API for summarization, and OpenAI for text embeddings. Enhance your app's search capabilities with meaning-based results.

By Robert Ross on 7/30/2024

Introduction

Semantic search is a powerful technique that allows you to find records in your database based on the meaning of text, rather than exact keyword matches. This can greatly enhance the search capabilities of your application, providing more relevant results to your users. In this post, we'll walk through how to implement semantic search in a Ruby on Rails application using the Neighbor gem, Anthropic's Claude API for summarization, and OpenAI for text embeddings.

Our Stack

We'll be using the following technologies:

  • A brand new Ruby on Rails 7.1 application (the Neighbor gem is also compatible with earlier versions of Rails)

  • PostgreSQL with the pgvector extension

  • Neighbor gem for easy vector operations with ActiveRecord

  • Anthropic's Claude API for summarization

  • OpenAI's API for text embeddings

How does this work?

To make records searchable by meaning, we first create a standard prompt and feed it to an LLM to generate a summary of the content. Second, we pass that summary to a text embedding model, which returns an array of floats (think of it as converting text to numbers). Lastly, we store that embedding in our database.

At a high level:

  • Create a summary of an incident via Anthropic’s API.

  • Send that summary to OpenAI’s text embedding API to retrieve an embedding.

  • Store the embedding on our ActiveRecord model (we’re going to use a simple Incident model)
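Once embeddings are stored, "similar" simply means "close together" under some distance measure. As a rough illustration only, here is a minimal pure-Ruby sketch of cosine distance, the measure used later in this post. The `cosine_distance` helper and the toy three-dimensional vectors are made up for this example; in the real application pgvector computes the distance inside PostgreSQL.

```ruby
# Cosine distance between two equal-length vectors: 1 - cosine similarity.
# 0.0 means the vectors point the same way, 1.0 means they are orthogonal.
def cosine_distance(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  raise ArgumentError, "vectors must be non-zero" if norm_a.zero? || norm_b.zero?

  1.0 - dot / (norm_a * norm_b)
end

# Toy 3-dimensional "embeddings" (hypothetical values; real ones are much longer).
slow_queries = [0.9, 0.1, 0.0]
high_db_cpu  = [0.8, 0.2, 0.1]
router_down  = [0.0, 0.2, 0.9]

puts cosine_distance(slow_queries, high_db_cpu) # small distance: similar meaning
puts cosine_distance(slow_queries, router_down) # large distance: unrelated
```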

Let’s get started.

Step 1: Setting Up

First, add the Neighbor and Faraday gems to your Gemfile (we call both the Anthropic and OpenAI APIs directly through Faraday, so no OpenAI gem is needed):

bundle add neighbor faraday

Next, we need to choose an extension. Neighbor supports two: cube and vector. We'll use vector as it supports more dimensions and approximate nearest neighbor search.

Install pgvector in your PostgreSQL database, then run:

rails generate neighbor:vector
rails db:migrate

This generates and runs a migration that enables the vector extension in your database.

Step 2: Creating the Model

Let's create an Incident model to store our IT incidents:

rails generate model Incident name:string description:text resolution:text summary:text embedding:vector{1536}

This creates a migration that includes a text column for our summary and a vector column for our embeddings. The 1536 specifies the number of dimensions, which matches the default output size of OpenAI's text-embedding-3-small model.

Run the migration:

rails db:migrate

Now, update the Incident model to use Neighbor:

class Incident < ApplicationRecord
  has_neighbors :embedding
end

Step 3: Implementing the Anthropic Summarizer

I recommend creating a PORO (Plain Old Ruby Object) that is responsible for generating a consistent system and user prompt. By prompting an LLM with the same system prompt and identically formatted incident information every time, we’ll get summaries back in a more consistent format.

# app/services/incident_summarizer.rb
class IncidentSummarizer
  def initialize(incident)
    @incident = incident
  end

  def system_prompt
    "You are an expert at summarizing IT incidents. Create a brief summary of the incident described below."
  end

  def user_prompt
    <<~PROMPT
      Incident: #{@incident.name}
      Description: #{@incident.description}
      Resolution: #{@incident.resolution}
    PROMPT
  end
end

Next, we’re going to create a small class that is responsible for sending the system and user prompt from our class above to Anthropic’s API to generate a summary of our incident. The Anthropic Claude API is very straightforward, and we haven’t discovered a need for a Ruby gem for it.

# app/services/anthropic_client.rb
require 'faraday'
require 'json'

class AnthropicClient
  ANTHROPIC_API_URL = "https://api.anthropic.com/v1/messages"

  def initialize(api_key)
    @api_key = api_key
  end

  def summarize(system_prompt, user_prompt)
    response = connection.post do |req|
      req.body = {
        model: "claude-3-haiku-20240307",
        max_tokens: 1000,
        system: system_prompt,
        messages: [{ role: "user", content: user_prompt }]
      }.to_json
    end

    JSON.parse(response.body)["content"][0]["text"]
  end

  private

  # https://docs.anthropic.com/en/api/messages
  def connection
    Faraday.new(url: ANTHROPIC_API_URL) do |f|
      f.headers['Content-Type'] = 'application/json'
      f.headers['x-api-key'] = @api_key
      f.headers['anthropic-version'] = '2023-06-01'
    end
  end
end
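The `summarize` method above assumes every request succeeds and will fail with an unhelpful `NoMethodError` if the API returns an error body. Below is a minimal sketch of stricter response handling, assuming the response shapes described in Anthropic's Messages API documentation. `AnthropicResponseError` and `extract_summary` are hypothetical names introduced for this example, not part of any library.

```ruby
require "json"

# Hypothetical error class for failed or malformed Messages API responses.
class AnthropicResponseError < StandardError; end

# Pulls the summary text out of a Messages API response.
# `status` is the HTTP status code and `body` is the raw JSON string.
def extract_summary(status, body)
  parsed = JSON.parse(body)
  raise AnthropicResponseError, "unexpected response shape" unless parsed.is_a?(Hash)

  unless (200..299).cover?(status)
    # Error responses carry a nested "error" object with a "message".
    message = parsed.dig("error", "message") || "unknown error"
    raise AnthropicResponseError, "Anthropic API returned #{status}: #{message}"
  end

  # Successful responses hold a list of content blocks; join the text blocks.
  text = Array(parsed["content"])
         .select { |block| block["type"] == "text" }
         .map { |block| block["text"] }
         .join
  raise AnthropicResponseError, "response contained no text" if text.empty?

  text
rescue JSON::ParserError => e
  raise AnthropicResponseError, "invalid JSON in response: #{e.message}"
end
```

With something like this in place, the last line of `summarize` could become `extract_summary(response.status, response.body)`.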

Step 4: Generating Embeddings

We'll use OpenAI's API to generate embeddings. Let's create a service to handle this, similar to the Anthropic client above.

# app/services/openai_client.rb
require 'faraday'
require 'json'

class OpenAIClient
  OPENAI_API_URL = "https://api.openai.com/v1/embeddings"

  def initialize(api_key)
    @api_key = api_key
  end

  def generate_embedding(text)
    response = connection.post do |req|
      req.body = {
        model: "text-embedding-3-small",
        input: text
      }.to_json
    end

    JSON.parse(response.body)["data"][0]["embedding"]
  end

  private

  def connection
    Faraday.new(url: OPENAI_API_URL) do |f|
      f.headers['Content-Type'] = 'application/json'
      f.headers['Authorization'] = "Bearer #{@api_key}"
    end
  end
end

ℹ️ Note: Rails' autoloader expects app/services/openai_client.rb to define OpenaiClient, so you’ll need to add an acronym inflection for the OpenAIClient class to load correctly:

# config/initializers/inflections.rb

ActiveSupport::Inflector.inflections(:en) do |inflect|
  inflect.acronym "OpenAI"
end
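The embeddings endpoint also accepts an array for `input` and returns one entry per input, each tagged with an `index`. If you ever embed several summaries in one request, a parsing helper along these lines may be useful. This is only a sketch: `extract_embeddings`, `EmbeddingResponseError`, and the `EMBEDDING_DIMENSIONS` constant are names assumed for this example, and the dimension check simply mirrors the 1536-dimension vector column from Step 2.

```ruby
require "json"

# Hypothetical error class for unusable embeddings responses.
class EmbeddingResponseError < StandardError; end

# Dimensions of the vector column created in Step 2.
EMBEDDING_DIMENSIONS = 1536

# Extracts one embedding per input from an embeddings API response body.
# Entries are sorted by "index" so the results line up with the inputs.
def extract_embeddings(body, dimensions: EMBEDDING_DIMENSIONS)
  parsed = JSON.parse(body)
  raise EmbeddingResponseError, "unexpected response shape" unless parsed.is_a?(Hash)

  if parsed["error"]
    raise EmbeddingResponseError, "OpenAI API error: #{parsed.dig('error', 'message')}"
  end

  data = parsed["data"]
  raise EmbeddingResponseError, "response contained no embeddings" if data.nil? || data.empty?

  embeddings = data.sort_by { |item| item["index"] }.map { |item| item["embedding"] }

  # Fail early rather than letting the database reject a mis-sized vector.
  embeddings.each do |vector|
    next if vector.length == dimensions

    raise EmbeddingResponseError, "expected #{dimensions} dimensions, got #{vector.length}"
  end

  embeddings
end
```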

Step 5: Storing Summaries and Embeddings

We’re going to update our Incident model with a few simple methods that summarize the incident, fetch the embedding for that summary, and store it in our vector column in Postgres. This tutorial reads the API keys from Rails credentials, which you can edit with:

bin/rails credentials:edit

The format in the file is:

anthropic:
  api_key: sk-ant-api03-key

openai:
  api_key: sk-open-api-key

With the keys in place, our Incident model gets three methods that prepare an incident for semantic search:

class Incident < ApplicationRecord
  has_neighbors :embedding

  # This method is responsible for generating a summary of the incident and storing it in the database.
  # It uses the IncidentSummarizer and AnthropicClient classes to generate the summary.
  def generate_summary
    summarizer = IncidentSummarizer.new(self)
    anthropic_client = AnthropicClient.new(Rails.application.credentials.anthropic[:api_key])
    summary = anthropic_client.summarize(summarizer.system_prompt, summarizer.user_prompt)
    update(summary: summary)
  end

  # This method is responsible for generating an embedding of the incident summary and storing it in the database.
  # It uses the OpenAIClient class to generate the embedding. The embedding column is basically an array of floats.
  def generate_and_store_embedding
    openai_client = OpenAIClient.new(Rails.application.credentials.openai[:api_key])
    embedding = openai_client.generate_embedding(summary)
    update(embedding: embedding)
  end

  # This method is responsible for processing the incident by generating a summary and an embedding in sequence.
  def process_incident
    generate_summary
    generate_and_store_embedding
  end
end

Step 6: Finding Similar Incidents

Thanks to Neighbor, finding similar incidents is now very simple:

# Get the nearest neighbors to a record
incident = Incident.first
similar_incidents = incident.nearest_neighbors(:embedding, distance: "cosine").first(5)

# Get the nearest neighbors to a vector
similar_incidents = Incident.nearest_neighbors(:embedding, incident.embedding, distance: "cosine").first(5)
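The second form is also what makes free-text search possible: embed the user's query, then look for the stored embeddings closest to it. A minimal sketch follows, assuming the query is embedded as-is (without summarizing it first). `SemanticSearch` is a hypothetical class, not part of Neighbor; its dependencies are passed in so it can be exercised without a database or network access.

```ruby
# Hypothetical service object: embeds a free-text query, then asks the
# given scope for the records whose embeddings are nearest to it.
class SemanticSearch
  # embedder: responds to generate_embedding(text), e.g. an OpenAIClient
  # scope:    responds to nearest_neighbors(column, vector, distance:), e.g. Incident
  def initialize(embedder:, scope:)
    @embedder = embedder
    @scope = scope
  end

  # Returns up to `limit` records ordered from most to least similar.
  def search(query, limit: 5)
    query = query.to_s.strip
    return [] if query.empty? # nothing to embed, so skip the API call

    vector = @embedder.generate_embedding(query)
    @scope.nearest_neighbors(:embedding, vector, distance: "cosine").first(limit)
  end
end
```

In the application from this post, that might be called as `SemanticSearch.new(embedder: OpenAIClient.new(api_key), scope: Incident).search("database is slow")`, where `api_key` comes from Rails credentials.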

Testing with Sample Data

Let's create some sample data to test our semantic search. Add the following to your db/seeds.rb file:

# db/seeds.rb

# Clear existing incidents
Incident.destroy_all

# Create sample incidents
incidents = [
  {
    name: "Database Performance Issue 1",
    description: "Users reported slow response times when querying the database. Investigation showed high CPU usage on the database server.",
    resolution: "Optimized slow-running queries and added appropriate indexes to improve performance."
  },
  {
    name: "Network Outage",
    description: "All services became unreachable due to a network failure. Investigation showed a core router had failed.",
    resolution: "Replaced the faulty router and restored network connectivity. Implemented redundant routing to prevent future single points of failure."
  },
  # Add more incidents here...
]

# Create incidents, generate summaries and embeddings
incidents.each do |incident_data|
  incident = Incident.create!(incident_data)
  incident.process_incident
  puts "Processed incident: #{incident.name}"
end

puts "Seed data created successfully!"

Run the seeds with:

rails db:seed

Using the Semantic Search

Now you can use your semantic search like this:

# Find an incident to use as a reference
reference_incident = Incident.find_by(name: "Database Performance Issue 1")

# Find similar incidents
similar_incidents = reference_incident.nearest_neighbors(:embedding, distance: "cosine").first(5)

puts "Incidents similar to '#{reference_incident.name}':"
similar_incidents.each do |similar_incident|
  puts "- #{similar_incident.name} (Distance: #{similar_incident.neighbor_distance.round(2)})"
  puts "  Summary: #{similar_incident.summary}"
end

Indexing for Better Performance

For better performance with large datasets, you can add an HNSW index, which enables approximate nearest neighbor search. Create a migration with:

bin/rails g migration AddIndexToIncidentsEmbedding

Then add the index to the generated file, using the vector_cosine_ops operator class to match the cosine distance used in our queries:

class AddIndexToIncidentsEmbedding < ActiveRecord::Migration[7.1]
  def change
    add_index :incidents, :embedding, using: :hnsw, opclass: :vector_cosine_ops
  end
end

Run the migration:

rails db:migrate

Conclusion

Adding semantic search to applications has become remarkably easy with the APIs and libraries that have become available to developers in the last two years. Used correctly, semantic search can be a more powerful way to surface similar content for users, much like how we store incident summaries and recommend them to users in FireHydrant.
