Site Search Personalization

Purpose and Integration Options

Personalized Search improves relevance by tailoring results to individual user preferences, increasing engagement and conversion.

Raptor Site Search Personalization can be implemented in two ways:

With Raptor’s hosted Typesense solution (default)
With the customer’s own search engine, provided it supports vector similarity / vector search

This document first explains how personalization works in Typesense, then describes how to integrate with external search engines.

A) Personalized Site Search with Typesense

1) Non-Personal Ranking (No Personalization)

Without personalization, results are typically ranked by:

Text relevance (keyword match)
RaptorScore (global popularity / general ranking)

In practice, many results share the same text relevance score. In those cases, RaptorScore determines the order within that group.

2) RaptorScore

RaptorScore is a non-personal ranking signal, typically driven by overall product or content popularity, performance, and merchandising boosts.

It is controlled by enabling one of the following modules:

UpdateSearchIndexRank (e-commerce sites)
UpdateSearchIndexRankContent (content sites)

The configuration of the enabled module determines how results are sorted within the same text relevance group.

3) Personalized Ranking (With Personalization)

When personalization is enabled, keyword relevance remains the primary filter. Personalization only affects ordering within a group of results that share the same text relevance:

Typesense computes a vector similarity score between the user vector and each item vector.
Items that meet the personalization threshold are promoted and ranked by vector score.
Items below the threshold fall back to ranking by RaptorScore.

Effective behavior:

Text relevance group
├── Personally relevant items (sorted by vector score)
└── Remaining items (sorted by RaptorScore)
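
The ordering above can be illustrated with a small sketch. This is not how Typesense implements ranking internally; it is a plain-Python model of the described behavior. The Item class, its field names, and the use of cosine similarity with a similarity threshold are assumptions made for illustration only.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Item:  # hypothetical structure, for illustration only
    item_id: str
    text_score: int          # text relevance group (higher is better)
    raptor_score: float      # global, non-personal ranking signal
    embedding: list[float]   # item vector


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(items: list[Item], user_vector: list[float], threshold: float) -> list[Item]:
    """Order by text relevance, then personal items, then the fallback signal."""
    def sort_key(item: Item):
        similarity = cosine_similarity(user_vector, item.embedding)
        is_personal = similarity >= threshold
        # Within a text relevance group: personally relevant items come first
        # (sorted by vector score), the rest are sorted by RaptorScore.
        secondary = similarity if is_personal else item.raptor_score
        return (-item.text_score, 0 if is_personal else 1, -secondary)

    return sorted(items, key=sort_key)
```

In this sketch an item with a lower text relevance never overtakes an item with a higher one, regardless of how well it matches the user vector.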

Threshold Configuration

The personalization threshold is not fixed by Raptor. It is defined by the implementation partner in each query sent to Typesense. This allows partners to tune the balance between personalization strength and keyword relevance based on their business requirements.
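
As a minimal sketch of setting the threshold per query, the helper below builds search parameters with a vector_query that carries a distance_threshold, following Typesense's documented vector search syntax. The field names "embedding" and "title" and the helper itself are hypothetical, and the sort configuration is deployment-specific and deliberately omitted. Note that a Typesense distance threshold is a distance (lower means more similar), so verify the direction and the exact syntax against the Typesense version in use.

```python
def build_search_params(keyword, user_embedding, distance_threshold, k=100):
    """Build search parameters carrying the partner-defined threshold.

    Assumptions: the item vector field is named "embedding" and keyword
    matching runs against a "title" field. Both names are hypothetical.
    """
    vector = ", ".join(str(float(v)) for v in user_embedding)
    return {
        "q": keyword,
        "query_by": "title",
        "vector_query": (
            f"embedding:([{vector}], k:{k}, "
            f"distance_threshold:{distance_threshold})"
        ),
    }
```

Because user embeddings can be long, parameters like these are usually sent in a POST body rather than a URL query string.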

4) Recommended Implementation Flow (Typesense)

To avoid adding latency at search time:

1. When the user becomes known:

Fetch the latest user embedding from Raptor’s user-embedding endpoint.

2. Cache the user embedding:

Typically for the duration of the session.

3. At search time:

Send the keyword and the cached user embedding to Typesense, using a query configuration that prioritizes:

- text relevance first
- then vector score (where applicable)
- otherwise RaptorScore

🔍 Note:

Fetching, caching, and sending user embeddings is the responsibility of the implementation partner.
Invalidate cached embeddings when the user logs out or after a configurable time window to avoid stale personalization.
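
The fetch, cache, and invalidate steps can be sketched as a small in-memory cache with a time-to-live. This is a sketch under stated assumptions, not Raptor's client: the class name, the fetch callable (standing in for a call to the user-embedding endpoint), and the 30-minute default are all hypothetical.

```python
import time
from typing import Callable


class UserEmbeddingCache:
    """Caches user embeddings so search requests do not wait on a fetch."""

    def __init__(
        self,
        fetch: Callable[[str], list[float]],   # hypothetical call to the user-embedding endpoint
        ttl_seconds: float = 1800.0,           # assumed time window; tune per site
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, list[float]]] = {}

    def get(self, user_id: str) -> list[float]:
        """Return the cached embedding, refetching if missing or expired."""
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        embedding = self._fetch(user_id)
        self._entries[user_id] = (now, embedding)
        return embedding

    def invalidate(self, user_id: str) -> None:
        """Drop the cached embedding, for example when the user logs out."""
        self._entries.pop(user_id, None)
```

Calling get when the user becomes known warms the cache, so the search request itself only reads from memory.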

⚠️ Warning: If you run embedding sync jobs, schedule them during off-peak hours to avoid impacting search performance.

5) Model Updates and Versioning

5.1 Real-Time Behavior

Raptor processes behavioral data in real time. A fetched user embedding reflects the latest known behavior at the time of retrieval.

5.2 Weekly Retraining + Nightly Rollout

We retrain/recompute the model approximately once per week to ensure:

New items receive embeddings.
Updated behavioral patterns are reflected in item-to-item relationships.

After retraining:

Item embeddings are updated in the search index.
The user-embedding endpoint is updated to produce user embeddings for the new model version.

🔍 Note: Rollouts happen overnight to ensure consistency between user embeddings and item embeddings. This prevents mismatches that could degrade personalization quality, as both embeddings must originate from the same model version.

6) Model Configuration Recommendations

1. Training signal for item similarity:

Use 'visit' tracking so the model learns which items/content are similar based on actual behavior.

2. Scope (cookie/session):

Use cookie scope (optionally session scope) to place a natural time boundary on behavioral context per user.

3. Sparse data (often B2B):

Extend the model with metadata such as category path and brand to provide additional similarity signals.

4. Many users (often B2C):

Avoid adding metadata, as behavioral data is often sufficient and metadata can introduce noise or bias.

B) Integration with External Search Engines (Without Typesense)

Many search engines now support vector similarity as part of their standard feature set. In this setup, Raptor can expose:

Item embeddings via API (to be stored in the customer’s search index).
User embeddings via API (to personalize per user).

Key Principle

Regardless of the approach, the keyword query must remain the gatekeeper for what is eligible to appear. Highly personalized items should not surface if they do not match the query intent/terms.

Ranking Strategies (External Search)

Two common approaches:

1. Hierarchical ranking (like Typesense):

text relevance → personalization within the text group → fallback to global ranking.

2. Blended scoring:

Combine text relevance and personalization, with guardrails ensuring text relevance remains a hard filter or minimum requirement.
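
A minimal sketch of blended scoring with a guardrail follows. It assumes both scores are normalized to the range 0 to 1; the function name, the minimum text score, and the personalization weight are illustrative values, not Raptor recommendations.

```python
from typing import Optional


def blended_score(
    text_score: float,
    vector_similarity: float,
    min_text_score: float = 0.5,          # guardrail: assumed minimum text relevance
    personalization_weight: float = 0.3,  # assumed share given to personalization
) -> Optional[float]:
    """Blend text relevance and personalization; None means not eligible."""
    if text_score < min_text_score:
        # The keyword query stays the gatekeeper: a strong personal match
        # cannot rescue an item that does not match the query.
        return None
    return (
        (1.0 - personalization_weight) * text_score
        + personalization_weight * vector_similarity
    )
```

Items that return None are dropped before sorting, so personalization only reorders results that already satisfy the query.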

Embedding Sync and Model Version Alignment

When integrating into an external search engine, you must:

Fetch item embeddings and load them into your search index.
Periodically check for a newer model version (updated embeddings).
After updating item embeddings in your index, call a Raptor endpoint so we switch the user-embedding endpoint to the corresponding model version.

This ensures user embeddings returned in real time always match the item-embedding version stored in your index.
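
The three steps can be sketched as a single sync function. The callables passed in are hypothetical placeholders for the Raptor API calls and for your own index loader; their names and signatures are assumptions, not part of Raptor's API. The point of the sketch is the ordering: the user-embedding endpoint is switched only after the index has been updated.

```python
from typing import Callable, Optional


def sync_item_embeddings(
    current_version: Optional[str],
    get_latest_version: Callable[[], str],                           # hypothetical version check
    fetch_item_embeddings: Callable[[str], dict[str, list[float]]],  # hypothetical embedding fetch
    load_into_index: Callable[[dict[str, list[float]]], None],       # your own index loader
    activate_version: Callable[[str], None],                         # hypothetical switch call
) -> Optional[str]:
    """Return the model version the index holds after the sync."""
    latest = get_latest_version()
    if latest == current_version:
        return current_version  # already up to date, nothing to do

    embeddings = fetch_item_embeddings(latest)
    load_into_index(embeddings)

    # Switch user embeddings only after the index is updated, so both sides
    # always originate from the same model version.
    activate_version(latest)
    return latest
```

If loading into the index raises an error, the switch call is never reached, so user embeddings stay on the version the index still holds.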

💡 Tip: Consider automating version checks and updates via scheduled jobs.

Ongoing Optimization

Personalized Search is a powerful way to improve relevance and user experience, but it is not a “set-and-forget” feature.

To achieve the best results, expect an iterative process that includes:

- continuous monitoring,
- periodic model updates, and
- small configuration adjustments over time.

These refinements ensure personalization remains aligned with evolving user behavior and business goals.

With proper implementation and ongoing optimization, personalization can deliver significant gains in engagement and conversion.