Related Search

Overview

Related Search automatically identifies and presents search terms that are related, based on the similarity of the results they produce. When enabled, it surfaces additional relevant queries so that users can discover related content and refine their searches more effectively.

Data Generation and Preparation

Database Creation

The process begins with the creation of an internal dataset derived from recent search activity. This dataset focuses on searches that produce strong results while excluding any queries that are intentionally marked to be omitted. Searches are grouped using shared identifiers and analyzed on a daily basis to ensure consistency and relevance.

Data Filtering and Storage

During processing, keywords whose counts fall below a defined threshold (see Threshold Configuration) are removed to maintain data quality. The system then generates summary records that capture key attributes such as the database identifier, search date, primary keys, associated keywords, and keyword counts. These summaries are stored for downstream use in generating related search results.
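
The filtering and summarization step can be sketched roughly as follows. This is an illustration only, not the product's implementation: the names SearchSummary, build_summary, and keywords_by_primary are hypothetical, the input is assumed to be (primary key, keyword) pairs from one day's grouped searches, and a keyword is assumed to survive when its count is greater than or equal to the threshold.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class SearchSummary:
    """Hypothetical shape of one stored daily summary record."""
    database_id: str
    search_date: date
    # primary search term -> {related keyword -> count}
    keywords_by_primary: dict


def build_summary(database_id, search_date, observed, threshold=1):
    """Count keywords per primary key and drop those below the threshold.

    `observed` is an assumed input: (primary_key, keyword) pairs taken
    from one day's grouped searches.
    """
    counts = {}
    for primary_key, keyword in observed:
        counts.setdefault(primary_key, Counter())[keyword] += 1

    kept = {}
    for primary_key, keyword_counts in counts.items():
        # Assumption: a keyword survives when its count is >= threshold.
        surviving = {kw: n for kw, n in keyword_counts.items() if n >= threshold}
        if surviving:
            kept[primary_key] = surviving
    return SearchSummary(database_id, search_date, kept)


pairs = [
    ("laptop", "notebook"),
    ("laptop", "notebook"),
    ("laptop", "ultrabook"),
    ("phone", "smartphone"),
]
summary = build_summary("db-1", date(2024, 5, 1), pairs, threshold=2)
print(summary.keywords_by_primary)  # {'laptop': {'notebook': 2}}
```

With the default threshold of 1, every observed keyword is kept; raising it to 2 in this example drops "ultrabook" and the entire "phone" entry.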

Related Search Generation

Aggregation Process

To generate related searches, the system analyzes search summaries from the most recent thirty-day period. It iterates through each summary, aggregating keywords under their corresponding primary search terms. Existing entries are updated with new keyword data, while new primary keys are added as they are discovered.
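
A minimal sketch of this aggregation loop, under stated assumptions: each summary is modeled as a plain dictionary with hypothetical search_date and keywords_by_primary fields, "updated" is taken to mean that counts are summed, and the thirty-day window is taken to cover today plus the 29 days before it. None of these details are confirmed by the description above.

```python
from datetime import date, timedelta


def aggregate_summaries(summaries, today, window_days=30):
    """Merge daily summaries from the last `window_days` days.

    Each summary is assumed to be a dict with a `search_date` and a
    `keywords_by_primary` mapping of primary key -> {keyword -> count}.
    """
    cutoff = today - timedelta(days=window_days)
    aggregated = {}
    for summary in summaries:
        # Assumption: the window covers today and the 29 days before it.
        if not (cutoff < summary["search_date"] <= today):
            continue
        for primary_key, keyword_counts in summary["keywords_by_primary"].items():
            # New primary keys are added on first sight; existing ones are updated.
            bucket = aggregated.setdefault(primary_key, {})
            for keyword, count in keyword_counts.items():
                # Assumption: "updated" means counts are summed.
                bucket[keyword] = bucket.get(keyword, 0) + count
    return aggregated


summaries = [
    {"search_date": date(2024, 5, 30),
     "keywords_by_primary": {"laptop": {"notebook": 2}}},
    {"search_date": date(2024, 5, 31),
     "keywords_by_primary": {"laptop": {"notebook": 1, "ultrabook": 3},
                             "phone": {"smartphone": 4}}},
    # Older than thirty days, so it is skipped.
    {"search_date": date(2024, 4, 1),
     "keywords_by_primary": {"laptop": {"netbook": 9}}},
]
result = aggregate_summaries(summaries, today=date(2024, 5, 31))
print(result)
# {'laptop': {'notebook': 3, 'ultrabook': 3}, 'phone': {'smartphone': 4}}
```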

Threshold Filtering

After aggregation, the system applies the same threshold filter to remove keywords that do not meet the minimum frequency requirements. This ensures that only meaningful and relevant relationships are retained in the final dataset.

Output Structure

Data Transformation

The aggregated data is converted into a simplified structure that maps each primary search term to its associated related keywords. This transformation enables efficient retrieval and downstream usage by indexing systems.

Final Output Format

The final output is represented as a dictionary where:

  • The key corresponds to the primary search term.
  • The value is an ordered list of related keywords sorted in descending order of frequency, with count metadata removed.
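
The transformation into this format might look like the sketch below. The function name to_related_searches and the sample data are hypothetical, and breaking ties alphabetically is an assumption, since the description only specifies descending frequency.

```python
def to_related_searches(aggregated):
    """Map each primary term to its keywords, most frequent first."""
    related = {}
    for primary_key, keyword_counts in aggregated.items():
        # Sort by count descending; ties broken alphabetically (an assumption).
        ranked = sorted(keyword_counts.items(), key=lambda item: (-item[1], item[0]))
        # Keep only the keyword, dropping the count metadata.
        related[primary_key] = [keyword for keyword, _ in ranked]
    return related


aggregated = {
    "laptop": {"notebook": 5, "ultrabook": 2, "macbook": 9},
    "phone": {"smartphone": 4, "android": 4},
}
print(to_related_searches(aggregated))
# {'laptop': ['macbook', 'notebook', 'ultrabook'], 'phone': ['android', 'smartphone']}
```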

Parameters and Constraints

Threshold Configuration

The related search threshold sets the number of similar items needed for a search to be deemed related to the original query. The default value is 1, and it can be changed. To change it, open the admin page and go to System Parameters > Reports Configuration, then edit the following parameter: Threshold for # of searches for Related Searches.

Data Scope Limitations

  • Related search data is derived only from the past thirty days; this window is not currently configurable.
  • While there is a practical limit on how many primary keys a keyword can appear under, there is no strict limit on the number of keywords associated with a single primary key.

Summary

Related Search leverages recent search behavior to identify meaningful relationships between queries. Through structured data processing, filtering, and aggregation, it produces a streamlined mapping of primary terms to relevant related keywords, improving search discovery and overall user experience.