Using semantic search to help viewers navigate long-form videos faster

Independent UX study, 2025

About this case study

This case study explores how video semantic search can be integrated into streaming platforms like YouTube to help users find information within videos more quickly.

HOW THINGS BEGAN

Ever spent time scrubbing through a video just to find that one part you were searching for?

My observation: users struggle to locate specific information within long-form videos

A lecture on YouTube had what I needed, but I struggled to locate the exact section, scrolling back and forth just to find the right part.

Features to improve video navigation exist, but are limited in precision & accessibility

With the growth of podcasts and other long-form content, YouTube introduced new video player features to improve content navigation. But these still come with limitations.

Method 1: Using Timestamps and Chapters to find relevant sections

How does this feature fall short?

Requires significant effort from publishers to implement effectively, if at all

Beginners or smaller channels often skip this step or create imprecise timestamps because of the work involved

Even with timestamps, sections are broad (10–15 minutes) and lack micro-level granularity

When poorly implemented by publishers, users are still left scrubbing through large sections

Method 2: Transcript-based search

Users fail to find relevant moments if they don’t match the transcript’s exact wording

Only matches exact phrases, failing to account for conceptually related content
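
The limitation above can be illustrated with a minimal sketch. It assumes transcript search behaves like a case-insensitive substring match; the `Segment` type, the `exact_phrase_search` helper, and the transcript lines are all hypothetical and invented for illustration, not taken from YouTube.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: int  # seconds from the start of the video
    text: str   # transcript text spoken in this segment


def exact_phrase_search(segments, query):
    """Return segments whose text contains the query verbatim (case-insensitive)."""
    needle = query.lower()
    return [s for s in segments if needle in s.text.lower()]


# Hypothetical transcript excerpt, invented for illustration.
transcript = [
    Segment(0, "Welcome to the lecture on neural networks."),
    Segment(95, "Dropout and weight decay keep the model from memorizing the training data."),
    Segment(240, "Next, we look at learning rate schedules."),
]

# The viewer's wording never appears in the transcript, so nothing is found,
# even though the segment at 95 seconds covers exactly this concept.
hits_concept = exact_phrase_search(transcript, "prevent overfitting")

# Only a query that repeats the speaker's own words produces a match.
hits_literal = exact_phrase_search(transcript, "weight decay")
```

Here `hits_concept` is empty while `hits_literal` contains the segment at 95 seconds: the viewer has to already know the speaker's phrasing to find the moment.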

How do users approach information search? I interviewed them to find out

Affinity mapping user insights to identify behavior patterns in viewers:

A PATTERN THAT STOOD OUT

Habituation to AI search has lowered users’ tolerance for manual, slow information retrieval.

I use NotebookLM to search within the subtitles, or ask Perplexity to open the video to the relevant part

Users leverage AI tools to bypass manual video navigation, prioritizing speed in finding specific data

If it takes too long to find the part that answers my doubt, I skip the video and ask ChatGPT instead

Users don’t have the patience for slow information retrieval and will prioritize faster alternatives (instant gratification)

AI-powered tools have recalibrated what users consider “acceptable speed” for content discovery


Instant search results are becoming a baseline expectation, and activities requiring sustained attention are increasingly seen as frustrating.

OPPORTUNITY AREA

Users need a quicker and more context-aware method to efficiently locate specific information within long videos

Idea: Using video semantic search to help users find relevant sections within content

Video semantic search allows for context-aware retrieval of video segments based on natural language queries.


I first learned about it when Tyler Angert, founder of Patina Systems, noted that the technology is being developed, but it remains largely unexplored in products.
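
To make context-aware retrieval concrete, here is a minimal sketch that ranks transcript segments by cosine similarity between a query vector and per-segment vectors. It assumes an embedding model has already turned the query and each segment into vectors; the three-dimensional vectors, the segment texts, and the `rank_segments` helper below are hand-written placeholders for illustration, not output from any real model or product.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_segments(query_vec, segments):
    """Return (score, start_seconds, text) tuples, most similar segment first."""
    scored = [
        (cosine_similarity(query_vec, vec), start, text)
        for start, text, vec in segments
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Toy vectors (assumption): the axes loosely stand for
# (regularization, optimization, introduction).
SEGMENTS = [
    (0, "Welcome to the lecture on neural networks.", [0.1, 0.1, 0.9]),
    (95, "Dropout and weight decay keep the model from memorizing the training data.", [0.9, 0.2, 0.1]),
    (240, "Next, we look at learning rate schedules.", [0.2, 0.9, 0.1]),
]

# Placeholder vector for the query "how do I prevent overfitting?"
QUERY_VEC = [0.8, 0.3, 0.0]

ranked = rank_segments(QUERY_VEC, SEGMENTS)
best_start = ranked[0][1]  # timestamp the player could seek to
```

With these placeholder vectors, the segment at 95 seconds ranks first even though it shares no words with the query, which is the behavior exact-phrase matching cannot provide.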

Example of Semantic Search implemented across Amazon Web Services for faster retrieval and search of videos in large databases.

While the tech is still evolving, it can guide the design of an ideal video experience as user expectations grow in an AI world

How can we incorporate it in streaming platforms like YouTube?

Approach 1: Video search card within the video player

In-player overlay that lets users search using keywords and highlights relevant video segments on the timeline

A floating overlay with a search field lets users enter a query, after which all relevant sections are highlighted in the video progress bar.
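
One way the highlighting step could work is sketched below, under stated assumptions: matched segments arrive as `(start, end)` pairs in seconds, and the progress bar positions each highlight by a left offset and width in percent. The `to_highlights` function and its `merge_gap` parameter are hypothetical names for this sketch, not part of any existing player API.

```python
def to_highlights(matches, duration, merge_gap=5.0):
    """Convert matched (start, end) ranges in seconds into (left_pct, width_pct) pairs.

    Ranges that overlap, or sit within merge_gap seconds of each other,
    are merged so the bar shows one highlight instead of several slivers.
    """
    merged = []
    for start, end in sorted(matches):
        # Clamp each range to the video's bounds and skip empty ranges.
        start = max(0.0, start)
        end = min(duration, end)
        if end <= start:
            continue
        if merged and start - merged[-1][1] <= merge_gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [
        (round(100 * s / duration, 2), round(100 * (e - s) / duration, 2))
        for s, e in merged
    ]


# Three matches in a 20-minute video; the first two overlap.
highlights = to_highlights([(95, 120), (118, 140), (600, 630)], duration=1200)
```

Merging nearby matches is a design choice in this sketch: it keeps the bar readable when several hits cluster in the same part of the video.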

PROS

Keeps search within the video frame; users need not exit

Timeline highlights give quick visual cues of location

Allows for fast navigation and search

CONS

Too many results will clutter the progress bar

Limited info is displayed about the search results

Approach 2: Side panel outside the video player frame

A side panel lists all relevant sections and timestamps based on the user’s query

The search field trigger can be accessed from the video’s description panel, alongside other similar features

PROS

Same UX pattern as similar features (transcripts, chapters)

Easier to browse through a list of results

CONS

Discoverability friction: the feature is nested in another panel

Extra interaction: user needs to exit the player frame

Approach 3: Paginated side panel with sections and their transcripts

Emphasizes depth in search results, showing transcript excerpts with timestamps so users can scan text

PROS

Detailed info about search results before viewing them

CONS

Extra interaction to navigate results, compared to a list

Slower approach: evaluating results takes more time

Which approach is best aligned with viewer objectives?

Evaluating all three iterations across different user goals

To ensure the design decision aligns with user preferences, I evaluated all three approaches across criteria derived from insights gathered during user interviews

Primary user goal: Speed and convenience of information retrieval

Approach 1 scores highest across Scannability and Ease of Access, making it the best aligned with the user goal of speed

While detailed information can be helpful, it slows retrieval and increases cognitive load, making information density secondary to users’ priority of effortless navigation.

Accounting for edge cases in implementation

What if there are too many thematically related results?

Solution: Introducing a ‘Search Scope’ Slider

A user-adjustable control that lets users decide how strict or broad the semantic matching should be
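
A minimal sketch of how such a slider might drive the results, assuming each segment already carries a similarity score between 0 and 1. The slider range, the `strict` and `broad` threshold values, and the function names are illustrative assumptions, not tested settings.

```python
def scope_to_threshold(scope, strict=0.85, broad=0.40):
    """Map a slider position (0.0 = strict, 1.0 = broad) to a minimum similarity score."""
    scope = min(1.0, max(0.0, scope))  # clamp out-of-range slider values
    return strict + (broad - strict) * scope


def filter_by_scope(scored_segments, scope):
    """Keep (score, start_seconds) pairs whose score meets the slider's threshold."""
    threshold = scope_to_threshold(scope)
    return [(score, start) for score, start in scored_segments if score >= threshold]


# Hypothetical similarity scores for three segments of one video.
scored = [(0.98, 95), (0.54, 240), (0.14, 0)]

strict_hits = filter_by_scope(scored, scope=0.0)  # only the closest match survives
broad_hits = filter_by_scope(scored, scope=1.0)   # loosely related segments are included too
```

Sliding toward strict trims the timeline down to the closest matches, while sliding toward broad brings thematically related segments back in.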

To wrap up, here are a few reflections

AI-driven habits are reshaping user expectations and their consumption patterns

As AI tools become more intuitive, users increasingly expect experiences that are immediate, relevant, and context-aware

Exploring speculative solutions sets the foundation for what future experiences can look like once the tech matures

Iterating without tech constraints helps establish the ideal experience, which can be adjusted based on feasibility and trade-offs

© 2025 by Vishruth
