Using semantic search to help viewers navigate long-form videos faster
Independent UX study, 2025

About this case study
This case study explores how video semantic search can be integrated into streaming platforms like YouTube to help users find information within videos more quickly.

HOW THINGS BEGAN
Ever spent time scrubbing through a video just to find that one part you were searching for?
My observation: users struggle to locate specific information within long-form videos
A lecture on YouTube had what I needed, but I struggled to locate the exact section, scrolling back and forth just to find the right part.

Features to improve video navigation exist, but are limited in precision & accessibility
With the growth of podcasts and other long-form content, YouTube introduced new video player features to improve content navigation. But these still come with limitations.
Method 1: Using Timestamps and Chapters to find relevant sections

How does this feature fall short?
Requires significant effort from publishers to implement effectively, if at all
Beginners or smaller channels often skip this step or create imprecise timestamps because of the work involved
Even with timestamps, sections are broad (10–15 minutes) and lack micro-level granularity
When poorly implemented by publishers, users are still left scrubbing through large sections
Method 2: Transcript-based search

Users fail to find relevant moments if they don’t match the transcript’s exact wording
Only matches exact phrases, failing to account for conceptually related content
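The limitation above can be illustrated with a minimal sketch of literal transcript matching. The transcript cues, the `Cue` type and the `exact_search` helper are all hypothetical and invented for illustration; this is not how YouTube implements its transcript search.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    text: str


# Hypothetical transcript cues, invented for illustration.
TRANSCRIPT = [
    Cue(62.0, "Gradient descent updates the weights step by step"),
    Cue(415.0, "If the step size is too large the loss can blow up"),
    Cue(980.0, "Next we look at regularization"),
]


def exact_search(cues, query):
    """Return start times of cues that contain the query as a literal substring."""
    needle = query.lower()
    return [cue.start for cue in cues if needle in cue.text.lower()]


# The speaker's own wording is found.
print(exact_search(TRANSCRIPT, "step size"))
# The same concept phrased differently returns nothing.
print(exact_search(TRANSCRIPT, "learning rate"))
```

A viewer who thinks in terms of "learning rate" never reaches the segment at 415 seconds, even though it answers their question.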
How do users approach information search? I interviewed them to find out
Affinity mapping user insights to identify behavior patterns in viewers:
A PATTERN THAT STOOD OUT
Habituation to AI search has lowered users’ tolerance for manual, slow information retrieval.
“I use NotebookLM to search within the subtitles, or ask Perplexity to open the video to the relevant part”
Users leverage AI tools to bypass manual video navigation, prioritizing speed in finding specific data
“If it takes too long to find the part that answers my doubt, I skip the video and ask ChatGPT instead”
Users don’t have the patience for slow information retrieval and will prioritize faster alternatives (instant gratification)
AI-powered tools have recalibrated what users consider “acceptable speed” for content discovery
Instant search results are becoming a baseline expectation, and activities requiring sustained attention are increasingly seen as frustrating.
OPPORTUNITY AREA
Users need a quicker and more context-aware method to efficiently locate specific information within long videos
Idea: Using video semantic search to help users find relevant sections within content
Video semantic search allows for context-aware retrieval of video segments based on natural language queries.
I first learned about it when Tyler Angert, founder of Patina Systems, noted that the technology is being developed, but it remains largely unexplored in products.
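To make "context-aware retrieval" concrete, here is a toy sketch of the idea: the query and each transcript segment are mapped into a shared concept space and ranked by cosine similarity. Production systems would use learned text or multimodal embeddings; the hand-written `LEXICON`, the sample segments and all function names below are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Toy phrase-to-concept lexicon: a hand-written stand-in for a learned
# embedding model. Every entry is an assumption made for illustration.
LEXICON = {
    "learning rate": "step_size",
    "step size": "step_size",
    "gradient descent": "optimization",
    "loss": "optimization",
    "overfitting": "generalization",
    "regularization": "generalization",
}

# Hypothetical transcript segments as (start time in seconds, text).
SEGMENTS = [
    (62.0, "Gradient descent updates the weights step by step"),
    (415.0, "If the step size is too large the loss can blow up"),
    (980.0, "Next we look at regularization"),
]


def embed(text):
    """Count how often each concept's phrases occur in the text."""
    lowered = text.lower()
    vector = Counter()
    for phrase, concept in LEXICON.items():
        vector[concept] += lowered.count(phrase)
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a[key] * b[key] for key in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(segments, query):
    """Rank segments by similarity to the query, dropping non-matches."""
    query_vector = embed(query)
    scored = [(start, cosine(query_vector, embed(text))) for start, text in segments]
    return sorted((hit for hit in scored if hit[1] > 0), key=lambda hit: -hit[1])


# "learning rate" never appears in the transcript, yet the segment
# about step size is retrieved because both map to the same concept.
print(semantic_search(SEGMENTS, "what learning rate should I use"))
print(semantic_search(SEGMENTS, "how do I avoid overfitting"))
```

The point of the sketch is the matching behavior, not the lexicon: a query phrased in the viewer's own words still lands on the right timestamp.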

Example of Semantic Search implemented across Amazon Web Services for faster retrieval and search of videos in large databases.
While the tech is still evolving, it can guide the design of an ideal video experience as user expectations grow in an AI world
How can we incorporate it in streaming platforms like YouTube?
Approach 1: Video search card within the video player
An in-player overlay lets users search using keywords and highlights relevant video segments on the timeline
A floating overlay with a search field lets users enter a query, after which all relevant sections are highlighted on the video progress bar.
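One way the highlighting step could work is sketched below, assuming search results arrive as (start, end) ranges in seconds. Nearby matches are merged into one highlight, which also eases progress-bar clutter, and each highlight is expressed as a percentage offset and width of the bar. The function name and the `merge_gap` parameter are hypothetical.

```python
def timeline_highlights(matches, duration, merge_gap=5.0):
    """Merge nearby matches and return (left %, width %) pairs for the progress bar.

    matches   -- list of (start, end) times in seconds
    duration  -- total video length in seconds
    merge_gap -- matches closer than this many seconds become one highlight
    """
    merged = []
    for start, end in sorted(matches):
        if merged and start - merged[-1][1] <= merge_gap:
            # Close enough to the previous highlight: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [
        (round(100 * start / duration, 2), round(100 * (end - start) / duration, 2))
        for start, end in merged
    ]


# Three matches in a 2000-second video; the first two sit 2 seconds apart.
print(timeline_highlights([(400, 420), (422, 450), (1000, 1020)], duration=2000))
```

With these sample values the first two matches collapse into a single highlight, so the viewer sees two markers on the bar instead of three.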

PROS
Keeps search within the video frame; the user need not exit
Timeline highlights give quick visual cues of location
Allows for fast navigation and search
CONS
Too many results will clutter the progress bar
Limited info is displayed about the search results
Approach 2: Side panel outside the video player frame
A side panel lists all relevant sections and timestamps based on the user’s query
The search field trigger can be accessed from the video’s description panel, alongside other similar features

PROS
Same UX pattern as similar features (transcripts, chapters)
Easier to browse through a list of results
CONS
Discoverability friction: the feature is nested in another panel
Extra interaction: user needs to exit the player frame
Approach 3: Paginated side panel with sections and their transcripts
Emphasizes depth in search results, showing transcript excerpts with timestamps so users can scan text

PROS
Detailed info about search results before viewing them
CONS
Extra interaction to navigate results, compared to a list
Slower approach: evaluating results takes more time
Which approach is best aligned with viewer objectives?
Evaluating all three iterations across different user goals
To ensure the design decision aligns with user preferences, I evaluated all three approaches across criteria derived from insights gathered during user interviews
Primary user goal: Speed and convenience of information retrieval
Approach 1 scores highest across Scannability and Ease of Access, making it the best aligned with the user goal of speed
While detailed information can be helpful, it slows retrieval and increases cognitive load, making information density secondary to users’ priority of effortless navigation.
Accounting for edge cases in implementation
What if there are too many thematically related results?

Solution: Introducing a ‘Search Scope’ Slider
A user-adjustable control that lets users decide how strict or broad the semantic matching should be
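A minimal sketch of how such a slider could behave, assuming each search result carries a similarity score between 0 and 1. The slider position is mapped linearly to a minimum score, and results below it are hidden. The threshold bounds (0.8 for strict, 0.3 for broad), the sample scores and the function names are illustrative assumptions, not values from the study.

```python
def scope_to_threshold(scope, strict=0.8, broad=0.3):
    """Map a slider position (0.0 = strict, 1.0 = broad) to a minimum similarity score."""
    scope = min(max(scope, 0.0), 1.0)  # clamp out-of-range slider values
    return strict + (broad - strict) * scope


def filter_by_scope(results, scope):
    """Keep only the (timestamp, score) results that meet the slider's threshold."""
    threshold = scope_to_threshold(scope)
    return [(timestamp, score) for timestamp, score in results if score >= threshold]


# Hypothetical scored results as (timestamp in seconds, similarity score).
RESULTS = [(415.0, 0.91), (62.0, 0.60), (980.0, 0.32)]

print(filter_by_scope(RESULTS, scope=0.0))  # strict: only the closest match
print(filter_by_scope(RESULTS, scope=0.5))  # middle: adds loosely related content
print(filter_by_scope(RESULTS, scope=1.0))  # broad: everything thematically related
```

Moving the slider toward strict reduces the number of highlights, which directly addresses the clutter concern raised for the in-player overlay.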

To wrap up, here are a few reflections
AI-driven habits are reshaping user expectations and their consumption patterns
As AI tools become more intuitive, users increasingly expect experiences that are immediate, relevant, and context-aware
Exploring speculative solutions sets the foundation for what future experiences can look like once the tech matures
Iterating without tech constraints helps establish the ideal experience, which can be adjusted based on feasibility & trade-offs