Read on to learn about my Capstone project for Google’s Gen AI Intensive Course 2025
Frequently, when researching a design or writing an article, I want to find a specific piece of information. If I know the information is in a book, I simply open the book's index and look for a relevant word that will lead me to it.
However, when the desired information is in a video (or other linear media, such as a podcast), I have to try one of the following:

- Skim through the video, scrubbing back and forth in the hope of spotting the right section
- Download the transcript and search it for relevant words
All of these methods are inefficient. I don't even know whether the desired information is there at all, so skimming through videos or downloading transcripts is a cumbersome, resource-intensive task.
In defining this problem, I'm not just conducting me-search. Everyone has to deal with this issue when working with linear media. It's in the nature of the medium - you can't experience video or audio without spending time. Jumping from one place to another (which we can do so easily with text and images) is extremely difficult with linear media. Even if you do jump around, you then have to hit play and watch or listen to understand what you've landed on. It takes time - much longer than scanning an alphabetised book index.
Students and researchers would benefit from a tool that solves this. It should also appeal to anyone looking to quickly find something within a video - whether it's a DIYer trying to fix a plumbing emergency, a parent whose kid has just asked what a dog's reverse sneeze is, or anyone else! Their chances of finding relevant information would be considerably improved by a video index.
So with that problem in mind, the use case I’m attempting to solve is creating book-style indexes for YouTube videos.
Recent advances in generative AI are extremely useful for solving this use case:

- Multimodal models can process video and audio directly, not just text.
- Long context windows mean an entire video (or its transcript) can fit in a single prompt.
- Structured output lets a model return an index in a machine-readable format such as JSON.
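To make that concrete, here's a minimal sketch of what the core of such a tool could look like, using the google-genai Python SDK (which can pass a YouTube URL directly to a Gemini model). The model name, prompt wording, and placeholder URL are my own illustrative assumptions, not the project's actual implementation:

```python
# A minimal sketch: ask a Gemini model to build a book-style index for a
# YouTube video. Model name, prompt, and URL placeholder are assumptions.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

prompt = (
    "Create a book-style index for this video: an alphabetised list of "
    "key terms, each followed by the MM:SS timestamps where it is discussed."
)

response = client.models.generate_content(
    model="gemini-2.0-flash",  # assumed model; any multimodal Gemini model should work
    contents=types.Content(
        parts=[
            # The Gemini API accepts a YouTube URL directly as file data
            types.Part(file_data=types.FileData(file_uri="https://www.youtube.com/watch?v=VIDEO_ID")),
            types.Part(text=prompt),
        ]
    ),
)

print(response.text)  # the generated index, ready to scan like a book's
```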
Having defined my use case and considered what was possible with GenAI, I felt confident that I could build a tool.