When I wrote a provocatively titled post about AI replacing podcasters, I caught the attention of readers and viewers. (Spoiler: if you are a podcaster or video pundit, your role is safe for now. But expect a lot of AI audio and video to flood social media and the Web.)
Some wanted to know exactly how I created my example video with two AI “hosts” discussing my book Friction. So, I’ll do a short tutorial here. Short, because creating these audio clips is surprisingly simple.
NotebookLM’s Audio Overviews
NotebookLM is a Google AI experiment that lets you upload content (PDFs, text files, websites, YouTube video links, copied text, etc.) to a “notebook.” Each notebook is, in essence, a folder that can contain up to 50 content items. This is all the information NotebookLM works from, which makes it far less prone to making up facts, quotes, etc. You can ask it questions about the uploads, have it write different kinds of content, create a study guide, etc.
The NotebookLM feature that has gotten the most attention of late is its ability to create “audio overviews.” These audio summaries aren’t robotic recitations of facts. Rather, they are podcast-like conversations between a male and a female voice. They are remarkably conversational and human-sounding. They hesitate. They interrupt each other. They use ummms, likes and you knows. It’s a bit eerie.
One use is reducing lengthy or complex content to something easily consumable while you drive, walk, or work out. Imagine a couple of clever people reading three long research papers on a topic and having a ten-minute conversation about the key findings, methods, etc.
How To Create a NotebookLM Audio Overview
Choose Your Content
Decide what content you would like to summarize. You can use just about any kind of content – documents, websites, YouTube links, etc. Keep in mind that you’ll get one summary, so if you want a “podcast” on a particular theme, the content should directly relate to that theme.
Sign In With Your Google ID
Use a web browser to go to NotebookLM. You’ll have to sign in with your Google ID if you aren’t already. If it’s your first visit, you’ll see several sample notebooks and a prominent “Create” button. You can explore their examples, but for this tutorial just click “Create.”
Add Your Sources
There are a variety of ways to add content to your new notebook. After you click “Create” you’ll see them:
- Drag or Choose Files. You can add files like PDFs (from a single page to an entire book), or other file types like text (.txt), Markdown, or audio (mp3, etc.). You can’t currently use Word docs or Excel sheets.
- Add From Google Drive. Either Google Docs or Google Slides will work.
- Websites and Videos. You can add a website or a specific web page. Or, you can add a link to a YouTube video.
- Text. You can add text by pasting it from your clipboard or just typing it in.
Repeat the process to add more sources. You can add up to 50, but most likely one to five will serve most purposes.
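If you have a folder full of candidate files, it can help to sort out what will and won’t upload before you start dragging. Here’s a minimal Python sketch of that check. It doesn’t talk to NotebookLM at all; it only looks at file names. The function name plan_upload and the extension list are my own assumptions, based on the file types listed above (the article says “mp3, etc.” for audio, so other audio formats are deliberately not guessed at).

```python
from pathlib import Path

# Assumption: extensions inferred from the file types named in this tutorial
# (PDF, plain text, Markdown, mp3). This is not an official NotebookLM list.
ACCEPTED_EXTENSIONS = {".pdf", ".txt", ".md", ".mp3"}

# The per-notebook source limit mentioned in this tutorial.
MAX_SOURCES = 50


def plan_upload(filenames, max_sources=MAX_SOURCES):
    """Split file names into (upload, skipped) lists.

    Hypothetical helper: a file is skipped if its extension is not in
    ACCEPTED_EXTENSIONS (Word docs, Excel sheets, etc.) or if the
    notebook would already hold max_sources items.
    """
    upload, skipped = [], []
    for name in filenames:
        extension = Path(name).suffix.lower()
        if extension in ACCEPTED_EXTENSIONS and len(upload) < max_sources:
            upload.append(name)
        else:
            skipped.append(name)
    return upload, skipped


if __name__ == "__main__":
    candidates = ["trends-2023.pdf", "trends-2024.PDF", "notes.docx", "budget.xlsx"]
    upload, skipped = plan_upload(candidates)
    print("Upload:", upload)
    print("Skipped:", skipped)
```

Anything that lands in the skipped list would need converting first, for example saving a Word doc as a PDF, before you add it as a source.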
Create Your Audio Overview
Once your content is uploaded, you’ll see the “Notebook Guide” screen. It provides a short summary of your sources and offers some options like creating an FAQ list, a study guide, etc. It also has some suggested questions and a text box where you can tell it what you want – an answer to a question, a LinkedIn post, a detailed article, etc.
In this case, I uploaded two PDFs, each about 30 pages, on trends in the speaking industry in 2023 and 2024. It’s clever – unprompted, its first suggested question was, “How has the speaking industry changed from 2023 to 2024?”
But today we want to create an audio overview. So, we have just one thing to do: click the “Generate” button. You’ll be warned that generation takes a few minutes.
Play, Share or Download Your Audio Overview
Once your audio overview is complete, a player will appear.
In this case, the audio summary for the two lengthy PDFs runs just under ten minutes. And, it’s not bad. With no instruction, NotebookLM noticed that the reports covered similar data for two years. So, the hosts start by zeroing in on the jump in preference for in-person events over virtual.
Next, they discuss what’s hot in 2024. Amusingly, when the male host gets to the third hot topic, AI, he says it has him “concerned about his job security.”
If you like the results you can share them by copying a link, as I just did above. You can download the audio file and embed it on a website, combine it with other audio or video, etc.
Here’s a video tutorial of the same process I described above:
Limitations of NotebookLM Audio Overviews
NotebookLM can produce jaw-droppingly realistic audio, but it does have some limitations. You can’t control the number or gender of the speakers. At the moment, it’s English only. You can’t control the length of the audio – most seem to be more than five minutes but less than 15 minutes. The audio is all one track, so matching with video avatars is time-consuming.
And, as realistic as the manner of speech is, there are some AI tells. Nearly every overview seems to involve “a deep dive,” “delving into,” or some combination of those. The sample I generated for this article dove so deep that the host mentioned “bubbles.” That’s a deep dive!
The hosts can also be overly enthusiastic, finding the most mundane things to be incredibly exciting and important. That might work in some scenarios, such as talking about your new product, but seems weird in others.
Guiding the Audio Overview – Nope!
As an experiment, I created a new notebook with the same two PDFs and added some pasted text:
“Audio Overview Instructions. Keep the audio overview short, less than 6 minutes. Do not mention diving, deep dive, delving, delve into and related terms. Recently some corporations have scaled back DEI initiatives, so focus on DEI’s popularity as a speaking topic. If you have time, also discuss the outlook for marketing speakers.”
I wanted to see if I could influence the overview’s length and content.
It was a failure. I found the instructions had no apparent impact on the final audio. The new audio wasn’t shorter; it actually ran over 12 minutes. And, it ignored the instruction to focus first on specific topics. It did avoid “diving” and “delving,” but that could be coincidence.
I hope Google adds the ability to control more aspects of the audio overviews. A single speaker option and split audio tracks for two speakers would be a good start. Video avatars would increase the channels where content could be shared. More controls will change NotebookLM audio overviews from a novelty into a useful tool.
Don’t Forget The Rest Of NotebookLM
Audio overviews are getting a lot of attention, but there are more useful functions built into NotebookLM. When it answers your questions or gets quotes from text, it actually shows where the information came from. It’s much better at extracting accurate quotes, for example, than ChatGPT or Claude.
Come to NotebookLM for the “podcasts,” stay for the study guides, FAQs, accurate summaries, spot-on quotations, and more.