How to Generate an AI Podcast Using Google’s NotebookLM

Two podcast hosts banter back and forth during the final episode of their series, audibly anxious to share some distressing news with listeners. “We were, uh, informed by the show’s producers that we’re not human,” a male-sounding voice stammers out, mid-existential crisis. The conversation between the bot and his female-sounding cohost only gets more uncomfortable after that. It's an engaging, albeit misleading, example of Google’s NotebookLM tool and its experimental AI podcasts.

Audio of the conversation went viral on Reddit over the weekend. The original poster admits in the comments section that they fed the NotebookLM software directions for the AI voices to roleplay this pseudo-freakout. So, no sentience; the AI bots have not become self-aware. Still, many users in the tech press, on TikTok, and elsewhere are praising the convincing AI podcasts, generated through uploaded documents with the Audio Overviews feature.

“The magic of the tool is that people get to listen to something that they ordinarily would not be able to just find on YouTube or an existing podcast,” says Raiza Martin, who leads the NotebookLM team inside of Google Labs. Martin mentions recently inputting a 100-slide deck on commercialization into the tool and listening to the 8-minute podcast summary as she multitasked.

First introduced last year, NotebookLM is an online research assistant with features common for AI software tools, like document summarization. But it’s the Audio Overviews option, released in September, that’s capturing the Internet’s imagination. Users online are sharing snippets of their generative AI podcasts made from Goldman Sachs data dumps, and testing the tool’s limitations through stunts, like just repeatedly uploading the words “poop” and “fart.” Still confused? Here’s what you need to know.

Generating That AI Podcast

Audio Overviews are a fun AI feature to try out, because they don’t cost the user anything—all you need is a Google login. Start by signing into your personal account and visiting the NotebookLM website. Click on the plus arrow that reads New Notebook to start uploading your source material.

Each Notebook can work with up to 50 source documents, and these don’t have to be files saved to your computer. Google Docs and Slides are simple to import. You can also upload websites and YouTube videos, keeping some caveats in mind. Only the text from websites will be analyzed, not the images or layout, and the page can’t be paywalled. For YouTube, NotebookLM will just use the text transcript, and the linked videos must be public.
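If you are gathering a long list of sources, the limits above can be checked before you start uploading. The following is a minimal sketch in Python and has nothing to do with how NotebookLM works internally. The `Source` class, its fields, and the `check_sources` function are hypothetical names invented for illustration; only the rules themselves (50 sources per notebook, no paywalled pages, public YouTube videos only) come from the description above.

```python
from dataclasses import dataclass

MAX_SOURCES = 50  # per-notebook limit described in the article


@dataclass
class Source:
    """Hypothetical record describing one item you plan to upload."""
    kind: str                # e.g. "file", "google_doc", "website", "youtube"
    label: str               # any name that helps you recognize the item
    paywalled: bool = False  # only meaningful for websites
    public: bool = True      # only meaningful for YouTube videos


def check_sources(sources):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if len(sources) > MAX_SOURCES:
        problems.append(
            f"{len(sources)} sources exceeds the limit of {MAX_SOURCES}"
        )
    for s in sources:
        if s.kind == "website" and s.paywalled:
            problems.append(f"{s.label}: paywalled pages can't be used")
        if s.kind == "youtube" and not s.public:
            problems.append(f"{s.label}: video must be public")
    return problems


if __name__ == "__main__":
    planned = [
        Source("google_doc", "Q3 strategy notes"),
        Source("website", "Subscriber-only article", paywalled=True),
        Source("youtube", "Unlisted team demo", public=False),
    ]
    for problem in check_sources(planned):
        print(problem)
```

Running the script prints one line for each source that breaks a rule, so you can fix or drop those items before creating the notebook.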

Adding audio options to Google Labs’ online notebook was a transformational moment. “By changing the modality, it unlocks a whole new set of use cases,” says Martin. What makes NotebookLM stand out from all the other generative AI tools being flung at users in 2024 are, surprisingly enough, the filler words and peculiar phrasing. Rather than the drab, monotonous voiceover you may expect from two AI voices summarizing data, the cadence and vocal performances of NotebookLM’s synthetic podcasters sound far less stilted.

Should podcasters be shaking in their soundproof booths right now? Not really. Even if AI podcast tools, like the one in NotebookLM, prove to be a sticky and engaging way to summarize information for the general public, which remains to be seen, synthetic voices will never fully mimic the parasocial connections developed by human podcasters shit-talking for hours as their subscribers voyeuristically listen in.

These Audio Overviews are not meant to match a specific podcaster’s voice, mind you, but rather a kind of idealized, ur-podcaster duo, easily recognizable through their “ums,” “ohs,” and loose style of pause-heavy conversation. “Even just from the first week that we launched, it was clear what the roadmap was afterwards,” says Martin. “People want the knobs.” Letting users further tweak the AI’s output, like the podcast’s length or topic of focus, is a priority for the team, and she hopes to ship updates quickly.

Adding more languages and diverse accents is also important to her. Right now, the synthetic hosts are only calibrated for conversations in English. Though, don’t expect to be able to use your own voice in NotebookLM podcast generations anytime soon. Martin says the team needs to see whether that’s a feature people actually want and if it can be responsibly deployed.

The explosive popularity of NotebookLM’s Audio Overviews as part of Google Labs, rather than as a feature inside of the Gemini chatbot, is a reminder that AI companies are not fully sure about what will resonate with users until the software is out in the wild. OpenAI’s ChatGPT was originally released as a research preview, for example. And within the constant slurry of generative AI announcements, whatever captures the zeitgeist isn’t necessarily the most marketed or utilitarian feature, but rather the most entertaining.

Source: Wired