Lyria 3: All About Google's AI-Powered Tool That Lets You Create Songs From Text, Photos And Videos

Google said that the goal of Lyria 3 is not to produce studio-ready songs, but to enable fast, creative self-expression.

Lyria 3 creates 30-second AI-generated music tracks.
Quick Read
Summary is AI-generated, newsroom-reviewed
  • The AI automatically generates lyrics and lets users control genre, tempo, mood, and vocal style
  • Lyria 3 analyses photos and videos to create music reflecting the mood or story in the input
  • Tracks include AI-generated cover art and can be shared or downloaded via the Gemini app

When Google first introduced AI tools for text, images and video, the focus was largely on productivity and visual creativity. Now, the company is pushing deeper into another frontier: music. Its latest model, Lyria 3, signals how quickly generative AI is expanding into new forms of human expression. But contrary to expectations that AI would soon compose full albums or orchestral scores, Lyria 3 is currently designed for something more immediate: short, personalised music creation.

What Is Lyria 3?

According to a Google blog post, Lyria 3 is the newest generative music model from Google DeepMind, rolling out in beta inside the Gemini app. The system allows users to create 30-second music tracks simply by describing an idea in text or uploading an image or video for inspiration.

The blog post also gives an example prompt: "A comical R&B slow jam about a sock finding its match."

Within seconds, the AI generates a short track complete with vocals, instrumentation and lyrics.

The goal is not to produce studio-ready songs, but to enable fast, creative self-expression - something closer to musical messaging than professional composition.

Key Features Explained

Google says Lyria 3 improves significantly over earlier versions of its music models in three major ways:

1. Automatic Lyrics Generation: Users no longer need to provide lyrics. Gemini writes them automatically based on the prompt, matching the theme, tone and style requested.

2. Greater Creative Control: Users can guide multiple musical elements, including:

  • Genre (pop, R&B, afrobeat, electronic, etc.)
  • Tempo and energy level
  • Vocal style
  • Mood and narrative

This makes the interaction feel more collaborative than earlier AI music tools.

3. More Realistic and Complex Audio: Google claims improvements in musical layering and sound quality, producing tracks that feel more coherent and polished despite the short duration.

Music From Photos and Videos

One of the more distinctive capabilities is multimodal generation.

Users can upload:

  • A photo
  • A short video
  • Personal memories or references

The AI then analyses the content and creates a track with lyrics reflecting the mood or story.

For instance, uploading images of a pet hiking could result in a custom song about that experience.

This reflects a broader industry shift toward AI systems that combine multiple types of input - text, visuals and audio - into a single creative workflow.

Tracks generated through the Gemini app are limited to about 30 seconds and come with AI-generated cover art created by Google's Nano Banana image model. Users can download the tracks or share them through links.

Google's positioning is clear: This is meant to be fun, fast and social, not necessarily a replacement for professional music production.

That design choice mirrors how short-form video transformed content creation. Instead of aiming for cinematic quality, tools prioritise speed and accessibility.

Integration With YouTube Creators

Lyria 3 is also being integrated into YouTube through Dream Track, a feature that helps creators generate custom soundtracks for Shorts.

Initially launched in the United States and expanding to more regions, the technology allows creators to produce:

  • Short lyrical segments
  • Background music
  • Personalised audio themes

For short-form creators, music is often a critical part of engagement, and AI-generated soundtracks could reduce dependence on licensed audio libraries.

Why Lyria 3 Matters in the AI Race

While a 30-second music generator may seem modest compared to large language models, it reflects a deeper trend: AI systems are rapidly becoming multimodal creative engines.

Companies including OpenAI and Google are competing to build platforms that can generate text, images, video and audio within a single interface. The pace of releases has accelerated dramatically, with major upgrades appearing every few months.

Three broader shifts define the current AI landscape:

  • Creativity at scale: AI moving beyond productivity into entertainment and art
  • Multimodal interaction: Combining text, visuals and audio seamlessly
  • Consumer accessibility: Advanced tools reaching everyday users, not just professionals

Music generation is particularly complex because it involves timing, structure and emotional nuance.

Lyria 3 may not yet compose full symphonies, but the speed at which these capabilities are arriving suggests one thing clearly: The AI evolution curve is still climbing.
