Google announced earlier that its AI assistant Gemini has officially integrated Lyria 3, the latest-generation music generation model developed by the DeepMind team. This means that users no longer need any professional music theory knowledge. With just a simple text prompt, a photo, or even a video, Gemini can "sing" a 30-second high-fidelity music clip directly in the chat box.
It not only writes songs, but also prepares the lyrics and album cover for you.
In the past, creating music required a complex process involving arrangement, lyric writing, and recording. With Lyria 3, however, Gemini has significantly lowered the barrier to entry for music creation.
According to Google's demonstration, users can simply enter a colloquial prompt such as "a funny R&B ballad about a sock finding its other half," and the system can produce music of fairly high quality.
If you have more specific requirements for the music, Lyria 3 also supports finer-grained control. For example, you can explicitly ask it to change the tempo of a specific section, adjust the drum style, or change the overall mood of the music.
What's even more impressive is that Gemini's music generation isn't limited to text. Users can upload a sunset photo or a short video clip and have the AI generate background music that matches the visual atmosphere. Once the song is generated, the system automatically calls Google's Nano Banana image model to create a custom album cover for it, making the entire creative experience more complete.
Integration with YouTube Shorts, with SynthID watermarking to prevent misuse
This technology is not limited to the Gemini web version. Google has also announced that Lyria 3 will be integrated into YouTube's "Dream Track" feature, allowing creators to quickly generate highly detailed background music for their short videos.
Of course, the most sensitive issues regarding AI-generated music are copyright disputes and authenticity.
To prevent AI-generated music from being maliciously misused or passed off as real human work, all 30-second audio tracks generated by Lyria 3 will have Google's SynthID digital watermark mandatorily embedded at the underlying level. This watermark is imperceptible to the human ear, but the SynthID Detector tool that Google introduced at last year's Google I/O developer conference can readily identify whether a piece of audio was generated by a machine.
First impressions: the accompaniment is amazing, but the vocals and lyrics still have room for improvement
According to initial feedback from hands-on tests by overseas media, Lyria 3 performs exceptionally well on instrumental tracks, producing melodies with a high degree of layering and realism. However, the lyrics and vocals automatically generated by the AI still sometimes sound cheesy or have an unnatural, mechanical feel. This is an area where users may need to experiment and adjust repeatedly in practice.
This music generation feature is rolling out starting today to Gemini users aged 18 and over worldwide. Initially, it supports eight languages: English, Spanish, German, French, Hindi, Japanese, Korean, and Portuguese.



