Gemini gains music generation, letting users create original songs with Lyria / A new way to express yourself: Gemini can now create music

Google DeepMind Blog · deepmind.google · 2026/02/19 01:01 · 4mo ago · 📖 2 min

AI 3-line summary

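Google has added a music generation feature to the Gemini app.
Built on the Lyria model, it generates instrumental tracks and songs with vocals from text prompts, each up to about 30 seconds long.
Outputs also carry a SynthID digital watermark, so generated content can be identified.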

English summary

The Gemini app now features our most advanced music generation model Lyria 3, empowering anyone to make 30-second tracks using text or images.

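Google has added a new feature to the Gemini app that generates music from text prompts. It is built on Lyria, Google DeepMind's music generation model. Users describe the mood, genre, instrumentation, and similar attributes in natural language to generate instrumental tracks or songs with vocals.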

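Generated tracks are embedded with SynthID, the digital watermarking technology developed by Google DeepMind. An identifier is woven into the audio in a form the human ear cannot perceive, so the content can later be verified as AI-generated. In generative AI, where misinformation and copyright infringement are ongoing concerns, transparency measures of this kind are growing in importance.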

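As background, competition in music generation AI has intensified in recent years. Startups such as Suno and Udio took an early lead in generating songs with lyrics, and Meta (MusicGen), Stability AI (Stable Audio), and others have released their own models. Google has already built Lyria into Dream Track on YouTube Shorts and into MusicFX, an experimental tool for musicians, and the Gemini integration appears to position the model for a full-scale rollout to general consumers.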

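Extending Gemini to generate audio and music from text also means it is moving in earnest into the audio domain, following images (Imagen) and video (Veo). The apparent aim is to make Gemini a more complete multimodal conversational assistant. At the same time, rights clearance for training data and relationships with artists remain challenges for the whole industry, and Google has a history of seeking collaboration with labels and musicians. The new feature offers value as a creative support tool, and it is also likely to raise fresh questions about the relationship between generative AI and the music industry.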

Google has rolled out a new music generation capability inside the Gemini app, allowing users to compose original tracks from natural-language prompts. The feature is powered by Lyria, Google DeepMind's dedicated music generation model, and can produce both instrumental pieces and songs with vocals based on descriptions of mood, genre, instrumentation, or lyrical themes.

Under the hood, Lyria generates audio that sounds polished enough for casual creative use, while every output is embedded with SynthID, DeepMind's audio watermarking technology. SynthID inserts an inaudible signal into the waveform that can later be detected to verify the clip was AI-generated. As concerns mount around deepfakes, voice cloning, and copyright disputes in generative audio, this kind of provenance signal is becoming a baseline expectation rather than a differentiator.

The move places Gemini squarely in an increasingly crowded music-AI landscape. Startups such as Suno and Udio have built sizable user bases around text-to-song generation, while Meta's MusicGen and Stability AI's Stable Audio have pushed open or semi-open alternatives. Google itself has been seeding Lyria into adjacent products for some time, including the Dream Track experiment on YouTube Shorts and the MusicFX tool aimed at musicians. Bringing the model into Gemini appears to be the company's first concerted attempt to expose Lyria to a mainstream consumer audience inside its flagship assistant.

The addition also marks another step in Gemini's expansion across modalities. After integrating Imagen for images and Veo for video, audio and music are arguably the last major creative domain to be plugged directly into the chat interface. That progression fits a broader strategy of turning Gemini into a single multimodal surface where users can describe an idea and receive text, picture, clip, or track without switching tools — a vision that competitors such as OpenAI and Microsoft are pursuing with their own stacks.

Questions about training data and artist consent remain unresolved across the industry, and Google has previously emphasized partnerships with labels and individual musicians when piloting Lyria-based experiences. Whether the consumer rollout includes similar guardrails on style mimicry or vocal likenesses has not been spelled out in detail, though SynthID watermarking and Google's standard usage policies are likely to apply. For everyday users, the practical appeal is straightforward: a quick way to spin up background music, jingles, or playful song ideas. For the wider music ecosystem, it is another reminder that generative tools are moving from specialist apps into the default interfaces hundreds of millions of people already use.