#tts — TECH Dashboard

Entries page 1/1 · 3 total

Fri, May 29 · 2 entries

paper research 3w ago

arxiv-cs-cl

Unlocking Fine-Grained and Within-Utterance Speaking Style Control in Prompt-Based Text-to-Speech Models

Medium priority · paper/research · Papers / Benchmarks · Published May 29

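AI summary: Proposes a method that enables fine-grained, dynamic style control within a single utterance for TTS models that control speaking style through natural-language prompts.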

EN arXiv:2605.27376v1 Announce Type: new Abstract: While prompt-based text-to-speech (TTS) models enable natural language-driven speaking style control, they often provide limited fine-grained control an

#arxiv #paper #text-to-speech +5

arxiv.org →

paper research 3w ago

arxiv-cs-cl

Bridging the Stability-Expressivity Gap: Synthetic Data Scaling and Preference Alignment for Low-Resource Spoken Language Models

Medium priority · paper/research · Papers / Benchmarks · Published May 29

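AI summary: A study that resolves the trade-off between stability and expressivity in low-resource spoken language models through synthetic data scaling and preference alignment.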

EN arXiv:2605.27383v1 Announce Type: new Abstract: Spoken Language Models (SLMs) have emerged as a promising paradigm for speech synthesis by bypassing explicit grapheme-to-phoneme pipelines. However, th

#arxiv #paper #spoken-language-model +5

arxiv.org →

Sat, Dec 13 · 1 entry

NEW blog gemini 6mo ago

google-deepmind

Improved Gemini audio models for powerful voice experiences

Medium priority · technical post · Gemini / Gemma · Published Dec 13

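AI summary: Google DeepMind announced improved audio models for the Gemini API and Vertex AI. They provide new native audio dialogue, TTS, and automatic speech recognition (ASR) capabilities, enabling more natural and expressive conversational experiences. Enterprise developers can use them to build voice agents and similar applications.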

EN Improved Gemini audio models for powerful voice experiences

#deepmind #google #gemini-api +4

deepmind.google →
