#text-to-speech — TECH Dashboard

Entries page 1/1 · 3 total

Fri, May 29 · 1 entry

paper research · 3w ago

arxiv-cs-cl

Unlocking Fine-Grained and Within-Utterance Speaking Style Control in Prompt-Based Text-to-Speech Models

Medium priority · paper/research · Papers / Benchmarks · Published May 29

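AI summary: Proposes a method that enables fine-grained, dynamic style control within a single utterance for TTS models whose speaking style is controlled through natural-language prompts.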

EN arXiv:2605.27376v1 Announce Type: new Abstract: While prompt-based text-to-speech (TTS) models enable natural language-driven speaking style control, they often provide limited fine-grained control an

#arxiv #paper #text-to-speech +5

arxiv.org →


Thu, May 7 · 1 entry

blog codex · 1mo ago

openai-blog

Advancing voice intelligence with new models in the API

Medium priority · technical post · OpenAI / Codex · Published May 7

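AI summary: OpenAI announced a new set of voice models available through its API, improving voice AI performance. The models deliver more natural speech, lower latency, and more robust recognition, making it easier for developers to build voice agents and conversational apps.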

EN Explore new realtime voice models in the OpenAI API that can reason, translate, and transcribe speech, enabling more natural and intelligent voice experiences.

#openai #voice-ai #speech-to-text +3

openai.com →


Thu, Apr 16 · 1 entry

NEW · blog gemini · 2mo ago

google-deepmind

Gemini 3.1 Flash TTS: the next generation of expressive AI speech

Medium priority · technical post · Gemini / Gemma · Published Apr 16

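AI summary: Google DeepMind announced Gemini 3.1 Flash TTS, a next-generation, highly expressive speech synthesis model. It offers natural intonation and emotional expression, low latency, and multilingual support, and is available to developers through an API.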

EN Our newest audio model introduces granular audio tags that give you precise control to direct AI speech for expressive audio generation.

#deepmind #google #text-to-speech +3

deepmind.google →

