ICG: Improving Cover Image Generation via MLLM-based Prompting and Personalized Preference Alignment

arXiv cs.CL · arxiv.org · 2026/05/29 13:00

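AI 3-line summary

Proposes ICG, a method that combines an MLLM with a diffusion model to personalize cover image generation for articles and videos according to user preferences.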

English summary

arXiv:2605.27374v1 Announce Type: new Abstract: Recent advances in multimodal large language models (MLLMs) and diffusion models (DMs) have opened new possibilities for AI-generated content.
Yet, pers

This paper, published on arXiv (2605.27374), proposes ICG, a framework that combines multimodal large language models (MLLMs) with diffusion models (DMs) to improve cover image generation for articles and videos. The MLLM serves as a structured prompt generator: it interprets the semantics of the content and automatically constructs the generation prompt that guides the diffusion model. The aim is to produce more contextually relevant visuals than generic text-to-image pipelines.
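The two-stage flow described above (an MLLM writes the prompt, a diffusion model renders it) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `Content`, `stub_mllm`, `stub_diffusion`, and `generate_cover` are hypothetical names, and both model stages are replaced by toy stand-ins so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Content:
    """An article or video that needs a cover image (hypothetical schema)."""
    title: str
    body: str
    kind: str = "article"  # or "video"


# Placeholders for the two model stages; neither is a real library API.
PromptGenerator = Callable[[Content], str]  # stands in for the MLLM
ImageGenerator = Callable[[str], bytes]     # stands in for the diffusion model


def stub_mllm(content: Content) -> str:
    """Toy stand-in for the MLLM: turn content into a text-to-image prompt."""
    # A real MLLM would interpret the content's semantics. Here we only
    # combine the title with the first few words of the body.
    gist = " ".join(content.body.split()[:8])
    return f"Cover image for the {content.kind} '{content.title}': {gist}"


def stub_diffusion(prompt: str) -> bytes:
    """Toy stand-in for the diffusion model: return fake image bytes."""
    return f"<image for: {prompt}>".encode("utf-8")


def generate_cover(
    content: Content,
    prompt_generator: PromptGenerator,
    image_generator: ImageGenerator,
) -> Tuple[str, bytes]:
    """Run the two stages in order and return (prompt, image bytes)."""
    prompt = prompt_generator(content)
    return prompt, image_generator(prompt)
```

Passing the two stages in as callables keeps the sketch independent of any particular model: a real MLLM client or diffusion pipeline could be substituted for the stubs without changing `generate_cover`.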

A key contribution is personalized preference alignment, which tailors outputs to each user's aesthetic preferences rather than optimizing a single global objective. This places ICG at the intersection of MLLM-based reasoning and preference-aware generative modeling. For benchmark results, datasets, experimental settings, and implementation details, consult the source paper.
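The difference between a per-user objective and a single global one can be shown with a small re-ranking sketch. The summary does not describe how ICG performs its alignment, so this example swaps in a much simpler technique: it scores candidate covers against a per-user profile using cosine similarity. The feature vectors, the profile construction, and the function names (`user_profile`, `pick_for_user`) are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def user_profile(liked: Sequence[Sequence[float]]) -> List[float]:
    """Average the feature vectors of covers a user liked (assumed signal)."""
    if not liked:
        raise ValueError("need at least one liked cover")
    dim = len(liked[0])
    return [sum(vec[i] for vec in liked) / len(liked) for i in range(dim)]


def pick_for_user(
    candidates: Dict[str, Sequence[float]], profile: Sequence[float]
) -> str:
    """Return the name of the candidate cover closest to the user's profile."""
    return max(candidates, key=lambda name: cosine(candidates[name], profile))


# Hypothetical 2-d style features: [warmth, minimalism].
candidates = {"warm_busy": [0.9, 0.1], "cool_minimal": [0.1, 0.9]}
alice = user_profile([[1.0, 0.0], [0.8, 0.2]])  # prefers warm covers
bob = user_profile([[0.0, 1.0]])                # prefers minimal covers

alice_choice = pick_for_user(candidates, alice)
bob_choice = pick_for_user(candidates, bob)
```

With the same candidate set, the two users receive different covers, which is the behavior a single global score cannot provide. A training-time alignment method would instead adjust the generator itself. This sketch only selects among outputs that have already been generated.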

#arxiv #paper #image-generation #multimodal-llm #diffusion-models #personalization #preference-alignment

Source: arXiv cs.CL T1
Source Avg: ★ 2.0
Type: Paper
Importance: ★ Normal (top 93% in Papers / Benchmarks)
Half-life: 🏛️ Long-term (architecture)
Lang: EN
Collected: 2026/05/30 07:00

The body text and summaries on this page are generated automatically by AI. Please check the original article (arxiv.org) for accuracy.
