#vision-language — TECH Dashboard

NEW paper research 8h ago · arxiv-cs-cl

Source-Modality Monitoring in Vision-Language Models

AI summary: This paper investigates how well vision-language models can distinguish the source modality (image vs. text) of the information they process. It analyzes the models' internal representations to show how far they can track modality provenance and where that ability breaks down, with implications for mitigating hallucination and misattribution.

#arxiv #paper #vision-language #multimodal

arxiv.org →

