#hallucination-detection — TECH Dashboard

NEW paper research 5h ago · arxiv-cs-cl

HalluSAE: Detecting Hallucinations in Large Language Models via Sparse Auto-Encoders

AI summary: HalluSAE proposes using sparse auto-encoders (SAEs) to extract hallucination-related features from the internal representations of large language models and to detect hallucinations from those features. According to the summary, it identifies hallucinations more accurately than existing methods and is also more interpretable.

#arxiv #paper #hallucination-detection #sparse-autoencoders

arxiv.org →


#hallucination-detection page 1/1 · 1 total
