Timeline · p.2 — TECH Dashboard

Timeline page 2/16 · 459 total

TODAY 30 entries

NEW paper research 5h ago · arxiv-cs-cl

LiFT: Does Instruction Fine-Tuning Improve In-Context Learning for Longitudinal Modelling by Large Language Models?

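AI summary: LiFT examines whether instruction fine-tuning of large language models improves their in-context learning ability for longitudinal (time-series) modelling. It compares instruction-tuned models with base models and evaluates the effect on capturing long-term data patterns.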

EN LiFT investigates whether instruction fine-tuning improves in-context learning performance of large language models on longitudinal modelling tasks, comparing instruction-tuned and base models on capturing temporal data patterns.

#arxiv #paper #instruction-tuning #in-context-learning

arxiv.org →

NEW paper research 5h ago · arxiv-cs-cl

QU-NLP at QIAS 2026: Multi-Stage QLoRA Fine-Tuning for Arabic Islamic Inheritance Reasoning

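AI summary: A study applying multi-stage QLoRA fine-tuning to Arabic Islamic inheritance law reasoning for the QIAS 2026 shared task. A staged training strategy achieved high accuracy on complex legal reasoning tasks.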

EN QU-NLP's submission to QIAS 2026 applies multi-stage QLoRA fine-tuning for Arabic Islamic inheritance reasoning, achieving strong performance on complex legal reasoning tasks through progressive training.

#arxiv #paper #qlora #arabic-nlp

arxiv.org →

NEW paper research 5h ago · arxiv-cs-cl

Measuring Representation Robustness in Large Language Models for Geometry

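AI summary: This paper proposes a method for measuring the robustness of the internal representations of large language models on geometry problems. It analyzes how stable intermediate-layer embeddings remain when the model is given meaning-preserving paraphrases of the problem statement, quantifying the fragility of LLM reasoning.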

EN This paper proposes a method for measuring the robustness of internal representations in large language models when solving geometry problems, analyzing how embeddings shift under semantically equivalent rephrasings to quantify reasoning fragility.

#arxiv #paper #llm #geometry

arxiv.org →

NEW paper research 5h ago · arxiv-cs-cl

Injecting Structured Biomedical Knowledge into Language Models: Continual Pretraining vs. GraphRAG

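AI summary: A study comparing two methods of injecting biomedical knowledge into language models: continual pretraining and GraphRAG. It examines how structured medical knowledge graphs can be used and evaluates the differences in each method's performance and applicable scenarios.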

EN This paper compares two approaches for injecting structured biomedical knowledge into language models: continual pretraining and GraphRAG, evaluating their respective performance and use cases.

#arxiv #paper #biomedical-nlp #graphrag

arxiv.org →

NEW paper research 5h ago · arxiv-cs-cl

HalluSAE: Detecting Hallucinations in Large Language Models via Sparse Auto-Encoders

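AI summary: HalluSAE proposes a method that uses sparse autoencoders (SAEs) to extract hallucination-related features from the internal representations of large language models and detect hallucinations. It identifies hallucinations more accurately than existing methods and also improves interpretability.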

EN HalluSAE proposes using sparse auto-encoders to extract hallucination-related features from LLM internal representations, enabling more accurate and interpretable detection of hallucinations compared to existing methods.

#arxiv #paper #hallucination-detection #sparse-autoencoders

arxiv.org →

NEW paper research 5h ago · arxiv-cs-cl

SynopticBench: Evaluating Vision-Language Models on Generating Weather Forecast Discussions of the Future

EN arXiv:2604.16451v1 · Abstract: Recent advances in visual-language models (VLMs) have led to significant improvements in a plethora of complex multimodal tasks like image captioning, r…

#arxiv #paper

arxiv.org →

NEW paper research 5h ago · arxiv-cs-cl

EchoChain: A Full-Duplex Benchmark for State-Update Reasoning Under Interruptions

EN arXiv:2604.16456v1 · Abstract: Real-time voice assistants must revise task state when users interrupt mid-response, but existing spoken-dialog benchmarks largely evaluate turn-based i…

#arxiv #benchmark #paper

arxiv.org →
