#formal-verification — TECH Dashboard


YESTERDAY 3 entries

paper research 1d ago

arxiv-cs-se

VeriContest: A Competitive-Programming Benchmark for Verifiable Code Generation

AI summary VeriContest is a new benchmark that uses competitive programming problems to evaluate whether LLM-generated code can be formally verified. Rather than relying on execution-based tests alone, it checks generated code against specifications, giving a stricter measure of LLM code generation ability.
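
Illustration (not from the paper): the summary does not say which verification language or toolchain VeriContest uses, so the sketch below only shows, in plain C, why checking against a specification is stricter than running a few tests. The names `max_of`, `satisfies_spec`, `passes_sample_tests`, and `count_spec_violations` are hypothetical. Bounded enumeration stands in for a real verifier here; an actual proof would cover all inputs, not just a small domain.

```c
#include <stddef.h>

/* Hypothetical submission: returns the maximum element of a[0..n-1], n >= 1.
   Bug: starting from 0 is wrong when every element is negative. */
int max_of(const int *a, size_t n) {
    int best = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] > best) best = a[i];
    return best;
}

/* Specification as a checkable postcondition: the result is an element
   of the array and no element exceeds it. */
int satisfies_spec(const int *a, size_t n, int result) {
    int is_member = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > result) return 0;
        if (a[i] == result) is_member = 1;
    }
    return is_member;
}

/* Test-style check: two sample cases, as an execution-based judge might run. */
int passes_sample_tests(void) {
    int t1[] = {3, 1, 2};
    int t2[] = {5};
    return max_of(t1, 3) == 3 && max_of(t2, 1) == 5;
}

/* Spec-style check, bounded: every array of length 1..3 with values in -2..2.
   Returns the number of inputs that violate the specification. */
int count_spec_violations(void) {
    int violations = 0;
    int a[3];
    for (size_t n = 1; n <= 3; n++) {
        int total = 1;
        for (size_t k = 0; k < n; k++) total *= 5;
        for (int code = 0; code < total; code++) {
            int c = code;
            /* Decode `code` as n base-5 digits, each shifted into -2..2. */
            for (size_t k = 0; k < n; k++) { a[k] = c % 5 - 2; c /= 5; }
            if (!satisfies_spec(a, n, max_of(a, n))) violations++;
        }
    }
    return violations;
}
```

Here `passes_sample_tests()` succeeds even though `count_spec_violations()` is nonzero: the sample cases never include an all-negative array, while the specification check does.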

#arxiv #benchmark #paper #formal-verification

arxiv.org →

paper research 1d ago

arxiv-cs-se

AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification

AI summary AutoSOUP proposes a method for automatically generating unit proofs for verifying the memory safety of C components. It infers preconditions and input models at function boundaries, aiming to reduce the manual effort of applying bounded model checkers and similar formal verification tools.
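
Illustration (not from the paper): a minimal sketch of what a hand-written unit proof for a C function can look like, assuming a CBMC-style bounded model checker. The component `store_bytes`, its precondition, and both harness functions are hypothetical and do not reflect AutoSOUP's actual output. The `#ifdef __CPROVER__` branch is only seen by CBMC; the plain-C fallback enumerates the lengths the precondition admits so the file also compiles and runs with an ordinary compiler.

```c
#include <stddef.h>
#include <string.h>

#define BUF_CAP 8

/* Hypothetical component: copies n bytes from src into a fixed-size buffer.
   Memory-safe only when src holds at least n bytes and n <= BUF_CAP. */
void store_bytes(unsigned char dst[BUF_CAP], const unsigned char *src, size_t n) {
    memcpy(dst, src, n);
}

/* Assumed precondition at the function boundary (the kind of fact a tool
   would have to infer); returns 1 when the call is expected to be safe. */
int store_bytes_precondition(const unsigned char *src, size_t n) {
    return src != NULL && n <= BUF_CAP;
}

#ifdef __CPROVER__
/* Under CBMC: n is unconstrained, then narrowed by the precondition. */
size_t nondet_size_t(void);

void store_bytes_harness(void) {
    unsigned char dst[BUF_CAP];
    unsigned char src[BUF_CAP]; /* uninitialized: contents are arbitrary */
    size_t n = nondet_size_t();
    __CPROVER_assume(store_bytes_precondition(src, n));
    store_bytes(dst, src, n);
}
#endif

/* Plain-C fallback: tries lengths 0..BUF_CAP+4 and calls the component
   only for those the precondition admits. Returns the number of calls. */
int run_concrete_harness(void) {
    unsigned char dst[BUF_CAP];
    unsigned char src[BUF_CAP] = {0};
    int calls = 0;
    for (size_t n = 0; n <= BUF_CAP + 4; n++) {
        if (!store_bytes_precondition(src, n)) continue;
        store_bytes(dst, src, n);
        calls++;
    }
    return calls;
}
```

With CBMC, a harness like this would typically be checked by naming it as the entry point (`--function store_bytes_harness`) with memory-safety checks such as `--bounds-check` and `--pointer-check`. Dropping the `__CPROVER_assume` line should make the checker report an out-of-bounds write, which is why the precondition matters.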

#arxiv #paper #formal-verification #memory-safety

arxiv.org →

paper research 1d ago

arxiv-cs-se

Containment Verification: AI Safety Guarantees Independent of Alignment

AI summary This paper proposes containment verification, an AI safety framework whose guarantees do not depend on value alignment: it formally bounds a system's capabilities rather than proving that its values are aligned with human intent. It appears to position formal verification of capability upper bounds as an alternative for advanced AI systems whose alignment is hard to prove.

#agent #arxiv #paper #ai-safety

arxiv.org →