#reasoning 19 total

paper research 2w ago ·

arxiv-cs-lg

VeriGate: Verifier-Gated Step-Level Supervision for GRPO

Medium priority · paper/research · Papers / Benchmarks · Published Jun 1

AI summary: VeriGate introduces step-level verifier gating to compensate for the coarseness of outcome rewards in GRPO (Group Relative Policy Optimization). It aims to improve the training efficiency and accuracy of reasoning models.

EN arXiv:2605.30451v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) is an effective recipe for training reasoning models with verifier-based outcome rewards, but its supervision

#arxiv #paper #grpo +5

Thu, May 28 3 entries

paper research 3w ago ·

arxiv-cs-ai

LaneRoPE: Positional Encoding for Collaborative Parallel Reasoning and Generation

Medium priority · paper/research · Papers / Benchmarks · Published May 28

AI summary: A research paper proposing LaneRoPE, a dedicated positional encoding method for test-time scaling of LLMs that generate multiple sequences in parallel.

EN arXiv:2605.27570v1 Announce Type: new Abstract: Parallel LLM test-time scaling techniques (e.g., best-of-$N$) require drawing $N>1$ sequences conditioned on the same input prompt. These methods boost

#arxiv #paper #llm +5

paper research 3w ago ·

arxiv-cs-ai

Reasoning and Planning with Dynamically Changing Norms

Medium priority · paper/research · Papers / Benchmarks · Published May 28

AI summary: A research paper proposing a method for AI agents to grasp human norms in real time and reflect them in their planning.

EN arXiv:2605.27622v1 Announce Type: new Abstract: To safely interact with humans, AI agents must both know our norms and consider them during planning. However, such norm-guided planning has been less e

#arxiv #paper #norm-guided-planning +4

developers.googleblog.com →

blog gemini 3w ago ·

google-developers

How the community trained Gemma to "Think" with Tunix and TPUs

Medium priority · technical post · Gemini / Gemma · Published May 28

AI summary: In the Google Tunix Hackathon on Kaggle, developers transformed small, non-reasoning base models into general reasoning engines using TPUs and limited compute resources.

EN The Google Tunix Hackathon on Kaggle challenged developers to transform small, non-reasoning base models into general reasoning engines using Kaggle TPUs and a limited compute budget. The winning team

#google #open-model #gemma +5

Wed, May 27 1 entry

paper research 3w ago ·

arxiv-cs-cl

Why LLMs Hallucinate on Structured Knowledge: A Mechanistic Analysis of Reasoning over Linearized Representations

Medium priority · paper/research · Papers / Benchmarks · Published May 27

AI summary: A research paper that mechanistically analyzes how hallucinations arise when structured knowledge such as graphs and tables is linearized and fed to an LLM.

EN arXiv:2605.26362v1 Announce Type: new Abstract: In many reasoning tasks, large language models (LLMs) rely on structured external knowledge, such as graphs and tables, which is typically linearized in

#arxiv #paper #hallucination +5

Tue, May 26 1 entry

paper research 3w ago ·

arxiv-cs-ai

How Much Thinking is Enough? Quantifying and Understanding Redundancy in LLM Reasoning

Medium priority · paper/research · Papers / Benchmarks · Published May 26

AI summary: A paper that quantifies the redundancy in LLMs' long chains of thought and studies ways to reduce latency, GPU time, and energy costs.

EN A research paper quantifying redundancy in LLM chain-of-thought reasoning, aiming to reduce latency, GPU time, and energy costs without sacrificing accuracy.

#arxiv #paper #chain-of-thought +4

Fri, Feb 20 1 entry

NEW blog gemini 3mo ago ·

google-deepmind

Gemini 3.1 Pro: A smarter model for your most complex tasks

Medium priority · technical post · Gemini / Gemma · Published Feb 20

AI summary: Google DeepMind announced its latest model, Gemini 3.1 Pro. It raises performance mainly in complex reasoning, coding, and long-context understanding, targeting advanced task processing for developers and enterprises. It is positioned as an incremental update to the Gemini 3 series.

EN 3.1 Pro is designed for tasks where a simple answer isn’t enough.

#deepmind #google #gemini-3 +3

Fri, Feb 13 1 entry

NEW blog gemini 4mo ago ·

google-deepmind

Gemini 3 Deep Think: Advancing science, research and engineering

Medium priority · technical post · Gemini / Gemma · Published Feb 13

AI summary: Google DeepMind announced Deep Think, an advanced reasoning mode that extends Gemini 3 Pro. With parallel thinking and long-duration reasoning, it sets new benchmark results in math, science, and coding, offering researchers and engineers new problem-solving capabilities.

EN Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.

#deepmind #google #gemini-3 +4

Tue, Feb 10 1 entry

NEW blog gemini 4mo ago ·

google-deepmind

Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Medium priority · technical post · Gemini / Gemma · Published Feb 10

AI summary: Google's Gemini Deep Think, working with mathematicians and scientists, achieved new progress on long-unsolved mathematical problems. Its parallel reasoning technique supports complex theorem proving and the verification of scientific conjectures, giving a concrete example of AI-accelerated research.

EN Research papers point to the growing impact of Deep Think across fields

#deepmind #google #deep-think +3

Wed, Jan 28 1 entry

blog local-llm 4mo ago ·

huggingface-blog

Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek

Medium priority · technical post · Local LLM / Open Models · Published Jan 28

AI summary: One year after DeepSeek's debut, a bird's-eye analysis of the architectural choices in China's open-source AI models: MoE, long-context processing, multimodality, and reasoning.

EN Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek

#huggingface #open-model #china +7

huggingface.co →

Thu, Dec 11 2 entries

🔥 HOT blog codex 6mo ago ·

openai-blog

Advancing science and math with GPT-5.2

High priority · technical post · OpenAI / Codex · Published Dec 11

AI summary: OpenAI announced GPT-5.2. It achieves top-level results on major benchmarks such as GPQA Diamond and FrontierMath, and substantially strengthens reasoning in science and math.

EN GPT-5.2 is OpenAI’s strongest model yet for math and science, setting new state-of-the-art results on benchmarks like GPQA Diamond and FrontierMath. This post shows how those gains translate into real

#benchmark #openai #gpt-5.2 +7

🔥 HOT blog codex 6mo ago ·

openai-blog

Introducing GPT-5.2

High priority · technical post · OpenAI / Codex · Published Dec 11

AI summary: OpenAI announced GPT-5.2, a state-of-the-art frontier model with enhanced reasoning, long-context understanding, coding, and vision capabilities, available in ChatGPT and the API.

EN GPT-5.2 is our most advanced frontier model for everyday professional work, with state-of-the-art reasoning, long-context understanding, coding, and vision. Use it in ChatGPT and the OpenAI API to pow

#openai #gpt-5.2 #llm +6

Wed, Nov 19 1 entry

🔥 HOT NEW blog gemini 7mo ago ·

google-deepmind

A new era of intelligence with Gemini 3

High priority · technical post · Gemini / Gemma · Published Nov 19

AI summary: Google DeepMind announced its latest flagship model, Gemini 3. It substantially strengthens reasoning, multimodal understanding, and agentic capabilities, and is being rolled out simultaneously to Search, the Gemini app, and developer APIs. An enhanced reasoning mode, "Gemini 3 Deep Think", is also expected to be offered.

EN A new era of intelligence with Gemini 3

#deepmind #google #gemini-3 +4

Thu, Nov 13 1 entry

🔥 HOT blog codex 7mo ago ·

openai-blog

Introducing GPT-5.1 for developers

High priority · technical post · OpenAI / Codex · Published Nov 13

AI summary: OpenAI released GPT-5.1 in the API, offering faster responses through adaptive reasoning, extended prompt caching, and improved coding performance.

EN GPT-5.1 is now available in the API, bringing faster adaptive reasoning, extended prompt caching, improved coding performance, and new apply_patch and shell tools.

#openai #gpt-5.1 #api +6

Thu, Aug 7 4 entries

🔥 HOT blog codex 10mo ago ·

openai-blog

Introducing GPT-5 for developers

High priority · technical post · OpenAI / Codex · Published Aug 7

AI summary: OpenAI began offering GPT-5 in the API. Coding, long-context reasoning, and instruction following are improved, and the model is optimized for agentic tasks and tool use.

EN Introducing GPT-5 in our API platform—offering high reasoning performance, new controls for devs, and best-in-class results on real coding tasks.

#openai #gpt-5 #api +6

🔥 HOT blog codex 10mo ago ·

openai-blog

First look at GPT-5

High priority · technical post · OpenAI / Codex · Published Aug 7

AI summary: OpenAI gives a first look at GPT-5. Reasoning, coding, and multimodal capabilities are substantially improved, and integration into the developer API is also planned.

EN See how a group of leading developers use GPT-5 for the first time.

#openai #gpt-5 #llm +4

🔥 HOT blog codex 10mo ago ·

openai-blog

GPT-5 System Card

High priority · technical post · OpenAI / Codex · Published Aug 7

AI summary: OpenAI published the GPT-5 system card, detailing capability evaluations, safety testing, risk mitigations, and red-teaming results.

EN This GPT-5 system card explains how a unified model routing system powers fast and smart responses using gpt-5-main, gpt-5-thinking, and lightweight versions like gpt-5-thinking-nano, optimized for di

#openai #gpt-5 #system-card +5

🔥 HOT blog codex 10mo ago ·

openai-blog

Introducing GPT-5

High priority · technical post · OpenAI / Codex · Published Aug 7

AI summary: OpenAI announced its latest model, GPT-5. It substantially outperforms previous models across a wide range of areas including coding, math, and reasoning, and is available in ChatGPT and the API.

EN We are introducing GPT‑5, our best AI system yet. GPT‑5 is a significant leap in intelligence over all our previous models, featuring state-of-the-art performance across coding, math, writing, health,

#openai #gpt-5 #llm +6