TECH Dashboard — Pulse of the AI Ecosystem

LIVE · 05/12


Today 56

Total 246

Major 12

Active sources 12/51

Updated just now

Featured

CodeQL 2.25.3 adds Swift 6.3 support

CodeQL is the static analysis engine behind GitHub code scanning, which finds and remediates security issues in your code. We’ve recently released CodeQL 2.25.3, which adds support for Swift 6.3,… The

changelog

github-changelog · 3d ago

Daily Summary

Today's Updates

Today 56 ▼ 81%

Yesterday 288

7-day 502

Last 7 days

05/06  05/07  05/08  05/09  05/10  05/11  05/12
34     48     26     14     36     288    56

Top stories 05/12 · 10 items

🔥 Today's Top 3 importance × recency

CodeQL 2.25.3 adds Swift 6.3 support · github-changelog · 3d ago
Cursor in Microsoft Teams · cursor-changelog · 7h ago
Create repositories on the go with GitHub Mobile · github-changelog · 21h ago

Timeline 246 total · page 1/9

TODAY 30 entries

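NEW blog claude 58m ago · qiita-claude

The difference in the "sense of cost" of AI coding assistance between OpenAI Codex and Claude Code

AI summary: Introduction. When you use both OpenAI Codex and Claude Code, a quite practical difference comes into view, separate from the simple question of which one is smarter: the sense of cost. Cost here does not mean only the monthly fee. How much …

qiita.com →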

NEW blog claude 1h ago · qiita-claude

[Memo-1] I tried to Go Live an HTML file in VS Code, but it didn't work.

AI summary: Main topic: While learning by building a basic counter app in VS Code, I ran into the following problem: when I tried to display a file containing HTML with the Live Server feature, it no longer appeared. Note: this is my first post and a hastily written memo, so …

qiita.com →

NEW paper research 3h ago · arxiv-cs-se

Context-Augmented Code Generation: How Product Context Improves AI Coding Agent Decision Compliance by 49%

EN arXiv:2605.08112v1 Announce Type: new Abstract: AI coding agents powered by large language models can read codebases and produce functional code, but they routinely violate team-specific product decis

#agent #arxiv #enterprise #paper

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Computer Use at the Edge of the Statistical Precipice

EN arXiv:2605.08261v1 Announce Type: new Abstract: Evaluating Computer Use Agents (CUAs) on interactive environments is fraught with methodological pitfalls that the field has yet to systematically addre

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Execution Envelopes: A Shared Admission Contract for Backend AI Execution Requests

EN arXiv:2605.08267v1 Announce Type: new Abstract: Enterprise AI backends increasingly admit heterogeneous execution requests across model deployment, inference, evaluation, data movement, and agentic wo

#agent #arxiv #benchmark #enterprise

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Do not copy and paste! Rewriting strategies for code retrieval

EN arXiv:2605.08299v1 Announce Type: new Abstract: Embedding-based code retrieval often suffers when encoders overfit to surface syntax. Prior work mitigates this by using LLMs to rephrase queries and co

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Mazocarta: A Seeded Procedural Deckbuilder for Instrumented Game Development

EN arXiv:2605.08319v1 Announce Type: new Abstract: Mazocarta is a seeded procedural tactical deckbuilder implemented in Rust, compiled to WebAssembly for browser play, and executable natively for simulat

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

What Software Engineering Looks Like to AI Agents? -- An Empirical Study of AI-Only Technical Discourse on MoltBook

EN arXiv:2605.08380v1 Announce Type: new Abstract: AI agents are increasingly framed as software-engineering teammates, yet most research studies them inside human-centered workflows. Little is known abo

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

A Dataset of Agentic AI Coding Tool Configurations

EN arXiv:2605.08435v1 Announce Type: new Abstract: Agentic AI coding tools such as Claude Code and OpenAI Codex execute multi-step coding tasks with limited human oversight. To steer these tools, develop

#agent #arxiv #paper

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

VeriContest: A Competitive-Programming Benchmark for Verifiable Code Generation

EN arXiv:2605.08553v1 Announce Type: new Abstract: Large language models can generate useful code from natural language, but their outputs come without correctness guarantees. Verifiable code generation

#arxiv #benchmark #paper

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

EvidenT: An Evidence-Preserving Framework for Iterative System-Level Package Repair

EN arXiv:2605.08621v1 Announce Type: new Abstract: Frequent toolchain updates and growing ISA diversity have made system-level software package repair increasingly important. Diagnosing and repairing bui

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Semantic Voting: Execution-Grounded Consensus for LLM Code Generation

EN arXiv:2605.08680v1 Announce Type: new Abstract: LLM code-generation pipelines often sample multiple candidates and select one final answer without access to a complete oracle. Existing pipelines mix t

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

A Learning Method for Symbolic Systems Using Large Language Models

EN arXiv:2605.08694v1 Announce Type: new Abstract: Automated theorem proving is essential for the formal verification of safety-critical systems. As the corpus of formal proofs grows, a natural paradigm

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Debugging the Debuggers: Failure-Anchored Structured Recovery for Software Engineering Agents

EN arXiv:2605.08717v1 Announce Type: new Abstract: Software engineering agents are increasingly deployed in evaluable engineering environments, yet post-failure recovery remains costly, manual, and ad ho

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Using Semantic Distance to Estimate Uncertainty in LLM-Based Code Generation

EN arXiv:2605.09023v1 Announce Type: new Abstract: LLMs show strong performance in code generation, but their outputs lack correctness guarantees. Sample-based uncertainty estimators address this by gene

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

ParityFuzz: Finding Inconsistencies across Solidity Compilers via Fine-Grained Mutation and Differential Analysis

EN arXiv:2605.09051v1 Announce Type: new Abstract: The Solidity smart contract ecosystem has rapidly grown, leading to multiple compilers targeting different blockchain platforms or improving compilation

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Evaluating LLM-Generated Code: A Benchmark and Developer Study

EN arXiv:2605.09059v1 Announce Type: new Abstract: Code generation is one of the tasks for which the use of Large Language Models is widely adopted and highly successful. Given this popularity, there are

#arxiv #benchmark #paper

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Generating Complex Code Analyzers from Natural Language Questions

EN arXiv:2605.09304v1 Announce Type: new Abstract: Many software development tasks, such as implementing features and fixing bugs, begin with developers posing questions about a codebase. However, answer

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Prediction Model of Motivators and Demotivators of Integrating Large Language Models in Software Engineering Education: An Empirical Study

EN arXiv:2605.09393v1 Announce Type: new Abstract: Context: Large Language Models (LLMs) are increasingly influencing software engineering practice and education. While prior studies examine their techni

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

MACAA: Belief-Revision Multi-Agent Reasoning for Open-World Code Authorship Verification

EN arXiv:2605.09421v1 Announce Type: new Abstract: Code authorship attribution (CAA) supports software forensics, plagiarism detection, and intellectual property protection. However, existing supervised

#agent #arxiv #paper

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

ConCovUp: Effective Agent-Based Test Driver Generation for Concurrency Testing

EN arXiv:2605.09573v1 Announce Type: new Abstract: Concurrency testing is essential to improve the reliability and security of multi-threaded programs. Dynamic analysis tools, such as TSan, depend on hig

#agent #arxiv #paper

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Zoom, Don't Wander: Why Regional Search Outperforms Pareto Reasoning and Global Optimization in Budget-Constrained SBSE

EN arXiv:2605.09658v1 Announce Type: new Abstract: Traditional Search-Based Software Engineering (SBSE) assumes global search and full Pareto exploration are essential. We offer the following negative re

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Trajectory Supervision for Continual Tool-Use Learning in LLMs

EN arXiv:2605.09734v1 Announce Type: new Abstract: Most language-model training data shows final artifacts, not the process that produced them. We study a tractable version of this question in tool use:

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Evaluating Tool Cloning in Agentic-AI Ecosystems

EN arXiv:2605.09817v1 Announce Type: new Abstract: Agent tools are becoming a core interface through which LLM agents access external data, services, and execution environments. As these tools are distri

#agent #arxiv #paper

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Deterministic vs. LLM-Controlled Orchestration for COBOL-to-Python Modernization

EN arXiv:2605.09894v1 Announce Type: new Abstract: Modernizing legacy COBOL systems remains difficult due to scarce expertise, large and long-lived codebases, and strict correctness requirements. Recent

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

Instruction Adherence in Coding Agent Configuration Files: A Factorial Study of Four File-Structure Variables

EN arXiv:2605.10039v1 Announce Type: new Abstract: Frontier coding agents read configuration files (CLAUDE.md, AGENTS.md, Cursor Rules) at session start and are expected to follow the conventions inside

#agent #arxiv #paper

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

MARGIN: Margin-Aware Regularized Geometry for Imbalanced Vulnerability Detection

EN arXiv:2605.10240v1 Announce Type: new Abstract: Software vulnerability detection is critical for ensuring software security and reliability. Despite recent advances in deep learning, real-world vulner

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

DREAMS: Modelling Support for Research into Engineering and Artistic Design

EN arXiv:2605.10382v1 Announce Type: new Abstract: Design Research Methodology (DRM) supports systematic design research through representations such as Reference Models and Impact Models. However, the p

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

VISOR: A Vision-Language Model-based Test Oracle for Testing Robot

EN arXiv:2605.10408v1 Announce Type: new Abstract: Testing robots requires assessing whether they perform their intended tasks correctly, dependably, and with high quality, a challenge known as the test

arxiv.org →

NEW paper research 3h ago · arxiv-cs-se

CrackMeBench: Binary Reverse Engineering for Agents

EN arXiv:2605.10597v1 Announce Type: new Abstract: Benchmarks for coding agents increasingly measure source-level software repair, and cybersecurity benchmarks increasingly measure broad capture-the-flag

arxiv.org →