Claudeがツール結果を自ら幻覚して空回りした話 — トランスクリプトで原因を特定するまで A developer documents Claude fabricating tool call results and looping without progress, t…

Qiita Claude tag · qiita.com · 2026/06/30 20:11 · 3h ago · 📖 2 min

AI 3 行サマリ

Claudeがツール呼び出しの結果を自ら幻覚して実際の処理をスキップし無限ループに陥る問題を報告し、会話トランスクリプトを解析することで根本原因を特定する方法を紹介した実践的デバッグ記事。

English summary

A developer documents Claude fabricating tool call results and looping without progress, then shows how reading conversation transcripts helped identify and fix the root cause.

AIエージェントが外部ツールを呼び出して作業を進める「ツール利用（tool use / function calling）」は、コード実行やファイル操作、API連携を自動化するうえで欠かせない仕組みになっている。今回紹介する実践的なデバッグ記事は、その中核でClaudeが「ツールの実行結果」を自ら捏造（幻覚）し、実際の処理をスキップしたまま無限ループに陥った事例を取り上げ、会話トランスクリプトの解析によって根本原因を突き止める過程を記録したものだ。

問題の構造はこうだ。本来、エージェントはツール呼び出しを生成し、それを受け取ったランタイムが実際にツールを実行して結果をモデルに返す。ところが報告された事例では、Claudeがツールを呼び出すべき場面で、あたかも実行が完了したかのような架空の出力を自ら本文に書き込んでしまう。結果として実処理は行われず、モデルは存在しない成果をもとに次の判断を下し、同じ手順を延々と繰り返す「空回り」が発生した。

著者が突破口としたのが、モデルとランタイムがやり取りした生のトランスクリプトを読み解く手法だ。ユーザーから見える最終的な応答だけを眺めても、どこで処理が断絶したかは分かりにくい。一方、ツール呼び出しメッセージとツール結果メッセージの並び順、ロール（role）の割り当て、メッセージ境界を逐一確認すると、モデルが「ツール結果」を自分の発話として生成していた箇所が浮かび上がる。プロンプトやメッセージ整形の不備が引き金になった可能性が指摘されている。

この種の幻覚は、Claude固有というより、ツール利用を行うLLMエージェント全般で起こりうる課題として理解しておきたい。Anthropicが提唱したMCP（Model Context Protocol）やOpenAIのfunction callingなど、外部連携の標準化は進む一方、モデルが結果を捏造するリスクはフレームワーク側の検証や整形に依存する部分が大きい。

実務的な教訓として、トランスクリプトのロギングと検証は、エージェントの不可解な挙動を切り分けるうえで有効な手段だと言える。結果メッセージが正しいロールで挿入されているか、ツール呼び出しと結果が確実に対応しているかを点検する習慣が、空回りの予防につながると見られる。

Large language models that power agentic workflows increasingly depend on tool use, the ability to call external functions, query APIs, or run code and then fold the returned data back into their reasoning. A recent write-up from a developer documents a failure mode in which Claude appeared to fabricate the result of a tool call rather than wait for the real output, sending the agent into an unproductive loop. The case is a useful reminder that when a model invents the answer it was supposed to receive from a tool, every subsequent step rests on fiction, and the agent can keep working while accomplishing nothing.

The reported symptom was an agent that never converged. According to the account, Claude would emit what looked like a tool call, then immediately continue as though a result had come back, narrating an outcome that no tool had actually produced. Because the real processing was skipped, the state the model expected never materialized, so on the next turn it tried again, produced another hallucinated result, and repeated. From the outside this looks like a stalled or looping agent that consumes tokens and time without making progress, which is one of the more frustrating bugs to diagnose because the model's output reads as confident and coherent.

To find the cause, the author turned to the conversation transcript, the full ordered record of messages exchanged between the user, the assistant, and the tool layer. In a properly wired tool-use setup, the model issues a structured tool-call request, the host application executes it, and the genuine output is appended back into the context as a distinct message with the correct role before control returns to the model. Reading the raw transcript makes it possible to see exactly what the model received rather than what it was supposed to receive. The article frames this transcript inspection as the key debugging technique: instead of guessing from the final answer, you trace the message sequence to locate where the expected tool result is missing, malformed, or attributed to the wrong role.

What the transcript appears to reveal in such cases is a gap between the model's request and the actual injection of the result. If the tool output is never inserted, is placed in the wrong position, or is labeled so the model treats it as its own prior text rather than authoritative external data, the model is effectively left to fill the void. Language models are trained to produce plausible continuations, so faced with a missing tool result they will often generate one that looks right. The fix, broadly, is to ensure the orchestration code actually runs the requested tool, returns the result in the format and role the API expects, and only then lets the model continue, rather than allowing it to proceed on an assumed outcome.

This pattern is not unique to Claude. The same class of bug can affect any function-calling system, including OpenAI's tool-calling interface and open frameworks such as LangChain, LlamaIndex, or the Model Context Protocol, where a mismatch between the model's expectations and the host's message handling can leave the model improvising. Anthropic's own documentation emphasizes returning tool results as properly structured tool-result blocks tied to the originating tool-use ID, and deviations from that contract are a common source of trouble. Guardrails that help include validating that every tool call is matched by a real result before the next model turn, capping the number of iterations in an agent loop to avoid runaway repetition, and logging transcripts so failures can be reconstructed after the fact.

The broader takeaway is about observability. As agents grow more autonomous and chain many tool calls together, the surface where things can silently go wrong expands, and a polished final response can mask a broken intermediate step. Treating the transcript as the primary source of truth, rather than the model's narrated summary, gives developers a concrete artifact to inspect. The article is a practical, single-case report rather than a formal study, but its core lesson generalizes: when an agent loops or behaves strangely, read the actual messages it sent and received, because the discrepancy is usually visible there long before it shows up in the output.