Cloud Run × MCP: the hosting target changed three times in the past year, so I sorted it out
- Hello! This is the day 7 post in the "Google Cloud Next '26 / Google I/O hands-on blog relay," a KDDI iret initiative running from June 22 to July 3.
- This time, at Next '26
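MCP (Model Context Protocol) is spreading as a common standard for connecting AI agents to external tools and data. According to the article, the "appropriate host and configuration" for running an MCP server on Cloud Run, Google Cloud's serverless platform, has changed three times in the past year. Published as day 7 of KDDI iret's "Google Cloud Next '26 / Google I/O hands-on blog relay," which runs from June 22 to July 3, the article looks back on that evolution from first-hand experience.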
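MCP is an open specification published by Anthropic in late 2024 that defines a standard procedure for generative AI models to call external functions and data sources. Early use centered on local communication over standard input and output (stdio). For remote connections, the recommended HTTP-based transport moved from one built on Server-Sent Events (SSE) to Streamable HTTP, which is regarded as easier to work with. This change of transport appears to have been one of the main reasons for revisiting hosting configurations.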
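Cloud Run deploys containers as-is and scales automatically with incoming requests, which makes it a good fit for hosting such HTTP-based MCP servers. At the same time, the considerations around authentication, session management, and long-lived connections shift as the transport changes, so "running on Cloud Run" has meant a different concrete design at different times. The article appears to trace how the configuration was changed three times, and why each choice was made, against the evolution of both the specification and the platform.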
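Around MCP, Google is advancing agent interoperability efforts such as Agent2Agent (A2A), and OpenAI and others have announced support, so hosting practices on each cloud are still settling. With specifications and tools continuing to change, it will likely remain important to check the latest official documentation when choosing a configuration rather than reusing a past setup unchanged.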
The Model Context Protocol (MCP) has quickly become one of the most discussed standards for connecting AI assistants to external tools and data, and running MCP servers on managed infrastructure such as Google Cloud Run is an increasingly common pattern. This article, published as the seventh entry in KDDI iret's "Google Cloud Next '26 / Google I/O hands-on blog relay" running from June 22 to July 3, looks back at how the recommended way to host an MCP server on Cloud Run has shifted roughly three times over the past year and tries to make sense of that churn.
The reason this matters is practical. MCP, originally published by Anthropic in late 2024, defines how a client such as an AI agent discovers and calls tools, reads resources, and exchanges prompts with a server. Early implementations were designed mainly for local use, communicating over standard input and output (stdio), which works well when the server runs on the same machine as the client. Hosting a server remotely, however, requires a network transport, and that is where Cloud Run enters the picture: it offers a serverless way to run a container, scale it to zero, and expose it behind an HTTPS endpoint without managing servers directly.
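As a rough illustration of the discovery and call steps described above: MCP messages are JSON-RPC 2.0 objects, and the sketch below builds a `tools/list` request and a `tools/call` request. The helper `make_request`, the tool name `get_weather`, and its arguments are hypothetical and do not come from the article.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # simple request-id generator for this sketch


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object of the kind MCP exchanges."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


# Step 1: ask the server which tools it exposes.
list_tools = make_request("tools/list")

# Step 2: call one tool by name. "get_weather" and its arguments are
# hypothetical placeholders, not a real tool.
call_tool = make_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Tokyo"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool))
```

The same message shapes travel over every transport discussed below; only the framing around them changes.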
The first phase of remote hosting relied on the HTTP-plus-Server-Sent-Events (SSE) transport defined in the original specification. In this model the client opened a long-lived SSE connection to receive messages from the server while sending requests over separate HTTP POST calls. On Cloud Run this approach was workable but awkward, because long-lived streaming connections interact with request timeouts, instance concurrency, and the platform's autoscaling behavior in ways that are easy to misconfigure.
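To make the two-channel model above more concrete, here is a minimal sketch of SSE framing using only the standard library. It assumes the older transport's pattern of an initial `endpoint` event that tells the client where to POST, followed by `message` events carrying JSON-RPC payloads. The helper names and the path `/messages?session_id=abc123` are hypothetical.

```python
import json


def format_sse(event: str, data: str) -> str:
    """Frame one Server-Sent Event: event name, data line(s), blank line."""
    lines = [f"event: {event}"]
    lines += [f"data: {chunk}" for chunk in data.split("\n")]
    return "\n".join(lines) + "\n\n"


def parse_sse(stream: str) -> list:
    """Parse a complete SSE text stream into (event, data) tuples."""
    events = []
    for block in stream.split("\n\n"):
        if not block.strip():
            continue
        name, data = "message", []  # "message" is the SSE default event name
        for line in block.split("\n"):
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # SSE strips one leading space after the colon
            if field == "event":
                name = value
            elif field == "data":
                data.append(value)
        events.append((name, "\n".join(data)))
    return events


# Server side: first announce the POST endpoint (hypothetical path),
# then push a JSON-RPC response over the same long-lived stream.
stream = format_sse("endpoint", "/messages?session_id=abc123")
stream += format_sse("message", json.dumps({"jsonrpc": "2.0", "id": 1, "result": {}}))

events = parse_sse(stream)
print(events)
```

Because the stream has to stay open for the whole session, each connected client occupies a request slot on an instance for that long, which is one reason the timeout and concurrency settings matter in this phase.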
The second phase arrived when the MCP specification introduced the Streamable HTTP transport in 2025, intended to replace the older SSE approach. Streamable HTTP consolidates communication onto a single endpoint and allows responses to be returned either as ordinary HTTP responses or as streams when needed. This maps more naturally onto Cloud Run's request-response model and reduces the need to keep idle connections open, although stateful sessions still require care because Cloud Run may route subsequent requests to different instances unless session affinity is configured.
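The session caveat above can be sketched with a small simulation. It assumes each instance keeps session state in local memory, keyed by the session ID that Streamable HTTP carries in the `Mcp-Session-Id` header, and that an unknown session ID is answered with HTTP 404. The `Instance` class is a hypothetical stand-in for a container instance, not a Cloud Run API.

```python
import uuid


class Instance:
    """Stand-in for one container instance with in-memory session state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions = {}  # session id -> per-session state

    def initialize(self) -> str:
        """Create a session and return the id the server would send back."""
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = {"calls": 0}
        return session_id

    def handle(self, session_id: str) -> int:
        """Return an HTTP status for a request that carries session_id."""
        if session_id not in self.sessions:
            return 404  # this instance has never seen the session
        self.sessions[session_id]["calls"] += 1
        return 200


instance_a = Instance("a")
instance_b = Instance("b")

session_id = instance_a.initialize()

same_instance = instance_a.handle(session_id)   # routed back to "a"
other_instance = instance_b.handle(session_id)  # routed to "b" after scale-out

print(same_instance, other_instance)
```

Session affinity on Cloud Run is best effort, so a design that must survive instance changes would typically keep session state in an external store or make the server stateless.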
The third phase appears to reflect refinements rather than a wholesale reinvention, centering on authentication, session handling, and how Cloud Run itself has added capabilities aimed at agent and MCP workloads. Securing a publicly reachable MCP endpoint typically involves identity-aware access, OAuth-based authorization, or fronting the service with Google Cloud's IAM and Identity-Aware Proxy. The broader point the article seems to make is that anyone who built an MCP server on Cloud Run a year ago has likely had to revisit transport choices, connection handling, and security at least twice since.
For readers approaching this for the first time, a few prerequisite concepts help. Cloud Run distinguishes between request-based and instance-based billing, which determines whether an instance has CPU available outside of request handling, and settings such as concurrency, minimum instances, and session affinity directly affect whether a streaming or stateful MCP server behaves correctly. It is also worth knowing the surrounding ecosystem: Anthropic, OpenAI, and Google have all moved toward agent tooling, with frameworks and SDKs that can consume MCP servers, and Google has promoted complementary efforts such as the Agent2Agent (A2A) protocol and its Agent Development Kit. These adjacent moves help explain why MCP hosting guidance has changed so often; the standard and the platforms beneath it are evolving in parallel.
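The settings named above correspond to flags on `gcloud run deploy`. The sketch below only assembles such a command as an argument list so the choices are explicit and reviewable. The service name, image path, region, and every numeric value are hypothetical placeholders, not recommendations from the article.

```python
import shlex


def deploy_command(service, image, region, *, concurrency, min_instances,
                   timeout_seconds, session_affinity):
    """Assemble a `gcloud run deploy` argument list for an MCP server."""
    return [
        "gcloud", "run", "deploy", service,
        f"--image={image}",
        f"--region={region}",
        f"--concurrency={concurrency}",      # max simultaneous requests per instance
        f"--min-instances={min_instances}",  # keep warm instances to avoid cold starts
        f"--timeout={timeout_seconds}",      # request timeout; bounds long-lived streams
        "--session-affinity" if session_affinity else "--no-session-affinity",
    ]


# All values below are illustrative assumptions.
cmd = deploy_command(
    "mcp-server",                             # hypothetical service name
    "gcr.io/example-project/mcp-server:latest",  # hypothetical image
    "asia-northeast1",
    concurrency=20,
    min_instances=1,
    timeout_seconds=600,
    session_affinity=True,
)

print(shlex.join(cmd))
```

Printing the command instead of running it keeps the sketch free of side effects; the list could be passed to `subprocess.run` once the values have been reviewed.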
The takeaway is less about a single correct configuration and more about the pace of change. Because the specification, the client implementations, and Cloud Run's own capabilities are all maturing at the same time, the "right" way to host an MCP server is a moving target. Teams adopting this pattern would be wise to isolate the transport layer in their code, document which specification version they target, and revisit their deployment when the next revision lands, which, given the past year, seems likely to happen again before long.
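One way to read the advice about isolating the transport layer is sketched below: a transport-agnostic handler takes one JSON-RPC request and returns one response, and thin adapters wrap it for stdio and for an HTTP POST. All function names are hypothetical, and `PROTOCOL_VERSION` is an illustrative choice of which revision to document, not something the article specifies.

```python
import json

# Assumption for this sketch: the specification revision being targeted.
PROTOCOL_VERSION = "2025-03-26"


def handle_message(message: dict) -> dict:
    """Transport-agnostic core: one JSON-RPC request in, one response out."""
    if message.get("method") == "ping":
        return {"jsonrpc": "2.0", "id": message.get("id"), "result": {}}
    return {
        "jsonrpc": "2.0",
        "id": message.get("id"),
        "error": {"code": -32601, "message": "Method not found"},
    }


def stdio_adapter(line: str, handler=handle_message) -> str:
    """Adapter for newline-delimited JSON, as used over stdio."""
    return json.dumps(handler(json.loads(line))) + "\n"


def http_adapter(body: bytes, handler=handle_message):
    """Adapter for an HTTP POST body; returns (status, headers, body)."""
    payload = json.dumps(handler(json.loads(body))).encode("utf-8")
    return 200, {"Content-Type": "application/json"}, payload


stdio_reply = stdio_adapter('{"jsonrpc": "2.0", "id": 1, "method": "ping"}')
status, headers, http_reply = http_adapter(
    b'{"jsonrpc": "2.0", "id": 2, "method": "unknown/method"}'
)

print(stdio_reply.strip())
print(status, http_reply.decode("utf-8"))
```

With this split, a future transport revision would mean writing a new adapter while `handle_message` and its tests stay unchanged.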
The body text and summary on this page were generated automatically by AI. Please check the original article (qiita.com) for accuracy.