I Had a Local LLM Write Code on My Smartphone, Built It with a Smartphone App, and Made a Jetpack Compose App

Zenn LLM tag · zenn.dev · 2026/06/29 06:30 · 2 min read

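Background to this post: Late last year I became interested in local LLMs and, starting from Qwen2.5-coder, fell into the swamp of prompt control and parameter tuning. Whenever I had a spare moment, I would bounce prompts off a cloud-based AI and then hand the local LLM on my smartphone Android…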

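A development style that is completed entirely on a smartphone is becoming realistic, thanks to lighter local LLMs and the spread of inference apps. The write-up introduced here is a hands-on report of having a local LLM on a smartphone write code without a PC, then assembling a Jetpack Compose app in a smartphone build environment.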

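The author reports becoming interested in local LLMs at the end of the previous year and experimenting by trial and error with prompt control and parameter tuning, starting from the code-generation-focused Qwen2.5-coder. While sounding out structure and specifications with a cloud-based AI, the author reportedly left the final generation to the local LLM on the smartphone, a two-stage setup used to produce code for Android.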

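Behind this is a growing choice of small LLMs that can run on a smartphone. Besides the Qwen series, models at the scale of a few billion parameters, such as Meta's Llama, Google's Gemma, and Microsoft's Phi, run in a few GB of memory once quantized, which makes inference feasible on recent high-end devices. On-device execution often uses llama.cpp-based apps or, on Android, various inference apps that are easy to try. Because nothing is sent to the cloud, the approach is said to have advantages in terms of connectivity and privacy.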

Local LLM / Open Models · Key points of this article

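Jetpack Compose is a declarative UI framework promoted by Google that lets screens be described in Kotlin code alone. It is also an area where LLMs are good at code generation, and the ease of having a model output UI structure in a form close to prose may have made it a good match for this kind of experiment. The build itself was reportedly done on the smartphone alone, so the self-contained nature of development without a desktop is one of the highlights.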

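That said, local LLMs have limited generation accuracy compared with large cloud models, and for complex implementations careful prompt design and manual fixes appear to be indispensable. Even so, the division of labor that assigns brainstorming to the cloud and generation to the local model could be a practical answer for individual developers who prioritize cost and privacy. Depending on advances in device performance and model compression, this kind of mobile-only development may spread further.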

A developer's recent write-up documents an unusual experiment: running a local large language model directly on a smartphone to generate code, then compiling and building a Jetpack Compose Android app on the same device. The appeal is straightforward. It shows that a self-contained development loop, from code generation to a working app, is increasingly possible without relying on cloud services or a desktop machine. That matters for privacy-conscious workflows, offline scenarios, and anyone curious about how far on-device inference has come.

The author traces the project back to late last year, when interest in local LLMs led to Qwen2.5-Coder as a starting point. Qwen2.5-Coder is a code-focused model family from Alibaba, available in several sizes and well suited to quantization, which makes the smaller variants feasible to run on consumer hardware. The writer describes falling into the familiar rabbit hole of prompt control and parameter tuning, experimenting with how phrasing, temperature, and similar settings shape output quality. Smaller models are sensitive to these factors, so iteration is part of the work rather than an afterthought.
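To make the role of temperature concrete, here is a minimal Python sketch of the standard temperature-scaled softmax that samplers apply to a model's raw token scores. The logits are made-up values for illustration, and the function name is hypothetical; the write-up does not say which settings or runtime options the author actually used.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5]  # assumed scores for three candidate tokens
    for t in (0.2, 0.7, 1.5):
        probs = softmax_with_temperature(toy_logits, t)
        print(t, [round(p, 3) for p in probs])
```

A low temperature concentrates probability on the top-scoring token, which tends to suit code generation, while a higher value spreads probability across alternatives and makes output more varied.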

The workflow that emerged splits the task across two tools. Cloud-based AI is used for what the author calls "prompt sparring," a brainstorming and refinement stage to shape requirements and clarify approach, while the on-device local LLM handles the actual code generation. This hybrid pattern is pragmatic. Cloud models remain stronger for open-ended reasoning, but a phone-resident model can produce code repeatedly without sending data off the device or incurring per-token costs. The split keeps the heavy thinking in the cloud and the routine generation local.
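The two-stage split can be sketched in a few lines of Python. This is only an illustration of the pattern, not the author's tooling: `two_stage_generate`, `fake_cloud_refiner`, and `fake_local_generator` are hypothetical names, and the two stand-in functions replace real model calls so the sketch runs without any network or model.

```python
from typing import Callable


def two_stage_generate(
    idea: str,
    refine_spec: Callable[[str], str],
    generate_code: Callable[[str], str],
) -> dict:
    """Refine a rough idea into a spec, then pass only the spec to the code generator."""
    spec = refine_spec(idea)    # stage 1: planning (a cloud model in the write-up)
    code = generate_code(spec)  # stage 2: code generation (the on-device model)
    return {"spec": spec, "code": code}


# Hypothetical stand-ins so the sketch runs without any model or network access.
def fake_cloud_refiner(idea: str) -> str:
    return f"Spec: single-screen Compose app. Goal: {idea}. Use Material 3."


def fake_local_generator(spec: str) -> str:
    return f"// generated from -> {spec}\n@Composable fun App() {{ }}"


if __name__ == "__main__":
    result = two_stage_generate("a tap counter", fake_cloud_refiner, fake_local_generator)
    print(result["spec"])
    print(result["code"])
```

Passing the two stages in as callables keeps the boundary explicit: only the refined spec crosses from the planning stage to the generation stage, so either side can be swapped for a different model.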

Building Jetpack Compose code is the more notable half of the story. Jetpack Compose is Google's modern declarative UI toolkit for Android, written in Kotlin, where interfaces are defined as composable functions instead of XML layouts. Generating correct Compose code is a reasonable test of a coding model, since it requires up-to-date API knowledge, proper state handling, and idiomatic Kotlin. The fact that this happens on the phone, with the app then compiled by a mobile app rather than Android Studio on a PC, is what makes the demonstration interesting as a proof of concept.
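Before generated text reaches a build tool, it usually has to be separated from the model's surrounding chatter. The Python sketch below assumes the model wraps its Kotlin in a fence labeled `kotlin` or `kt`, which is common but not guaranteed, and applies a deliberately crude check for a composable function. The sample response and all function names are hypothetical; nothing here comes from the original post.

```python
import re

FENCE = "`" * 3  # built at runtime so this example never contains a literal fence

# Only fences explicitly labeled kotlin/kt are matched; unlabeled fences are ignored.
_KOTLIN_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:kotlin|kt)[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def extract_kotlin_blocks(response: str) -> list:
    """Return the body of every Kotlin-labeled fenced block in a model response."""
    return [body.strip() for body in _KOTLIN_BLOCK.findall(response)]


def looks_like_compose(code: str) -> bool:
    """Cheap textual check; it does not prove the code compiles."""
    return "@Composable" in code and "fun " in code


# Hypothetical model output used only to exercise the helpers above.
SAMPLE_RESPONSE = (
    "Here is the screen:\n"
    + FENCE + "kotlin\n"
    + "@Composable\n"
    + "fun Counter() {\n"
    + "    var count by remember { mutableStateOf(0) }\n"
    + '    Button(onClick = { count++ }) { Text("Count: $count") }\n'
    + "}\n"
    + FENCE + "\n"
    + "Hope that helps."
)

if __name__ == "__main__":
    for block in extract_kotlin_blocks(SAMPLE_RESPONSE):
        print(looks_like_compose(block))
        print(block)
```

A check like this only filters out obviously unusable responses; whether the code is idiomatic or even compiles still has to be settled by the build itself.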

Several adjacent tools make this kind of setup possible. On-device inference typically relies on runtimes such as llama.cpp or MLC LLM, often through GGUF-format quantized weights that shrink models to fit limited memory and bandwidth. Apps like Termux provide a Linux-style environment on Android, and projects exist for compiling Kotlin and assembling APKs locally, though the toolchains remain rougher than desktop equivalents. The post fits a broader trend toward edge AI, alongside Apple's on-device intelligence features, Google's Gemini Nano, and a steady stream of compact open models from Qwen, Llama, and others optimized for phones.
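The effect of quantization on memory can be checked with simple arithmetic: weight storage is roughly the parameter count times the bits per weight, divided by eight. The Python sketch below is a rough estimate under that assumption only. Real quantized files carry some extra metadata and often mix precisions, and the runtime also needs memory for the KV cache and buffers, so actual usage will be higher. The 7B and 1.5B sizes are illustrative choices, not figures from the post.

```python
def estimate_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB, ignoring KV cache and runtime overhead."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    # Illustrative model sizes and bit widths (assumptions, not from the write-up).
    for params in (1.5, 7.0):
        for bits in (16, 8, 4):
            gib = estimate_weight_gib(params, bits)
            print(f"{params}B params at {bits}-bit: about {gib:.2f} GiB")
```

By this estimate, dropping from 16-bit to 4-bit weights cuts storage to a quarter, which is the difference between a model that cannot fit on a phone and one that can.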

The realistic limits deserve mention. Small quantized models are faster and lighter but generally less accurate than their cloud counterparts, so generated code likely needs review, correction, and several attempts. Phone constraints, including memory, heat, and battery, cap model size and speed, and compiling on mobile is slower and more fragile than on a workstation. The two-stage approach, with cloud assistance for planning, suggests the local model alone was not sufficient for the full task, which is a fair reflection of where the technology stands rather than a flaw in the experiment.

Even so, the project is a useful signpost. It shows that the gap between hobbyist edge AI and practical mobile development is narrowing, and that capable open models combined with on-device runtimes can support real, if modest, creative work. For readers interested in trying something similar, the prerequisites are worth understanding first: quantization and GGUF, a runtime such as llama.cpp, a model like Qwen2.5-Coder sized to the phone's memory, and a tolerance for prompt iteration. The result is unlikely to replace a full desktop setup soon, but it captures a moment where the entire loop of writing, generating, and building software can fit in one pocket-sized device.