
Can Local LLMs Solve the Hard Problems of La Salle Junior High? A Comparison of the Latest Models Reveals a Surprising Intelligence Gap

Qiita LLM tag · qiita.com · 2026/06/07 12:38 · 1w ago · 📖 1 min


AI 3-line summary

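The article has several recent local LLMs solve entrance-exam arithmetic problems from La Salle Junior High School and compares the reasoning ability of each model.
Contrary to expectations, the comparison reportedly exposed a "surprising intelligence gap" between models.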

English summary

A benchmark blog post testing multiple local LLMs on La Salle Junior High School entrance exam math problems, revealing surprising reasoning capability gaps among the compared models.

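The article tests how far local LLMs can get on a realistic benchmark: Japanese entrance-exam arithmetic. Using entrance-exam problems from La Salle Junior High School, a highly selective private school, it compares several recent local models and brings out the differences in their reasoning ability.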

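The piece is framed as a challenge to the optimistic premise that "the latest AI should breeze through junior high entrance exams," and it reports that the actual results showed a "surprising intelligence gap." Only locally run models, not cloud APIs, appear to be in scope, which positions the post as a practical comparison among open-weight models.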

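The specific model names, scores, and details of the evaluation method need to be confirmed in the source article. Even so, because it exercises multi-step reasoning in Japanese, the post could serve as reference material when selecting a local LLM for Japanese-language use.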

This Qiita blog post challenges the assumption that modern AI should easily handle problems at the level of Japanese middle school entrance exams. The author uses actual math problems from La Salle Junior High School—one of Japan's most selective preparatory schools—as a practical benchmark, testing multiple recent local LLMs and comparing their performance across problems.

The focus on locally run models rather than cloud APIs suggests the comparison targets open-weight models deployable on consumer or prosumer hardware. Japanese entrance-exam arithmetic (受験算数) is a meaningful reasoning benchmark because problems typically require multi-step logical deduction and creative problem decomposition rather than rote calculation, making them a culturally grounded stress test for Japanese-language AI reasoning.
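As a rough illustration of how a comparison like this might be scored, the sketch below checks each model's final numeric answer against a reference answer and reports per-model accuracy. It is not the author's method, which is not known from this summary. The function names (`extract_final_number`, `score_models`), the model labels, and the toy questions are all hypothetical placeholders, and the sketch assumes answers are non-negative numbers written as integers, decimals, or simple fractions.

```python
import re
from fractions import Fraction

# Matches a non-negative integer, decimal, or a/b fraction (assumed answer formats).
NUMBER_PATTERN = re.compile(r"\d+(?:/\d+|\.\d+)?")


def extract_final_number(text):
    """Return the last number found in `text` as a Fraction, or None if there is none."""
    matches = NUMBER_PATTERN.findall(text)
    if not matches:
        return None
    try:
        return Fraction(matches[-1])
    except ZeroDivisionError:
        # Something like "3/0" is not a usable answer.
        return None


def score_models(responses, answers):
    """Compute exact-match accuracy per model.

    responses: {model_name: {problem_id: raw_model_output}}
    answers:   {problem_id: reference_answer_as_string}
    """
    results = {}
    for model, outputs in responses.items():
        correct = sum(
            1
            for problem_id, expected in answers.items()
            if extract_final_number(outputs.get(problem_id, "")) == Fraction(expected)
        )
        results[model] = correct / len(answers)
    return results


# Hypothetical toy data, NOT taken from the article or from any real exam.
answers = {"q1": "12", "q2": "3/4"}
responses = {
    "model_a": {"q1": "The speed is 12 km/h, so the answer is 12", "q2": "Answer: 0.75"},
    "model_b": {"q1": "I think it is 15", "q2": "3/4"},
}

if __name__ == "__main__":
    for model, accuracy in score_models(responses, answers).items():
        print(f"{model}: {accuracy:.0%}")
```

Comparing answers as `Fraction` values lets `0.75` and `3/4` count as the same answer. Taking the last number in the output is a simplifying assumption. A real harness would likely need stricter answer formatting, or manual review of the reasoning steps.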

The summary indicates the results revealed "surprising gaps in intelligence" between models, but specific model names, scores, and evaluation methodology are not available from the collected context alone and should be verified at the source URL. Given how rapidly the local LLM landscape evolves, readers should also check whether the tested model versions remain current.