#llm-benchmark — TECH Dashboard

paper research 1d ago

arxiv-cs-se

TeleResilienceBench: Quantifying Resilience for LLM Reasoning in Telecommunications

AI summary: This paper introduces TeleResilienceBench, a benchmark that quantifies how resilient LLM reasoning in telecommunications is to noisy or adversarial inputs. It measures performance degradation under perturbation, exposing reliability gaps for telecom-specific LLM deployments.

#arxiv #paper #llm-benchmark #telecom

arxiv.org →

