#vllm — TECH Dashboard

Entries page 1/1 · 5 total

Fri, Jun 19 1 entries

blog gemini 1w ago ·

google-cloud-blog

GKE 上の Ray Serve LLM をスケールする: 開発体験を保ちながら高性能を実現 Scaling Ray Serve LLM on GKE: Performance without losing the developer experience

重要度 Medium Medium priority 重要度 Medium · 技術記事 · Gemini / Gemma Medium priority · technical post · Gemini / Gemma 公開 6月19日 Published Jun 19

AI要約 Google Cloud が、Anyscale 製の LLM サービングライブラリ Ray Serve を GKE 上でスケールさせ、スループットとレイテンシを改善する手法を公開。Python ネイティブの開発者体験を維持しながら、本番規模のパフォーマンスを実現するアーキテクチャの知見をまとめた内容だ。

EN Developers looking for LLM inference and model serving often turn to Ray Serve , a scalable model serving library with developer-friendly, Python-native APIs built by Anyscale. Combined with Google Ku

#cloud #google #ray-serve +5

cloud.google.com →

Scaling Ray Serve LLM on GKE: Performance without losing the developer experience

media fallback

Sun, May 31 1 entries

blog local-llm 3w ago ·

qiita-llm

LLM推論を最大2倍高速化するEAGLE 3.1 — attention driftを克服した最新スペキュラティブデコーディング EAGLE 3.1, released May 26 2026, addresses 'attention drift' in speculative decoding and a…

重要度 Medium Medium priority 重要度 Medium · 技術記事 · Local LLM / Open Models Medium priority · technical post · Local LLM / Open Models 公開 5月31日 Published May 31

AI要約 2026年5月26日に公開されたEAGLE 3.1は、スペキュラティブデコーディングの精度低下原因「attention drift」を解消し、vLLM公式ベンチマークでKimi K2.6のスループットを対EAGLE-3比2.03倍に向上させた。

EN EAGLE 3.1, released May 26 2026, addresses 'attention drift' in speculative decoding and achieves up to 2.03× throughput improvement over EAGLE-3 on Kimi K2.6, according to vLLM's official benchmarks.

#llm #qiita #speculative-decoding +4

qiita.com →

fallback

Thu, May 7 1 entries

NEW blog local-llm 1mo ago ·

huggingface-blog

vLLM V0からV1へ:RLにおける修正より正確性を優先 vLLM V0 to V1: Correctness Before Corrections in RL

重要度 Medium Medium priority 重要度 Medium · 技術記事 · Local LLM / Open Models Medium priority · technical post · Local LLM / Open Models 公開 5月7日 Published May 7

AI要約 ServiceNow AIがvLLMをV0からV1へ移行した際、強化学習トレーニングで生じた数値的な不一致と再現性の問題を分析。修正を急ぐ前に、ロジット計算やバッチ処理の正確性を検証する重要性を示した。

原文JA ServiceNow AIがvLLMをV0からV1へ移行した際、強化学習トレーニングで生じた数値的な不一致と再現性の問題を分析。修正を急ぐ前に、ロジット計算やバッチ処理の正確性を検証する重要性を示した。

#huggingface #open-model #vllm +3

huggingface.co →

vLLM V0 to V1: Correctness Before Corrections in RL

og fallback

Tue, Mar 31 1 entries

NEW blog local-llm 2mo ago ·

huggingface-blog

TRL v1.0公開: 進化に追従するポストトレーニングライブラリ TRL v1.0: Post-Training Library Built to Move with the Field

重要度 Medium Medium priority 重要度 Medium · 技術記事 · Local LLM / Open Models Medium priority · technical post · Local LLM / Open Models 公開 3月31日 Published Mar 31

AI要約 Hugging FaceがLLMポストトレーニング用ライブラリTRLのv1.0を公開。SFT/DPO/GRPOなど主要手法を統合し、APIの安定化、vLLM連携、マルチノード分散学習、VLM対応強化など、実運用に耐える成熟版に到達した。

原文JA Hugging FaceがLLMポストトレーニング用ライブラリTRLのv1.0を公開。SFT/DPO/GRPOなど主要手法を統合し、APIの安定化、vLLM連携、マルチノード分散学習、VLM対応強化など、実運用に耐える成熟版に到達した。

#huggingface #open-model #trl +5

huggingface.co →

fallback

Tue, Mar 10 1 entries

NEW blog local-llm 3mo ago ·

huggingface-blog

オープンソースRLライブラリ16種に学ぶ非同期学習の現状 Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

重要度 Medium Medium priority 重要度 Medium · 技術記事 · Local LLM / Open Models Medium priority · technical post · Local LLM / Open Models 公開 3月10日 Published Mar 10

AI要約 Hugging Faceが16のオープンソース強化学習ライブラリを比較調査し、LLM向けRL訓練における非同期化やトークン生成効率化の課題と設計パターンを整理。スループット向上のための学習・推論分離やオフポリシー対応の動向を解説する。

原文JA Hugging Faceが16のオープンソース強化学習ライブラリを比較調査し、LLM向けRL訓練における非同期化やトークン生成効率化の課題と設計パターンを整理。スループット向上のための学習・推論分離やオフポリシー対応の動向を解説する。

#huggingface #open-model #rlhf +3

huggingface.co →

fallback

#vllm 5 total

Entries page 1/1 · 5 total

GKE 上の Ray Serve LLM をスケールする: 開発体験を保ちながら高性能を実現 Scaling Ray Serve LLM on GKE: Performance without losing the developer experience

LLM推論を最大2倍高速化するEAGLE 3.1 — attention driftを克服した最新スペキュラティブデコーディング EAGLE 3.1, released May 26 2026, addresses 'attention drift' in speculative decoding and a…

vLLM V0からV1へ:RLにおける修正より正確性を優先 vLLM V0 to V1: Correctness Before Corrections in RL

TRL v1.0公開: 進化に追従するポストトレーニングライブラリ TRL v1.0: Post-Training Library Built to Move with the Field

オープンソースRLライブラリ16種に学ぶ非同期学習の現状 Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries