#throughput — TECH Dashboard

Entries page 1/1 · 2 total

Fri, Jun 19 · 1 entry

blog gemini 1w ago ·

google-cloud-blog

Scaling Ray Serve LLM on GKE: Performance without losing the developer experience

Medium priority · technical post · Gemini / Gemma · Published Jun 19

AI summary: Google Cloud explains how to scale Ray Serve, Anyscale's Python-native LLM serving library, on GKE to optimize throughput and latency. The post shares architectural insights for reaching production-scale inference performance without compromising the developer experience.

#cloud #google #ray-serve +9

cloud.google.com →

Tue, Mar 10 · 1 entry

NEW blog local-llm 3mo ago ·

huggingface-blog

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

Medium priority · technical post · Local LLM / Open Models · Published Mar 10

AI summary: Hugging Face compares 16 open-source reinforcement learning libraries and maps out how each handles asynchronous training, token-generation efficiency, and off-policy support in RL training for LLMs. It distills design patterns, such as separating training from inference, that raise throughput and keep tokens flowing.

#huggingface #open-model #rlhf +7

huggingface.co →
