NVIDIA Open Sources KAI Scheduler To Help AI Teams Optimize GPU Utilization

At KubeCon Europe, NVIDIA announced today that it is open sourcing KAI Scheduler, a GPU-centric Kubernetes scheduler that was originally developed by Run:ai, which NVIDIA acquired last year. Available under the Apache 2.0 license, KAI Scheduler helps its users optimize GPU resource allocations for AI and machine learning workloads in GPU clusters.

NVIDIA argues that traditional resource schedulers are ill-suited to managing AI workloads because GPU demand can fluctuate sharply: inference workloads tend to be bursty, while model training runs can occupy GPUs continuously for days.

KAI Scheduler promises to give AI teams a better tool for managing these workloads by, among other things, dynamically adjusting quotas and limits in real time. It also offers a variety of scheduling strategies (gang scheduling, hierarchical queuing, bin-packing, spreading and GPU sharing) to avoid long wait times for access to GPUs.
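To make a few of those strategy names concrete, here is a minimal Python sketch of how bin-packing, spreading and gang scheduling differ when picking nodes. It is a toy model, not KAI Scheduler's code or API: the `place` and `gang_place` functions, the node names and the free-GPU counts are all assumptions made for illustration.

```python
def place(free_gpus, request, strategy):
    """Pick a node for one pod, or return None if no node has enough free GPUs.

    free_gpus: dict of node name -> free GPU count (hypothetical cluster state).
    """
    candidates = {n: f for n, f in free_gpus.items() if f >= request}
    if not candidates:
        return None
    if strategy == "binpack":
        # Prefer the fullest node that still fits, keeping other nodes empty.
        return min(candidates, key=lambda n: (candidates[n], n))
    if strategy == "spread":
        # Prefer the emptiest node, spreading load across the cluster.
        return min(candidates, key=lambda n: (-candidates[n], n))
    raise ValueError(f"unknown strategy: {strategy}")


def gang_place(free_gpus, requests, strategy="binpack"):
    """Gang scheduling: place every pod of a job, or none of them.

    Greedy and simplified; it may give up on a job that a smarter
    search could still fit. The input dict is never modified.
    """
    remaining = dict(free_gpus)
    assignment = []
    for request in requests:
        node = place(remaining, request, strategy)
        if node is None:
            return None  # one pod does not fit, so the whole job waits
        remaining[node] -= request
        assignment.append(node)
    return assignment


cluster = {"node-a": 1, "node-b": 4, "node-c": 8}
print(place(cluster, 2, "binpack"))    # node-b
print(place(cluster, 2, "spread"))     # node-c
print(gang_place(cluster, [4, 4, 4]))  # ['node-b', 'node-c', 'node-c']
print(gang_place(cluster, [8, 8]))     # None
```

In this sketch, bin-packing keeps large nodes free for large jobs, while spreading reduces contention on any single node. The all-or-nothing behavior of `gang_place` reflects why gang scheduling matters for distributed training: a job that has only some of its pods running holds GPUs without making progress.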

GPU sharing, which allows multiple pods to use the same GPU, looks like it will be an especially useful feature. It’s worth noting that NVIDIA already offers a tool called GPU Operator, a Kubernetes framework for provisioning GPUs, which also includes a GPU time-slicing feature.

GPU Operator, however, is very much focused on working with NVIDIA hardware and large clusters (including NVIDIA’s own DGX racks), while KAI Scheduler is more vendor-agnostic and also supports AI workloads on CPUs.

KAI Scheduler’s approach to GPU sharing focuses on the individual GPUs and the memory available to them: what developers reserve is a share of a GPU’s memory. The scheduler does not enforce memory isolation between the workloads sharing a GPU, though.
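The difference between reserving a share of GPU memory and isolating it can be illustrated with a small Python sketch. This is a hypothetical bookkeeping model, not KAI Scheduler's implementation: the `SharedGPU` class, its method names and the 40 GiB capacity are assumptions chosen for the example.

```python
class SharedGPU:
    """Toy accounting model of one GPU shared by several pods (hypothetical)."""

    def __init__(self, total_mib):
        self.total_mib = total_mib
        self.reserved = {}  # pod name -> reserved memory in MiB

    def free_mib(self):
        return self.total_mib - sum(self.reserved.values())

    def reserve(self, pod, fraction):
        """Reserve a fraction of GPU memory for a pod; False if it does not fit."""
        if not 0 < fraction <= 1:
            raise ValueError("fraction must be in (0, 1]")
        wanted = int(self.total_mib * fraction)
        if wanted > self.free_mib():
            return False
        self.reserved[pod] = wanted
        return True

    def over_budget(self, usage_mib):
        """Return pods whose actual usage exceeds their reservation.

        The model can only report this. Without memory isolation,
        nothing here prevents a pod from exceeding its share.
        """
        return sorted(
            pod for pod, used in usage_mib.items()
            if used > self.reserved.get(pod, 0)
        )


gpu = SharedGPU(total_mib=40960)
print(gpu.reserve("train", 0.5))    # True
print(gpu.reserve("notebook", 0.25))  # True
print(gpu.reserve("batch", 0.5))    # False
print(gpu.free_mib())               # 10240
print(gpu.over_budget({"train": 30000, "notebook": 5000}))  # ['train']
```

The reservation is what a scheduler can use to decide whether another pod fits on the GPU. Staying within that reservation is, in a setup without isolation, up to the workload itself.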

By default, KAI Scheduler integrates with popular AI tools and cloud native frameworks like Kubeflow’s Training Operator, Ray and Argo.

The code and documentation for KAI Scheduler are now available on GitHub. Quite a few other parts of Run:ai are already open source, too, including the somewhat related Genv GPU environment and cluster management tools.

The post NVIDIA Open Sources KAI Scheduler To Help AI Teams Optimize GPU Utilization appeared first on The New Stack.
