Platform as a Service (PaaS) provider Rafay has extended its Kubernetes management platform to better support enterprise AI and ML workloads, with a focus on GPU resource management, democratizing access to ML pipelines, and assisting with model testing and selection.
The new capabilities make compute resources for AI instantly consumable by developers and data scientists with enterprise-grade guardrails, said Haseeb Budhani, co-founder and CEO of Rafay Systems.
Rafay is a Kubernetes company that helps customers manage their environments, including Kubernetes, CI/CD pipelines and deployment platforms.
Three Gaps
The company noticed customers deploying AI workloads on Kubernetes using Rafay’s product, and identified three gaps it could address, Budhani told The New Stack. The first gap is efficiently consuming and sharing expensive GPU resources. Rafay extended its existing PaaS to provide GPU resources to internal customers, with guardrails like time limits and cost management.
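The guardrails described above can be sketched in a few lines. The following is a minimal, hypothetical illustration of a per-allocation time limit and per-user cost budget; the function, field names, and rates are assumptions for the example, not Rafay's actual implementation.

```python
# Illustrative guardrail check for a GPU allocation request.
# All names and rates below are assumptions, not Rafay's API.

GPU_HOURLY_RATE = 2.50   # assumed $/GPU-hour for the example
MAX_HOURS = 8            # assumed per-allocation time limit
BUDGET = 100.0           # assumed per-user budget in dollars

def allocation_allowed(requested_gpus, requested_hours, spent_so_far):
    """Return (allowed, reason) for a GPU allocation request."""
    if requested_hours > MAX_HOURS:
        return False, "exceeds time limit"
    cost = requested_gpus * requested_hours * GPU_HOURLY_RATE
    if spent_so_far + cost > BUDGET:
        return False, "exceeds budget"
    return True, "ok"

print(allocation_allowed(2, 4, 0.0))   # (True, 'ok'): 2 GPUs x 4 h x $2.50 = $20
print(allocation_allowed(2, 12, 0.0))  # (False, 'exceeds time limit')
print(allocation_allowed(4, 8, 50.0))  # (False, 'exceeds budget'): $80 + $50 spent > $100
```

In practice a platform would enforce such policies at admission time, before the workload is scheduled onto a GPU node.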
“What we saw happen was our customers were deploying AI workloads on Kubernetes, and using our product to do it, unbeknownst to us,” Budhani said.
The second gap is democratizing access to machine learning (ML) pipelines beyond just data scientists. Rafay introduced an AI/ML workbench on top of its platform to make consuming these pipelines easier for everyone in an enterprise.
The third gap is testing and selecting the best ML models. Rafay added an “LLM playground” layer between the PaaS and ML workbench to allow users to quickly test and select the best models for their needs, Budhani said.
Filling the Gaps
Rafay’s newly added support for GPU workloads helps enterprises and managed service providers power a new GPU-as-a-service experience for internal developers and customers.
Rafay’s new AI Suite provides standards-based pipelines for machine learning operations (MLOps) and large language model operations (LLMOps) to quicken the development and deployment of AI applications.
With the global GPU-as-a-service market expected to reach $17.2 billion by 2030, organizations are seeking scalable solutions to connect their data scientists and developers to accelerated computing infrastructure.
Rafay’s PaaS now addresses issues like environment standardization, self-service consumption of compute, secure use of multitenant environments, cost optimization, and auditability for GPU-based workloads.
“GPU-accelerated workloads are a growing part of enterprise portfolios and organizations need scalable tools to manage them,” said Justin Warren, founder and principal analyst at PivotNine. “Customers also want to maintain tight control over the sovereignty of sensitive data, a challenge that is only growing in complexity. It’s good to see Rafay providing enterprises with options beyond the narrow vision of a few major cloud providers.”
The new features for GPU workloads include developer and data scientist self-service, AI-optimized user workspaces, GPU matchmaking, and GPU virtualization.
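To make the “GPU matchmaking” idea concrete, here is a minimal sketch of best-fit matching between a workload’s GPU request and a set of clusters with free capacity. The class, field names, and placement heuristic are assumptions for illustration only, not Rafay’s actual matchmaking logic.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    # Hypothetical cluster inventory record for the sketch.
    name: str
    gpu_model: str
    free_gpus: int

def match_cluster(clusters, gpu_model, gpus_needed):
    """Return the best-fit cluster for the request, or None if none qualifies."""
    candidates = [c for c in clusters
                  if c.gpu_model == gpu_model and c.free_gpus >= gpus_needed]
    # Best fit: the cluster left with the fewest spare GPUs after placement,
    # keeping larger pools free for bigger jobs.
    return min(candidates, key=lambda c: c.free_gpus - gpus_needed, default=None)

clusters = [
    Cluster("us-east-a100", "A100", 8),
    Cluster("us-west-a100", "A100", 2),
    Cluster("eu-h100", "H100", 4),
]

best = match_cluster(clusters, "A100", 2)
print(best.name)  # us-west-a100: the tightest fit for a 2-GPU A100 job
```

A production scheduler would weigh more dimensions (GPU memory, interconnect, data locality, cost), but the best-fit shape of the decision is the same.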
“Beyond the multicluster matchmaking capabilities and other powerful PaaS features that deliver a self-service compute consumption experience for developers and data scientists, platform teams can also make users more productive with turnkey MLOps and LLMOps capabilities available on the Rafay platform,” Budhani said in a statement. “This announcement makes Rafay a must-have partner for enterprises, as well as GPU and sovereign cloud operators, looking to speed up modern application delivery.”
NTT DATA has been an early user of Rafay’s new AI capabilities and has collaborated with the Rafay team to help deliver its new GPU support and AI Suite to market.
“Rafay’s approach satisfies users responsible for application development and management, making it easy to cross-collaborate within enterprises’ security and budget boundaries,” said Mike Jones, vice president of partners and alliances at NTT DATA, in a statement.
The post Rafay’s PaaS Now Supports GPU Workloads for AI/ML in the Cloud appeared first on The New Stack.