Overview
The AI Factory collection provides a step-by-step process for setting up a simple AI Factory system and getting it running quickly, including:
Identifying the minimum hardware and networking requirements for your AI Factory. These baseline specifications also serve as a reference for more advanced deployments. OpenNebula supports high-performance architectures such as InfiniBand, Spectrum-X, and NVLink, although these setups are not automated and require custom configuration.
Following the step-by-step deployment instructions using OneDeploy to build your AI Factory, with options for both on-premises installations and cloud-based deployments.
Optionally validating your setup using the same methodology we apply during formal infrastructure acceptance. This validation focuses on using AI-ready Kubernetes with NVIDIA Dynamo or NVIDIA KAI Scheduler.
Basic Outline
Configuring, deploying and validating a high-performance AI infrastructure using OpenNebula involves these steps:
Familiarize yourself with Architecture and Specifications. We recommend consulting the guide on GPU PCI-passthrough for details about your GPU hardware and IOMMU configuration.
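As a quick sanity check before attempting GPU PCI-passthrough, you can verify on each host that the IOMMU is enabled. This is a generic Linux sketch, not part of the guide itself; the exact boot flag depends on your CPU vendor and bootloader configuration:

```shell
# Look for the vendor-specific IOMMU flag on the kernel command line
# (intel_iommu=on for Intel, amd_iommu=on for AMD).
grep -oE '(intel|amd)_iommu=\S+' /proc/cmdline || echo "no IOMMU flag on kernel command line"

# An active IOMMU populates /sys/kernel/iommu_groups; count the groups
# (a count of zero means the IOMMU is not active on this host).
find /sys/kernel/iommu_groups -maxdepth 1 -mindepth 1 2>/dev/null | wc -l
```

If the group count is zero, enable the flag in your bootloader configuration and reboot before proceeding with passthrough.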
Deploy and configure your AI Factory with one of these alternatives:
- On-premises AI Factory Deployment: Set up an AI Factory using OneDeploy for on-prem environments.
- On-cloud AI Factory Deployment: Set up an AI Factory using OneDeploy on Scaleway for cloud environments.
Perform Validation: as a prerequisite, you need a deployed AI Factory ready to be validated. The options for validating your AI Factory are:
Validation with direct AI execution:
- Validation with LLM Inferencing: Using vLLM with two different models and two model sizes, running across both H100 and L40S GPUs.
- Validation with NVIDIA Slurm: Fine-tuning an AI model using the OpenNebula NVIDIA Slurm appliance.
Validation with AI-Ready Kubernetes: Deploy Kubernetes on the H100 and L40S hosts. Once the AI-ready Kubernetes cluster is up, additional validation steps can be carried out, including:
- Validation with NVIDIA Dynamo: Integrating the GPU-powered Kubernetes cluster with the NVIDIA Dynamo Cloud Platform to provision and manage AI workloads through the Dynamo framework.
- Validation with NVIDIA KAI Scheduler: Use the NVIDIA KAI Scheduler to share GPU resources across different workloads within the AI-ready Kubernetes cluster.
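To illustrate the KAI Scheduler option above, a pod opts in by naming the scheduler and requesting a GPU share. This is a hedged sketch: the scheduler name, the queue label, and the fraction annotation are taken from the KAI Scheduler quick-start as we understand it and may differ across versions; the queue `default-queue` is a placeholder, not something this guide defines.

```yaml
# Hypothetical example pod: scheduled by KAI and sharing half a GPU.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-demo
  labels:
    runai/queue: default-queue     # assumed queue label; queue name is a placeholder
  annotations:
    gpu-fraction: "0.5"            # assumed annotation requesting half a GPU
spec:
  schedulerName: kai-scheduler     # hand the pod to KAI instead of the default scheduler
  containers:
    - name: cuda-workload
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
```

Two such pods with `gpu-fraction: "0.5"` could then be packed onto a single physical GPU, which is the sharing behavior this validation step exercises.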