Deploy NVIDIA AI-Q Blueprint on Oracle Cloud Infrastructure

NVIDIA has released AI-Q 2.0, an open-source, production-ready blueprint that enables developers to deploy advanced multi-agent AI systems on Oracle Cloud Infrastructure. The framework, built on LangChain Deep Agents and the NVIDIA NeMo Agent Toolkit, addresses the evolution of artificial intelligence from simple single-turn queries to complex, long-horizon workflows capable of planning, delegating tasks, and executing tools within secure sandboxes.

AI-Q 2.0 uses a modular multi-agent architecture designed for scalability and extensibility. An intent router directs incoming queries to specialized workflows, primarily a Shallow Research Agent for rapid, bounded searches or a Deep Agent for complex analysis. The Deep Agent architecture incorporates dedicated planning and researcher sub-agents that share a filesystem layer and operate within isolated environments. Every component, including underlying models, retrieval-augmented generation backends, and evaluation metrics, can be dynamically reconfigured through YAML configurations or the NeMo plugin architecture, so the framework can adapt to diverse enterprise requirements.

To streamline deployment, NVIDIA partnered with Oracle Cloud Infrastructure to provide a fully automated provisioning process. The blueprint separates infrastructure management from application deployment, using Terraform to establish secure cloud networking, compute clusters, load balancers, and encrypted secret storage. After infrastructure initialization, which typically takes ten to fifteen minutes, developers use Helm to deploy the AI-Q workloads onto an Oracle Kubernetes Engine cluster. This automated pipeline provisions the backend services, frontend interface, and PostgreSQL database without requiring local container builds. The full deployment takes approximately twenty to twenty-five minutes from initial configuration to operational status.
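
The two-stage flow described above (Terraform for infrastructure, then Helm for workloads) can be sketched as an ordered command plan. This is only an illustration under stated assumptions: the directory, release name, chart path, and namespace are hypothetical placeholders, not names taken from the blueprint, and the sketch builds the command lists without executing anything. The credential-initialization step between the two stages is omitted because it depends on the specific cluster.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DeployConfig:
    # Hypothetical placeholders; the real blueprint defines its own names.
    terraform_dir: str = "infra"
    release: str = "aiq"
    chart: str = "./charts/aiq"
    namespace: str = "aiq"


def deploy_plan(cfg: DeployConfig) -> List[List[str]]:
    """Return commands in order: infrastructure first, then workloads."""
    tf = ["terraform", f"-chdir={cfg.terraform_dir}"]
    return [
        tf + ["init"],    # download providers and modules
        tf + ["apply"],   # provision networking, cluster, load balancer
        # Kubernetes credential setup would happen here (omitted).
        ["helm", "upgrade", "--install", cfg.release, cfg.chart,
         "--namespace", cfg.namespace, "--create-namespace"],
    ]


def teardown_plan(cfg: DeployConfig) -> List[List[str]]:
    """Return decommissioning commands in reverse order of deployment."""
    return [
        ["helm", "uninstall", cfg.release, "--namespace", cfg.namespace],
        ["terraform", f"-chdir={cfg.terraform_dir}", "destroy"],
    ]


if __name__ == "__main__":
    # Print the plan instead of running it, so the sketch is side-effect free.
    for cmd in deploy_plan(DeployConfig()) + teardown_plan(DeployConfig()):
        print(" ".join(cmd))
```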
Developers configure tenancy parameters through Terraform variables, provision the cloud resources, and initialize Kubernetes credentials to pull pre-built container images directly from NVIDIA GPU Cloud. Once deployed, the system routes traffic through a cloud load balancer, providing immediate access to the AI-Q interface. Early testing confirms reliable query routing, with the framework efficiently handling both straightforward factual inquiries and complex comparative research tasks requiring multi-step reasoning.

NVIDIA emphasizes that this blueprint reduces the operational friction traditionally associated with hosting sophisticated AI agents. By standardizing the infrastructure stack and providing clear decommissioning commands, the solution allows platform engineers to rapidly prototype, scale, and retire advanced agent deployments. The framework is explicitly designed for developers and infrastructure engineers familiar with Kubernetes and infrastructure-as-code methodologies. NVIDIA encourages practitioners to deploy the blueprint, integrate custom tools and models, and share their configurations with the broader developer community through official technical channels.
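
The intent routing described earlier, where a query goes either to a shallow workflow or to a deep one with planning and researcher sub-agents, can be illustrated with a toy dispatcher. This is a minimal sketch and not AI-Q's implementation: the keyword heuristic, the function names, and the dictionary standing in for the shared filesystem layer are all assumptions made for illustration, since the article does not describe how intent is actually classified.

```python
from typing import Callable, Dict, List

# Hypothetical heuristic: words that suggest multi-step comparative work.
DEEP_HINTS = ("compare", "analyze", "trade-off", "versus", "evaluate")


def classify_intent(query: str) -> str:
    """Label a query 'deep' if it contains any hint word, else 'shallow'."""
    lowered = query.lower()
    return "deep" if any(hint in lowered for hint in DEEP_HINTS) else "shallow"


def shallow_research(query: str) -> str:
    """Stand-in for a rapid, bounded search."""
    return f"[shallow] bounded search for: {query}"


def plan(query: str) -> List[str]:
    """Stand-in planner sub-agent: split the query into fixed subtasks."""
    return [f"gather sources on: {query}", f"synthesize findings on: {query}"]


def deep_research(query: str) -> str:
    """Stand-in deep workflow: planner and researcher share a workspace."""
    workspace: Dict[str, str] = {}  # plays the role of the shared filesystem
    for index, subtask in enumerate(plan(query)):
        # The researcher sub-agent writes one note per subtask.
        workspace[f"note_{index}.txt"] = f"result of '{subtask}'"
    return f"[deep] {len(workspace)} notes for: {query}"


WORKFLOWS: Dict[str, Callable[[str], str]] = {
    "shallow": shallow_research,
    "deep": deep_research,
}


def route(query: str) -> str:
    """Dispatch the query to the workflow chosen by the classifier."""
    return WORKFLOWS[classify_intent(query)](query)


if __name__ == "__main__":
    print(route("What is Oracle Kubernetes Engine?"))
    print(route("Compare two retrieval backends for enterprise search"))
```

Keeping the classifier separate from the workflow table means a different routing policy or an extra workflow can be swapped in without touching the dispatch logic, which mirrors the modularity the article attributes to the framework.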
