Build Secure Always-On Local AI with OpenClaw and NVIDIA NemoClaw
NVIDIA has introduced NemoClaw, an open-source reference stack designed to transform AI agents from simple question-and-answer systems into secure, always-on autonomous assistants. Unlike traditional cloud-based deployments that raise data privacy concerns, NemoClaw enables users to run local, sandboxed agents on hardware such as the NVIDIA DGX Spark. This setup allows agents to read files, call APIs, and execute multi-step workflows without external dependencies, keeping all inference and data processing on-premises.

The NemoClaw architecture takes a layered approach to security and functionality. At the core is OpenClaw, a multi-channel agent framework that manages chat platforms, memory, and tool integration. This framework operates within OpenShell, a security runtime that enforces safety boundaries through sandboxing, credential management, and network proxying. OpenShell acts as a "walled garden," preventing agents from accessing sensitive information or unrestricted web resources unless explicitly approved. The system is powered by the NVIDIA Nemotron 3 Super 120B model, a large language model optimized for complex reasoning and strong instruction following, deployed locally via Ollama or NVIDIA NIM.

Deploying the stack requires approximately 20 to 30 minutes of setup, followed by an initial model download of roughly 87 GB. The process begins with configuring the NVIDIA container runtime and setting up Ollama to listen on all network interfaces so the sandboxed agent can communicate with the local inference server. Once the Nemotron 3 model is downloaded and loaded into GPU memory, users install NemoClaw, which launches an onboarding wizard. During this wizard, users define a sandbox name, select the local Ollama inference provider, and apply default security policies that restrict filesystem and network access. The tutorial also details how to integrate the agent with Telegram for remote accessibility.
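The host-preparation steps described above might look roughly like the following shell sketch. The model tag `nemotron-3-super-120b` is a placeholder and the systemd-based Ollama install is an assumption; the article does not name the exact commands or model identifier.

```shell
# Enable the NVIDIA container runtime for Docker (NVIDIA Container Toolkit).
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Make Ollama listen on all interfaces so the sandboxed agent can reach it.
# (Assumes Ollama runs as a systemd service; OLLAMA_HOST controls the bind address.)
sudo mkdir -p /etc/systemd/system/ollama.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama

# Pull the model (~87 GB download; the tag below is a placeholder, not an official name).
ollama pull nemotron-3-super-120b
```

Binding Ollama to `0.0.0.0` is what lets the sandboxed agent, which runs in its own network namespace, reach the inference server on the host rather than on `localhost` only.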
Users create a bot via the Telegram BotFather and register the API token during NemoClaw onboarding or by rerunning the wizard. Once paired, the agent can be controlled from any device running the Telegram client. Responses take roughly 30 to 90 seconds per query, which is expected for local inference with a 120B-parameter model.

A key feature of NemoClaw is its dynamic policy-approval system. By default, the sandbox blocks external network requests such as fetching webpages or calling third-party APIs. If an agent attempts such an action, OpenShell intercepts the request and surfaces it in a terminal user interface. The administrator can then approve the specific connection for the current session or permanently add the endpoint to the allowed-policy list. This provides real-time visibility and control over agent capabilities without restarting the sandbox or modifying base configurations.

Management of the deployed agent is handled through a set of command-line tools. Users can connect to the sandbox, check its status, stream logs, and start or stop auxiliary services such as the Telegram bridge. For remote access, users can configure SSH tunnels to forward the local web UI port, allowing interaction through a browser on other machines.

While the setup offers robust isolation, NVIDIA notes that no sandbox provides absolute protection against advanced prompt-injection attacks and recommends deploying on isolated systems when testing new tools. The complete code, documentation, and a detailed deployment playbook are available on GitHub and the NVIDIA Build website for developers looking to build secure, self-hosted AI agents.
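The SSH-tunnel approach to remote web UI access might look like this sketch. The port `8080`, the hostname `dgx-spark`, and the username are assumptions for illustration; the article does not specify the actual port or host details.

```shell
# Forward the sandbox's web UI (assumed here to listen on port 8080) to this machine.
# -N: open the tunnel without running a remote command; -L: local port forward.
ssh -N -L 8080:localhost:8080 user@dgx-spark
# While the tunnel is open, browse to http://localhost:8080 on the local machine.
```

Because the web UI is only forwarded over the authenticated SSH session, it never needs to be exposed on the DGX Spark's network interfaces, which fits the stack's on-premises security posture.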
