NVIDIA DGX Spark and DGX Station Bring Data-Center-Grade AI to the Desktop, Enabling Local Development of Massive Open-Source Models
NVIDIA has unveiled its latest advancements in desktop AI computing at CES, introducing the DGX Spark and DGX Station systems, designed to give developers powerful, local AI capabilities. These deskside supercomputers let users run cutting-edge open-source and frontier AI models directly from their desks: 100-billion-parameter-class models on DGX Spark and trillion-parameter models on DGX Station. Built on the NVIDIA Grace Blackwell architecture, both systems deliver petaflop-level AI performance with large unified memory.

DGX Spark features the NVFP4 data format, which compresses AI models by up to 70% while preserving intelligence, significantly boosting efficiency and speed. Combined with optimizations from the open-source community, including collaborations with llama.cpp, this delivers a 35% average performance uplift when running state-of-the-art models.

DGX Spark comes preconfigured with NVIDIA AI software and CUDA-X libraries, offering developers a plug-and-play environment for building, fine-tuning, and deploying AI models. It supports the latest frameworks and open-source models, including NVIDIA's Nemotron 3 series, Kimi-K2 Thinking, DeepSeek-V3.2, Mistral Large 3, Meta Llama 4 Maverick, Qwen3, and OpenAI's gpt-oss-120b.

DGX Station takes performance further with the GB300 Grace Blackwell Ultra superchip and 775GB of coherent memory, enabling execution of models with up to 1 trillion parameters in FP4 precision. This makes it ideal for enterprise research labs and frontier AI development. Developers can now test and optimize advanced frameworks such as vLLM directly on the GB300 superchip in a compact, single-system form factor, something previously possible only in data centers.

Industry leaders are already embracing the shift to local AI. Kaichao You, a core maintainer of vLLM, noted that DGX Station enables faster development cycles by allowing direct testing of GB300-specific features at lower cost.
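As a rough sanity check on these model-size claims, the weight footprints implied by 4-bit precision can be worked out directly. This is an illustrative sketch, not an official sizing tool: `weight_gib` is a hypothetical helper, and real deployments also need memory for activations and KV cache on top of the weights.

```python
# Back-of-the-envelope weight-memory arithmetic behind the model-size claims.
# Illustrative only: actual workloads need additional memory for KV cache
# and activations beyond the raw weights.
def weight_gib(params_billion, bits_per_weight):
    """Approximate weight footprint in GiB for a model of the given size."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

# A 120B-parameter model (e.g. gpt-oss-120b) in BF16 vs. 4-bit precision:
print(f"{weight_gib(120, 16):.0f} GiB")   # ~224 GiB in BF16
print(f"{weight_gib(120, 4):.0f} GiB")    # ~56 GiB at 4 bits per weight

# A 1-trillion-parameter model at 4 bits per weight:
print(f"{weight_gib(1000, 4):.0f} GiB")   # ~466 GiB, under DGX Station's 775GB
```

The 4x reduction from 16-bit to 4-bit weights is what brings trillion-parameter models within reach of a single system's coherent memory.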
Jerry Zhou, a contributor to SGLang, praised the system's ability to serve massive models like Qwen3-235B locally and to develop CUDA kernels with large matrix operations without relying on cloud infrastructure.

At CES, NVIDIA demonstrated how DGX Spark can accelerate video generation workloads, delivering up to 8x faster performance than a top-tier MacBook Pro with M4 Max and freeing up creative workstations for uninterrupted design work. The platform also supports new AI models such as FLUX.2 and FLUX.1 from Black Forest Labs, Qwen-Image from Alibaba, and LTX-2 from Lightricks, all optimized for NVIDIA GPUs with NVFP8 quantization. The open-source RTX Remix modding platform is set to integrate with DGX Spark, allowing 3D artists to offload asset creation and continue modding in real time. NVIDIA also showcased a local CUDA coding assistant powered by Nsight, enabling secure, AI-enhanced development without exposing source code to the cloud.

Leaders across industries are validating the move to edge AI. Jeff Boudier of Hugging Face highlighted how DGX Spark enables embodied AI agents using the Reachy Mini robot, bringing interactive, voice-enabled AI into the physical world. IBM's Ed Anuff emphasized the value of OpenRAG on DGX Spark for secure, self-contained retrieval-augmented generation workflows. JetBrains CEO Kirill Skrygan noted that DGX Spark offers petaflop-class performance for developers who prioritize data security and IP control.

TRINITY, an intelligent three-wheeled urban mobility vehicle powered by DGX Spark as its AI brain for real-time vision-language model inference, will be on display at CES. will.i.am described it as "brains on wheels," capable of conversational, goal-tracking interactions in smart cities.

To accelerate adoption, NVIDIA expanded its DGX Spark playbook library with six new guides and four major updates covering topics such as Nemotron 3 Nano, robotics training, vision-language models, fine-tuning across dual systems, genomics, and financial analysis.
Additional playbooks for DGX Station and GB300 systems will follow later in 2026. NVIDIA AI Enterprise software support is now available for DGX Spark and GB10 systems through partners including Acer, Amazon, ASUS, Dell Technologies, GIGABYTE, HP Inc., Lenovo, Micro Center, MSI, and PNY. DGX Station will be available from ASUS, Boxx, Dell Technologies, GIGABYTE, HP Inc., MSI, and Supermicro starting in spring 2026. Licensing for AI Enterprise software is expected by the end of January.
